• FundMECFSResearch@lemmy.blahaj.zone
    arrow-up
    130
    arrow-down
    2
    ·
    22 days ago

    I know people are gonna freak out about the AI part in this.

    But as a person with hearing difficulties, this would be revolutionary. There is so much shit I usually just can’t watch because OpenSubtitles doesn’t have any subtitles for it.

    • kautau@lemmy.world
      arrow-up
      76
      ·
      edit-2
      22 days ago

      The most important part is that it’s a local LLM running on your machine. The problem with AI is less about LLMs themselves, and more about their control and application by unethical companies and governments in a world driven by profit and power. This is none of those things; it’s just some open source code running on your device. So that’s cool and good.

    • hushable@lemmy.world
      arrow-up
      12
      arrow-down
      1
      ·
      22 days ago

      Indeed, YouTube has had auto-generated subtitles for a while now, and they are far from perfect, yet I still find them useful.

    • M137@lemmy.world
      arrow-up
      1
      ·
      edit-2
      21 days ago

      I agree that this is a nice thing, just gotta point out that there are several other good websites for subtitles. Here are the ones I use frequently:

      https://subdl.com/
      https://www.podnapisi.net/
      https://www.subf2m.co/

      And if you didn’t know, there are two opensubtitles websites:
      https://www.opensubtitles.com/
      https://www.opensubtitles.org/

      Not sure if the .com one is supposed to be a more modern frontend for the .org or something but I’ve found different subtitles on them so it’s good to use both.

  • TheImpressiveX@lemm.ee
    English
    arrow-up
    68
    arrow-down
    5
    ·
    22 days ago

    Et tu, Brute?

    VLC automatic subtitles generation and translation based on local and open source AI models running on your machine working offline, and supporting numerous languages!

    Oh, so it’s basically like YouTube’s auto-generated subtitles. Never mind.

    • Nemeski@lemm.eeOP
      arrow-up
      47
      ·
      22 days ago

      Hopefully better than YouTube’s, those are often pretty bad, especially for non-English videos.

      • wazzupdog (they/them)@lemmy.blahaj.zone
        arrow-up
        16
        ·
        22 days ago

        They’re awful for English videos too, IMO. For anyone with any kind of accent (read: literally anyone except those with accents similar to the team that developed the auto-caption), it makes egregious errors. It’s exceptionally bad with Australian, New Zealand, English, Irish, Scottish, Southern US, and North Eastern US accents. In my experience “using” it, I find it nigh unusable.

      • MoSal@lemm.ee
        arrow-up
        6
        ·
        22 days ago

        I’ve been working on something similar-ish on and off.

        There are three (good) solutions involving open-source models that I came across:

        • KenLM/STT
        • DeepSpeech
        • Vosk

        Vosk has the best models, but they are large. You can’t use the GigaSpeech model, for example (which is useful even with non-US English), to live-generate subs on many devices because of the memory requirements. So my guess would be that whatever VLC provides will probably suck to an extent, because it will have to be fast/lightweight enough.

        What also sets vosk-api apart is that you can ask it to provide multiple alternatives (10 is usually used).

        One core idea in my tool is to combine all alternatives into one text. So suppose the model predicts text to be either “… still he …” or “… silly …”. My tool can give you “… (still he|silly) …” instead of 50/50 chancing it.
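        A minimal sketch of that merging idea, not the actual tool: it assumes the alternatives arrive as plain strings (for example, the text of each alternative returned by the recognizer) and that they differ in one contiguous region. The name `merge_alternatives` is made up for illustration.

```python
from typing import List


def merge_alternatives(alternatives: List[str]) -> str:
    """Collapse several candidate transcripts into one hedged string."""
    # Split into words and drop exact duplicates while keeping order.
    seen = []
    for alt in alternatives:
        words = alt.split()
        if words not in seen:
            seen.append(words)
    if not seen:
        return ""
    if len(seen) == 1:
        return " ".join(seen[0])

    # Longest run of words shared at the start of every candidate.
    shortest = min(len(w) for w in seen)
    prefix = 0
    while prefix < shortest and all(w[prefix] == seen[0][prefix] for w in seen):
        prefix += 1

    # Longest shared run at the end, not overlapping the prefix.
    suffix = 0
    while (suffix < shortest - prefix
           and all(w[-1 - suffix] == seen[0][-1 - suffix] for w in seen)):
        suffix += 1

    # Whatever is left in the middle is where the candidates disagree.
    middles = []
    for w in seen:
        middle = " ".join(w[prefix:len(w) - suffix])
        if middle not in middles:
            middles.append(middle)

    hedged = "(" + "|".join(middles) + ")"
    tail = seen[0][len(seen[0]) - suffix:]
    return " ".join(seen[0][:prefix] + [hedged] + tail)


print(merge_alternatives(["and still he went", "and silly went"]))
```

        With those two inputs this prints "and (still he|silly) went". Candidates that disagree in several separate places would need a proper alignment step rather than a single shared prefix and suffix.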

        • fartsparkles@sh.itjust.works
          arrow-up
          5
          ·
          22 days ago

          I love that approach you’re taking! So many times, even in shows with official subs, they’re wrong because of homonyms and I’d really appreciate a hedged transcript.

  • katy ✨@lemmy.blahaj.zone
    arrow-up
    43
    arrow-down
    1
    ·
    21 days ago

    accessibility is honestly the first good use of ai. i hope they can find a way to make them better than youtube’s automatic captions though.

    • yonder@sh.itjust.works
      arrow-up
      8
      ·
      21 days ago

      I know Jeff Geerling on YouTube uses OpenAI’s Whisper to generate captions for his videos instead of relying on YouTube’s. Apparently they are much better than YouTube’s, being nearly flawless. I would guess that Google wants to minimize the compute it uses when processing videos to save money.

    • hector@sh.itjust.works
      arrow-up
      6
      ·
      21 days ago

      While LLMs are truly impressive feats of engineering, it’s really annoying to witness the tech hype train once again.

  • pastaPersona@lemmy.world
    arrow-up
    36
    arrow-down
    4
    ·
    22 days ago

    I know AI has some PR issues at the moment but I can’t see how this could possibly be interpreted as a net negative here.

    In most cases, people will go for (manually) written subtitles rather than autogenerated ones, so the use case here would most often be in cases where there isn’t a better, human-created subbing available.

    I just can’t see AI / autogenerated subtitles of any kind taking jobs from humans because they will always be worse/less accurate in some way.

    • x00z@lemmy.world
      English
      arrow-up
      15
      ·
      22 days ago

      Autogenerated subtitles are pretty awesome for subtitle editors I’d imagine.

    • ArgentRaven@lemmy.world
      arrow-up
      10
      arrow-down
      1
      ·
      22 days ago

      Yeah this is exactly what we should want from AI. Filling in an immediate need, but also recognizing it won’t be as good as a pro translation.

    • OsrsNeedsF2P@lemmy.ml
      arrow-up
      19
      ·
      22 days ago

      Iirc this is because of how they’ve optimized the file reading process; it genuinely might be more work to add efficient frame-by-frame backwards seeking than this AI subtitle feature.

      That said, jfc please just add backwards seeking. It is so painful to use VLC for reviewing footage. I don’t care how “inefficient” it is, my computer can handle any operation on a 100 MB file.

      • Feathercrown@lemmy.world
        English
        arrow-up
        7
        ·
        21 days ago

        If you have time to read the issue thread about it, it’s infuriating. There are multiple viable suggestions that are dismissed because they don’t work in certain edge cases where it would be impossible for any method at all to work, and which they could simply fail gracefully for.

        • stevestevesteve@lemmy.world
          arrow-up
          4
          ·
          21 days ago

          That kind of attitude in development drives me absolutely insane. See also: support for DHCPv6 in Android. There’s a thread that has been raging for I think over a decade now

  • mlg@lemmy.world
    English
    arrow-up
    8
    arrow-down
    1
    ·
    21 days ago

    Still no live audio encoding without CLI (unless you stream to yourself), so no plug and play with Dolby/DTS

    Encoding params still max out at 512 kbps on every codec without CLI.

    Can’t switch audio backends live (minor inconvenience, tbh)

    Creates a barely usable non-standard M3U format when saving a playlist.

    I think those are about my only complaints about VLC. The default subtitles are solid, especially with multiple text boxes for signs. Playback has been solid for ages. It handles lots of tracks well, and doesn’t just wrap ffmpeg, so it’s very useful for testing or debugging your setup against mplayer or mpv.

  • Not a replicant@lemmy.world
    English
    arrow-up
    7
    ·
    21 days ago

    I’ve been waiting for this break-free playback for a long time. Just play Dark Side of the Moon without breaks in between tracks. Surely a single thread could look ahead and see the next track doesn’t need any different codecs launched, it’s technically identical to the current track, there’s no need to have a break. /rant
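    A rough sketch of the look-ahead check described above, under loud assumptions: this is not how VLC is structured internally, and `AudioFormat` and `gapless_transitions` are hypothetical names. It only shows that deciding whether the next track can keep the same decoder is a cheap comparison of stream parameters.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AudioFormat:
    """Hypothetical summary of the parameters a decoder/output cares about."""
    codec: str
    sample_rate: int
    channels: int


def gapless_transitions(playlist: List[AudioFormat]) -> List[bool]:
    """For each track boundary, report whether the next track matches the
    current one closely enough that nothing would need to be reopened."""
    return [cur == nxt for cur, nxt in zip(playlist, playlist[1:])]


album = [
    AudioFormat("flac", 44100, 2),
    AudioFormat("flac", 44100, 2),
    AudioFormat("mp3", 48000, 2),
]
print(gapless_transitions(album))
```

    Here the first boundary compares equal and the second does not, so only the last transition would need a new decoder. Real gapless playback also has to handle encoder padding and pre-buffering, which this comparison ignores.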