I know people are gonna freak out about the AI part in this.
But as a person with hearing difficulties, this would be revolutionary. So much shit I usually just can’t watch because Open Subtitles doesn’t have any subtitles for it.
The most important part is that it’s a local model running on your machine. The problem with AI is less about LLMs themselves, and more about their control and application by unethical companies and governments in a world driven by profit and power. And it’s none of those things, it’s just some open source code running on your device. So that’s cool and good.
Indeed, YouTube has had auto-generated subtitles for a while now and they are far from perfect, yet I still find them useful.
I agree that this is a nice thing, just gotta point out that there are several other good websites for subtitles. Here are the ones I use frequently:
https://subdl.com/
https://www.podnapisi.net/
https://www.subf2m.co/
And if you didn’t know, there are two opensubtitles websites:
https://www.opensubtitles.com/
https://www.opensubtitles.org/
Not sure if the .com one is supposed to be a more modern frontend for the .org or something but I’ve found different subtitles on them so it’s good to use both.
Et tu, Brute?
VLC automatic subtitles generation and translation based on local and open source AI models running on your machine working offline, and supporting numerous languages!
Oh, so it’s basically like YouTube’s auto-generated subtitles. Never mind.
Hopefully better than YouTube’s, those are often pretty bad, especially for non-English videos.
They’re awful for English videos too, IMO. For anyone with any kind of accent (read: literally anyone except those with accents similar to the team that developed the auto-captioning) it makes egregious errors. It’s exceptionally bad with Australian, New Zealand, English, Irish, Scottish, Southern US, and North Eastern US accents. In my experience “using” it, I find it nigh unusable.
Try it with videos featuring Kevin Bridges, Frankie Boyle, or Johnny Vegas
They are terrible.
I’ve been working on something similar-ish on and off.
There are three (good) solutions involving open-source models that I came across:
- KenLM/STT
- DeepSpeech
- Vosk
Vosk has the best models. But they are large. You can’t use the gigaspeech model, for example (which is useful even with non-US English), to live-generate subs on many devices, because of the memory requirements. So my guess would be that whatever VLC provides will probably suck to an extent, because it will have to be fast/lightweight enough.
What also sets vosk-api apart is that you can ask it to provide multiple alternatives (10 is usually used).
One core idea in my tool is to combine all alternatives into one text. So suppose the model predicts text to be either “… still he …” or “… silly …”. My tool can give you “… (still he|silly) …” instead of 50/50 chancing it.
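For anyone wondering what combining alternatives could look like, here is a minimal sketch in Python. It is not the tool described above, just one possible approach under my own assumptions: the alternatives arrive as plain strings, the first one is used as the reference, and words are aligned with the standard library's difflib. The function name `merge_alternatives` is made up for illustration.

```python
from difflib import SequenceMatcher


def merge_alternatives(alternatives):
    """Merge transcripts into one string, marking disagreements as (a|b)."""
    token_lists = [alt.split() for alt in alternatives]
    ref, others = token_lists[0], token_lists[1:]

    # For each other alternative, map reference word index -> its own index
    # for every word that the alignment considers shared.
    mappings = []
    for words in others:
        mapping = {}
        matcher = SequenceMatcher(a=ref, b=words, autojunk=False)
        for block in matcher.get_matching_blocks():
            for k in range(block.size):
                mapping[block.a + k] = block.b + k
        mappings.append(mapping)

    # Anchors are reference words that every alternative agrees on.
    anchors = [i for i in range(len(ref)) if all(i in m for m in mappings)]

    out = []
    prev = [-1] * len(token_lists)  # previous anchor position in each list
    for i in anchors + [None]:  # None stands for "end of all transcripts"
        if i is None:
            cur = [len(t) for t in token_lists]
        else:
            cur = [i] + [m[i] for m in mappings]
        # Words each alternative has between the previous anchor and this one.
        gaps = [" ".join(t[p + 1:c]) for t, p, c in zip(token_lists, prev, cur)]
        variants = list(dict.fromkeys(gaps))  # de-duplicate, keep order
        if len(variants) > 1:
            out.append("(" + "|".join(variants) + ")")
        elif variants[0]:
            out.append(variants[0])
        if i is not None:
            out.append(ref[i])
        prev = cur
    return " ".join(out)


print(merge_alternatives(["and still he went home", "and silly went home"]))
```

If one alternative has nothing where another has words, the empty variant shows up as an empty option inside the parentheses. A real tool would probably also want to use the per-alternative confidence scores, which this sketch ignores.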
I love that approach you’re taking! So many times, even in shows with official subs, they’re wrong because of homonyms and I’d really appreciate a hedged transcript.
That would depend on the LLM and the data used to train it.
All hail the peak humanity levels of VLC devs.
FOSS FTW
accessibility is honestly the first good use of ai. i hope they can find a way to make them better than youtube’s automatic captions though.
I know Jeff Geerling on YouTube uses OpenAI’s Whisper to generate captions for his videos instead of relying on YouTube’s. Apparently they are much better than YouTube’s, being nearly flawless. I would guess that Google wants to minimize the compute it uses when processing videos to save money.
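As a rough illustration of the last step in that kind of workflow, here is a small Python sketch that turns timed speech segments into an SRT subtitle file body. It assumes each segment is a dict with `start` and `end` in seconds plus a `text` field, which is the shape I would expect from a Whisper-style transcription result; the function names are made up, and the transcription itself is not shown.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Build SRT text from segments shaped like {'start', 'end', 'text'}."""
    blocks = []
    for number, seg in enumerate(segments, start=1):
        blocks.append(
            f"{number}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(blocks)


# Hypothetical segments, only to show the output shape.
example = [
    {"start": 0.0, "end": 1.5, "text": " Hello there."},
    {"start": 1.5, "end": 4.0, "text": " Welcome back."},
]
print(segments_to_srt(example))
```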
While LLMs are truly impressive feats of engineering, it’s really annoying to witness the tech hype train once again.
Spoiler: they won’t
I know AI has some PR issues at the moment but I can’t see how this could possibly be interpreted as a net negative here.
In most cases, people will go for (manually) written subtitles rather than autogenerated ones, so the use case here would most often be in cases where there isn’t a better, human-created subbing available.
I just can’t see AI / autogenerated subtitles of any kind taking jobs from humans because they will always be worse/less accurate in some way.
Autogenerated subtitles are pretty awesome for subtitle editors I’d imagine.
Yeah this is exactly what we should want from AI. Filling in an immediate need, but also recognizing it won’t be as good as a pro translation.
I believe it’s limited in scope to speech recognition at this stage but hey ho
Solving problems related to accessibility is a worthy goal.
And yet they still can’t seek backwards
Iirc this is because of how they’ve optimized the file reading process; it genuinely might be more work to add efficient frame-by-frame backwards seeking than this AI subtitle feature.
That said, jfc please just add backwards seeking. It is so painful to use VLC for reviewing footage. I don’t care how “inefficient” it is, my computer can handle any operation on a 100mb file.
If you have time to read the issue thread about it, it’s infuriating. There are multiple viable suggestions that are dismissed because they don’t work in certain edge cases where it would be impossible for any method at all to work, and which they could simply fail gracefully for.
That kind of attitude in development drives me absolutely insane. See also: support for DHCPv6 in Android. There’s a thread that has been raging for I think over a decade now
Same for simply allowing pause on click… Luckily an extension exists, but it’s sad that you need one.
I now know more about Android IPv6 than ever before
You can easily write a video reader using OpenCV that would be able to read backward using a cache.
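A minimal sketch of that idea in Python, under my own assumptions: decoders are only good at going forward, so to step backward you jump back a chunk, decode that chunk forward once, and serve frames from a cache. The class and parameter names here are made up, and the decoder is passed in as two callables so the sketch does not depend on any particular library.

```python
from collections import OrderedDict


class BackwardFrameReader:
    """Serve frames in reverse order on top of a forward-only decoder."""

    def __init__(self, seek, read_next, chunk=32, max_cached=256):
        self._seek = seek            # seek(n): position the decoder at frame n
        self._read_next = read_next  # read_next(): decode and return next frame
        self._chunk = chunk          # how many frames to decode per jump back
        self._max_cached = max_cached
        self._cache = OrderedDict()  # frame index -> frame, oldest first

    def frame(self, index):
        """Return frame `index`, decoding a chunk ending there if needed."""
        if index not in self._cache:
            self._fill(index)
        self._cache.move_to_end(index)
        return self._cache[index]

    def _fill(self, index):
        # Jump back by one chunk, then decode forward up to `index`.
        start = max(0, index - self._chunk + 1)
        self._seek(start)
        for i in range(start, index + 1):
            self._cache[i] = self._read_next()
            self._cache.move_to_end(i)
        # Drop the least recently used frames if the cache grew too large.
        while len(self._cache) > self._max_cached:
            self._cache.popitem(last=False)

    def backward(self, start_index):
        """Yield frames from start_index down to frame 0."""
        for i in range(start_index, -1, -1):
            yield self.frame(i)
```

With OpenCV, `seek` could be something like `lambda n: cap.set(cv2.CAP_PROP_POS_FRAMES, n)` and `read_next` something like `lambda: cap.read()[1]` for a `cv2.VideoCapture` named `cap`. How accurate frame seeking is depends on the codec and backend, so treat this as a starting point rather than a fix for VLC.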
I was part of that!
Thank you for your service
I am still waiting for seek previews
MPC-BE
this is great news.
Still no live audio encoding without CLI (unless you stream to yourself), so no plug and play with Dolby/DTS
Encoding params still max out at 512 kbps on every codec without CLI.
Can’t switch audio backends live (minor inconvenience, tbh)
Creates a barely usable non standard M3A format when saving a playlist.
I think those are about my only complaints for VLC. The default subtitles are solid, especially with multiple text boxes for signs. Playback has been solid for ages. Handles lots of tracks well, and doesn’t just wrap ffmpeg so it’s very useful for testing or debugging your setup against mplayer or mpv.
I’ve been waiting for break-free playback for a long time. Just play Dark Side of the Moon without breaks in between tracks. Surely a single thread could look ahead and see the next track doesn’t need any different codecs launched, it’s technically identical to the current track, there’s no need to have a break. /rant
This is great timing considering the recent Open Subtitles fiasco.
Huh?
Open Subtitles now only allows 5 downloads per 24 hours per IP. You have to pay for more.
Oof. Well, they have to make money somehow. And probably there were people abusing the site. It wouldn’t surprise me for example if many did not cache the subtitles but had them on demand for videos.
Use https://opensubtitles.com/, rather than https://opensubtitles.org/.
Now if only I could get it to play nice with my Chromecast… But I’m sure that’s on Google.
I’m fine watching porn without subtitles
Why are you using VLC for porn? You download porn?!
my state banned pornhub so I made a big ass stash just in case, so yeah I guess. I also have a stash of music from YouTube in case they ever fully block YT-DLP, so I’m just a general data hoarder.
Land of the Free!