Opensource

VLC tops 6 billion downloads, previews AI-generated subtitles (techcrunch.com)

submitted 2 months ago by [email protected] to c/[email protected]


[–] [email protected] 13 points 2 months ago (2 children)

I don't mind the idea, but I'm curious where the training data comes from. You can't just train a model on users' (unsubtitled) videos, because you need subtitles to know whether the output is right or wrong. I checked their Twitter post, but it didn't answer that.
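To make the "right or wrong" part concrete: speech-to-text output is usually scored against a reference transcript with word error rate (WER), which is why you need existing subtitles at all. Here's a minimal sketch, assuming whitespace tokenization and case-insensitive matching. The function name `word_error_rate` is just illustrative and has nothing to do with how VLC evaluates anything.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (minus the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from the output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    subtitle = "the cat sat on the mat"
    model_output = "the cat sat on a mat"
    print(f"WER: {word_error_rate(subtitle, model_output):.3f}")
```

Without a reference subtitle there is nothing to pass as the first argument, which is the whole problem with unlabeled user videos.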

[–] [email protected] 17 points 2 months ago (1 children)

Subtitles aren't a unique dataset; it's just audio to text.

[–] [email protected] 11 points 2 months ago (1 children)

They may have to give it some special training to be able to understand audio mixed by the Chris Nolan school of wtf are they saying.

[–] [email protected] 3 points 2 months ago (1 children)

No, if you have a center track you can just use that. Volume isn't a problem for a computer listening to it, since the audio never goes through physical speakers.
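Rough sketch of what I mean, assuming decoded 5.1 audio as interleaved float samples in the usual WAV channel order (front left, front right, front center, LFE, back left, back right). The helper names are made up for illustration, real files can use other layouts, and I'm not claiming this is what VLC does.

```python
FRONT_CENTER = 2  # assumed index in FL, FR, FC, LFE, BL, BR ordering


def extract_channel(interleaved, num_channels, channel):
    """Return the samples of one channel from interleaved audio frames."""
    if not 0 <= channel < num_channels:
        raise ValueError("channel index out of range")
    if len(interleaved) % num_channels != 0:
        raise ValueError("sample count is not a whole number of frames")
    # Every num_channels-th sample, starting at the channel's offset.
    return list(interleaved[channel::num_channels])


def peak_normalize(samples, target_peak=1.0):
    """Scale samples so the loudest one reaches target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # pure silence, nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]


if __name__ == "__main__":
    # Two frames of 5.1 audio: loud left/right, quiet dialogue in the center.
    frames = [
        0.9, 0.8, 0.25, 0.0, 0.1, 0.1,
        -0.9, -0.7, -0.125, 0.0, 0.1, 0.1,
    ]
    dialogue = extract_channel(frames, 6, FRONT_CENTER)
    print(peak_normalize(dialogue))
```

The quiet dialogue ends up at full scale after one multiplication, so "too quiet to hear" only matters for human ears, not for the model's input.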

[–] [email protected] 1 points 2 months ago

I took the other comment as a joke but this is accurate and interesting additional information!

[–] [email protected] 8 points 2 months ago

I hope they're using Open Subtitles, or one of the many academic speech-to-text datasets that exist.