this post was submitted on 13 May 2025
370 points (95.3% liked)

[–] [email protected] 6 points 6 days ago* (last edited 6 days ago) (15 children)

This is clearly the future despite the outrage here.

There are at least 389 living languages with over 1M speakers. That alone means it's impossible to reach some people and they get left out. Most of these languages don't even have enough professional voice actors to cover the bandwidth.

There are thousands of books released every year. That's impossible to cover even in English alone.

It's an objective net good to have more accessible audiobooks, and the privileged people who do care about this stuff can very much afford to vote with their wallets for non-AI voices.

In fact, since the AI moat is so minimal, this will very quickly be adopted by open-source solutions, providing audiobook access to millions if not billions of people for whom this was not an option. It's amazing.

[–] taladar 14 points 6 days ago (10 children)

Most of these languages don't even have enough professional voice actors to cover the bandwidth.

And you think anyone is training AI voice models for those languages? Have you even seen how long it takes even large companies like Google to support the languages with hundreds of millions of speakers?

[–] [email protected] 1 points 6 days ago* (last edited 6 days ago) (1 children)

That's the benefit of using AI and machine learning - once you have enough source material, you can throw it all in and it'll eventually spit out a model.
Which is exactly what Meta did with their Massively Multilingual Speech project, which supports text-to-speech and speech-to-text for 1107 different languages.

Is it actually any good in 99% of them? I don't have a clue, but it exists.

[–] taladar 1 points 6 days ago

Seems more like a proof-of-concept project for that paper than something they are pursuing seriously, judging by the GitHub location in some example folder that hasn't seen any significant updates in over a year. If it is so great, I would assume they would pursue it more actively and replace existing models with it two years later.
