https://www.microsoft.com/en-us/research/publication/bitnet-scaling-1-bit-transformers-for-large-language-models/ use 1 bit per weight instead of 8 or 16, yay performance gainz
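For a rough picture of what "1 bit per weight" means, here's a minimal sketch of sign binarization with a single per-tensor scale, in the spirit of BitNet. The function name and the absmean scaling are my own illustrative assumptions; the paper's actual training recipe (zero-centering, straight-through estimators, activation quantization) is more involved.

```python
import numpy as np

def binarize_weights(w: np.ndarray) -> np.ndarray:
    """Illustrative sketch: collapse float weights to +1/-1 (1 bit each)
    plus one per-tensor scale. Not the paper's exact recipe."""
    alpha = np.abs(w).mean()             # per-tensor scale (absmean), an assumption here
    signs = np.where(w >= 0, 1.0, -1.0)  # 1 bit of information per weight
    return alpha * signs                 # dequantized values used in the matmul

rng = np.random.default_rng(0)
w_fp16 = rng.standard_normal((4, 4)).astype(np.float16)  # pretend layer weights
print(binarize_weights(w_fp16))
```

The point of the scale factor is that you keep the overall magnitude of the layer while storing only the sign of each weight, which is where the 8x-16x memory saving comes from.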
Jevons paradox says that gains in efficiency tend to increase total consumption. Microsoft is paying to restart a nuclear power plant for AI shit. If they can train a trillion-parameter model for one-thousandth the cost... they will instead train a quadrillion-parameter model.
Or, I guess, if they're smart they'll train a trillion-parameter model for longer. Or iterate like crazy when training takes hours instead of months.
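Some back-of-the-envelope numbers (mine, not from the paper), counting weight storage only and ignoring activations, optimizer state, and training compute:

```python
def weight_storage_tb(params: float, bits_per_weight: float) -> float:
    """Back-of-the-envelope: weight memory in terabytes (storage only)."""
    return params * bits_per_weight / 8 / 1e12

print(weight_storage_tb(1e12, 16))  # 1T params @ fp16  -> 2.0 TB
print(weight_storage_tb(1e12, 1))   # 1T params @ 1 bit -> 0.125 TB
print(weight_storage_tb(1e15, 1))   # 1Q params @ 1 bit -> 125.0 TB
```

So a 16x saving from 1-bit weights gets eaten instantly by a 1000x jump in parameter count, which is the Jevons point.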
So will the return of the flag conclude the adventures of resource usage in computers?