Technology

Microsoft CTO Kevin Scott thinks LLM “scaling laws” will hold despite criticism (arstechnica.com)

submitted 1 month ago by [email protected] to c/[email protected]

[–] [email protected] 48 points 1 month ago (3 children)

Microsoft CTO Kevin Scott is of course not a reliable source due to conflict of interest and his position in the US corporate world.

If anything, the fact that he is doing damage control PR around "LLM scaling laws" suggests something is amiss. Let's see how things develop.

[–] ItsComplicated 32 points 1 month ago

Given Microsoft's investment in OpenAI and strong marketing of its own Microsoft Copilot AI features, the company has a strong interest in maintaining the perception of continued progress, even if the tech stalls.

I believe this sums it up.

[–] [email protected] 2 points 1 month ago* (last edited 1 month ago)

While I agree about the conflict of interest, I would largely say the same thing even without one. However, I see intelligence as a modular, multidimensional concept. If it scales as anticipated, it will still need to be organized into different forms of informational or computational flow to produce anything resembling an actively intelligent system.

On that note, the recent developments in active inference, like RxInfer, are astonishing given the current level of attention being paid. Seeing how LLMs are being treated, I'm almost glad it's not being absorbed into the hype and hate cycle.

[–] [email protected] 2 points 1 month ago* (last edited 1 month ago)

Yeah. There's a very narrow corner that demands huge models: use cases where there's no room for mistakes. That space is exciting, but also deeply bogged down in uncertainty, due both to existing laws and to the as-yet-undelivered, but 100% certainly coming-soon, law-creating disasters.

Everywhere else, I suspect we've seen about as good as we're going to get from current-generation AI.

Tech firm CEOs know this too, but there's not much interesting on the table to "bet the farm" on to court "swing for the fences" investors (gullible suckers) right now.

[–] conciselyverbose 10 points 1 month ago (1 children)

TLDR: he thinks the techniques are fine and you can just brute force them for the foreseeable future.

[–] [email protected] 1 points 1 month ago (2 children)

Yeah... Why does it sound dumb on so many levels?

[–] Voroxpete 4 points 1 month ago (1 children)

Because he's a salesman, and he's selling you bullshit.

What the experts are now saying is that it looks like the LLM approach to AI will require exponentially larger amounts of training data (and data processing) to achieve linear growth. Next generation AI models will cost ten times as much to train, and the generation after that will cost ten times as much again.

The whole thing is a giant con. Kevin is just trying to keep investor confidence floating for a little longer.
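To put rough numbers on the claim above, here is a minimal sketch, assuming each generation costs exactly ten times the previous one and adds one fixed "step" of capability. The function names, the base cost of 1.0, and the capability unit are all made up for illustration; this is not a measured scaling law.

```python
import math

def training_cost(generation, base_cost=1.0, factor=10.0):
    """Hypothetical cost of a generation if each one costs `factor` times the last."""
    return base_cost * factor ** generation

def capability_steps(cost, base_cost=1.0, factor=10.0):
    """Capability in arbitrary 'steps', assuming one step per `factor`-fold cost increase."""
    return math.log10(cost / base_cost) / math.log10(factor)

if __name__ == "__main__":
    # Cost grows exponentially with the generation index, capability only linearly.
    for generation in range(5):
        cost = training_cost(generation)
        steps = capability_steps(cost)
        print(f"generation {generation}: cost x{cost:,.0f}, capability steps {steps:.1f}")
```

Under these assumptions, capability grows with the logarithm of spending: going from generation 0 to generation 4 multiplies the cost by 10,000 but adds only four steps.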

[–] sugar_in_your_tea 1 points 1 month ago

And the harder the sell, the worse the product.

[–] conciselyverbose 1 points 1 month ago

lol I honestly needed to open the article to parse the title. That's why I posted.

But I'm definitely of the belief that you need a hell of a lot more architecture than they have to go meaningfully further. Humans are a hell of a lot more complicated than a big pile of neurons.

[–] [email protected] 3 points 1 month ago

This is the best summary I could come up with:

"And I try to help people understand there is an exponential here, and the unfortunate thing is you only get to sample it every couple of years because it just takes a while to build supercomputers and then train models on top of them."

The laws suggest that simply scaling up model size and training data can lead to significant improvements in AI capabilities without necessarily requiring fundamental algorithmic breakthroughs.

The perception has been fueled by largely informal observations—and some benchmark results—about recent models like Google's Gemini 1.5 Pro, Anthropic's Claude Opus, and even OpenAI's GPT-4o, which some argue haven't shown the dramatic leaps in capability seen in earlier generations, and that LLM development may be approaching diminishing returns.

Scott's stance suggests that tech giants like Microsoft still feel justified in investing heavily in larger AI models, betting on continued breakthroughs rather than hitting a capability plateau.

Some perceptions of slowing progress in LLM capabilities and benchmarking may be due to the rapid onset of AI in the public eye when, in fact, LLMs have been developing for years prior.

In the podcast interview, the Microsoft CTO pushed back against the idea that AI progress has stalled, but he acknowledged the challenge of infrequent data points in this field, as new models often take years to develop.

The original article contains 697 words, the summary contains 217 words. Saved 69%. I'm a bot and I'm open source!