this post was submitted on 04 Feb 2025
4 points (66.7% liked)

Archived version

Skepticism Around DeepSeek’s Claims

DeepSeek’s assertions about its advancements have drawn significant attention, but many of them remain unverified. For a technology that allegedly leapfrogs existing capabilities, the specifics around its breakthroughs are conspicuously lacking. Transparency has always been a cornerstone for evaluating cutting-edge technologies, and until DeepSeek provides more concrete evidence, skepticism is not just warranted—it’s necessary.

Chips on the Table: Do They Have More Than We Think?

One of the most puzzling aspects of the DeepSeek story is the apparent discrepancy between the resources they claim to have and those they might actually possess. Analysts are increasingly suspicious that DeepSeek may have access to far more hardware—particularly high-performance chips—than has been publicly disclosed. If true, this could have significant implications for their capacity to train and deploy their models at scale, raising questions about how they’ve managed to secure such resources.

The Training Puzzle: Costs and Methodology

Another critical angle here is the cost and methodology behind training their purportedly groundbreaking model. Training large language models (LLMs) is notoriously expensive and resource-intensive, often running into tens or even hundreds of millions of dollars. How did DeepSeek manage to foot this bill, especially given their previously disclosed financials? Additionally, there’s an elephant in the room: Did they rely on other LLMs during training? This would raise ethical and competitive concerns, as it has long been recognized as a controversial practice in the AI community. Leveraging other providers’ models for training—potentially without permission—distorts fair competition and undermines trust in the ecosystem.
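
To make the “did they rely on other LLMs” concern concrete: in practice this usually refers to some form of teacher-student distillation, where a stronger model’s output distribution supervises the new model. The sketch below is a minimal, purely illustrative example assuming PyTorch; the shapes, temperature, and random logits are made-up stand-ins, not anything DeepSeek has disclosed.

```python
# Hypothetical sketch of teacher-student distillation (illustrative only).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-label loss: nudge the student's token distribution toward the teacher's."""
    t = temperature
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    # KL divergence between teacher and student distributions,
    # scaled by t^2 as in standard distillation recipes.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * (t * t)

# Toy usage: random logits stand in for real model outputs.
batch_tokens, vocab = 16, 32000  # assumed sizes, for illustration only
student_logits = torch.randn(batch_tokens, vocab, requires_grad=True)
teacher_logits = torch.randn(batch_tokens, vocab)  # teacher outputs are not backpropagated
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()
print(f"distillation loss: {loss.item():.4f}")
```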

Overreaction vs. Reality

The broader market response underscores the dangers of overreaction. While innovation in AI tools is undeniably exciting, we’ve seen time and again how unverified claims can lead to speculative bubbles. For investors, this is a moment to pause, ask questions, and demand clarity before assigning sky-high valuations to unproven technologies.

In summary, while DeepSeek’s story is intriguing, it’s imperative to separate fact from speculation. The market needs to temper its enthusiasm and demand more transparency before awarding DeepSeek the crown of AI innovation. Until then, skepticism remains a healthy and necessary stance.

top 7 comments
[–] [email protected] 10 points 22 hours ago

This article was written and sponsored by ChatGPT.

[–] [email protected] 3 points 17 hours ago* (last edited 17 hours ago)

Don't think this holds true. Nvidia still has a market capitalization of almost 3 trillion dollars?! If this were some overcorrection, the AI companies and their suppliers would be priced like regular companies by now. But they're still worth more than all the traditional car manufacturers added together. So it's still a hugely overinflated bubble; nothing has popped or been overcorrected yet.

[–] [email protected] 4 points 22 hours ago

The hype for language-model-based "ai" was a bigger overreaction.

[–] [email protected] 3 points 21 hours ago

I'm sure his opinion is in no way influenced by his own company's goals.

Dan Goman is the founder and CEO of Ateliere Creative Technologies, a pioneering cloud-native media supply chain company that enables media enterprises and content creators to reach audiences globally. Recognizing the imminent shift from traditional broadcast to digital, Dan foresaw the industry's necessity to transition its entire media supply chain to the cloud to remain competitive in a digital-first era dominated by technology companies.

He led a team of highly skilled software experts to develop a SaaS-based media supply chain platform, designed for turnkey implementation and operational within days, without requiring substantial capital expenditure. This initiative resulted in Ateliere Connect, a platform that manages the media supply chain from concept to consumer, significantly reducing costs for content owners and enhancing global content monetization. With Ateliere Connect, content owners can now monetize their content on any platform with a single click, as the intelligent automation and seamless integration with any global content platform eliminate traditional cost and time barriers.

Under Dan's leadership, Ateliere continues to push the envelope with its latest Gen AI-driven media supply chain platform. The new generation of Ateliere Connect utilizes sophisticated, continuously learning Gen AI engines to address industry challenges. Notably, the platform can now analyze consumer content demand and suggest optimal monetization strategies to content owners. It also automates the programming and distribution of fast channels based on consumer data. The platform's foundation already incorporates AI through its proprietary FrameDNA™ technology, used by industry giants like Lionsgate, MGM, the World Poker Tour® and other large content studios. This technology significantly reduces AWS storage costs by identifying and eliminating redundant content, offering substantial cost savings and operational efficiencies.

Dan has a strong software background, having held various software-related roles at companies such as Microsoft, Lucent Technologies, AT&T Wireless, and Computer Associates.


DeepSeek’s assertions about its advancements have drawn significant attention <...>

Only from rival companies; 10 of the 11 top models are from US-based companies.

Transparency has always been a cornerstone for evaluating cutting-edge technologies, and until DeepSeek provides more concrete evidence, skepticism is not just warranted—it’s necessary.

That is the most straight-faced lie ever told... Literally all cutting-edge tech is hidden behind NDAs or trade-secret classifications. The US's own LLM models are mostly closed source, with even the weights not being available. There are literally lawsuits in the US about the lack of transparency on the data used to train the models.

One of the most puzzling aspects of the DeepSeek story is the apparent discrepancy between the resources they claim to have and those they might actually possess.

Which is irrelevant in the context of model capabilities. It's already trained, and that's the competition now; articles like these really show that US-based companies are incapable of competing against it.

How did DeepSeek manage to foot this bill, especially given their previously disclosed financials?

Again, irrelevant in the context of model capabilities.

Additionally, there’s an elephant in the room: Did they rely on other LLMs during training? This would raise ethical and competitive concerns, as it has long been recognized as a controversial practice in the AI community. Leveraging other providers’ models for training—potentially without permission—distorts fair competition and undermines trust in the ecosystem.

"Fair" and "LLM" together is an oxymoron. But also, the US wouldn't know fair if it bit them in the ass. Their companies are literally begging the US government to block or limit high-end hardware shipments to even their allied countries... That sounds "fair", right?

While innovation in AI tools is undeniably exciting, we’ve seen time and again how unverified claims can lead to speculative bubbles.

Yes, the whole "AI" thing is a speculative bubble. You just want to be able to milk it yourself before it bursts. And the DeepSeek news made all that free cash dry up.


Please stop making me defend LLMs from these paid hit pieces.

[–] starman2112 2 points 20 hours ago

All I know is that the US government wants to ban us from using it, so it's got my support

[–] [email protected] 3 points 22 hours ago

News orgs have gotten rather great at building hype trains, justified or not

[–] [email protected] 2 points 22 hours ago

Speculative execution is not a new idea; speculative training is just a new application. We tend to throw money at problems because we have it. Sanctions seem to have given them sufficient motivation to work more efficiently.