To reference a previous sidenote, DeepSeek gives corps and randos a means to shove an LLM into their shit for dirt-cheap, so I expect they're gonna blow up in popularity.
TechTakes
Big brain tech dude got yet another clueless take over at HackerNews etc? Here's the place to vent. Orange site, VC foolishness, all welcome.
This is not debate club. Unless it’s amusing debate.
For actually-good tech, you want our NotAwfulTech community
They finetuned 1.5-3b models. This is a non-story
Yup and it's not even testing general reasoning. They didn't have money for that.
Non-techie requesting a laymen explanation if anyone has time!
After reading a couple of”what makes nvidias h100 chips so special” articles I’m gathering that they were supposed to have a significant amount more computational capability than their competitors (which I’m taking to mean more computations per second). So the question with deepseek and similar is something like ‘how are they able to get the same results with less computations?’ and the answer is speculated to be more efficient code/instructions for the AI model so it can make the same conclusions with less computations overall, potentially reducing the need for special jacked up cpus to run it?
From a technical POV, from having read into it a little:
Deepseek devs worked in a very low level language called Assembly. This language is unlike relatively newer languages like C in that it provides no guardrails at all and is basically CPU instructions in extreme shorthand. An "if" statement would be something like BEQ 1000, where it goes to a specific memory location(in this case address 1000 if two CPU registers are equal.)
The advantage of using it is that it is considerably faster than C. However, it also means that the code is mostly locked to that specific hardware. If you add more memory or change CPUs you have to refactor. This is one of the reasons the language was largely replaced with C and other languages.
Edit: to expound on this: "modern" languages are even slower, but more flexible in terms of hardware. This would be languages like Python, Java and C#
I’m sure that non techie person understood every word of this.
And I'm sure that your snide remark will both tell them what to simplify and explain how to do so.
Enjoy your free trip to the egress.