'GPUs still rule' asserts graphics guru Raja Koduri in response to custom AI silicon advocate
(www.tomshardware.com)
Jensen Huang said it best: a GPU is the perfect balance between being so specialized that it isn't worthwhile and so general that it becomes just another CPU. And they do have custom silicon when necessary, like the tensor cores, but again that's in addition to, not a replacement for, the existing hardware. Considering the hundreds of AI accelerator startups (a few of which have already failed), he's right.
He's only right in the short term when the technology isn't stable and the AI software architectures are constantly changing.
Once things stabilize, we're most likely switching to either analog compute-in-memory or silicon photonics, both of which will be far less generic than a GPU but will have such massive power, performance, and cost advantages that GPUs simply cannot compete.
What does that word salad have to do with AI? ;-)
First up, here's a Veritasium breakdown of why a lot of next-gen AI is leaning into analog computing to save space and power while increasing the total number of computations per second.
https://www.youtube.com/watch?v=GVsUOuSjvcg
The unreliability of analog makes it unsuited to the deterministic algorithms we normally run on computers, but it doesn't have large negative effects on AI algorithms because of their low-fidelity nature (and for some algorithms, getting some free entropy is actually a feature rather than a bug).
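To make the noise-tolerance point concrete, here's a minimal sketch in plain Python. Everything in it is made up for illustration: the tiny weight matrix, the `noisy_dot` and `classify` helpers, and the 5% error figure are my assumptions, not numbers from the video. Each multiply is perturbed by a bounded random error, which would be fatal for something like a checksum, but the "which score is biggest" answer stays the same.

```python
import random

def noisy_dot(weights, inputs, rel_error, rng):
    """Dot product where each multiply is off by up to +/- rel_error (relative)."""
    total = 0.0
    for w, x in zip(weights, inputs):
        exact = w * x
        # Hypothetical analog error model: bounded, uniform, per multiply.
        total += exact * (1.0 + rng.uniform(-rel_error, rel_error))
    return total

def classify(weight_rows, inputs, rel_error=0.0, seed=0):
    """Score every class and return (index of best score, all scores)."""
    rng = random.Random(seed)
    scores = [noisy_dot(row, inputs, rel_error, rng) for row in weight_rows]
    return scores.index(max(scores)), scores

# Toy 3-class "model" (illustrative values only).
weight_rows = [
    [0.9, 0.1, 0.0],
    [0.1, 0.8, 0.1],
    [0.0, 0.2, 0.7],
]
inputs = [1.0, 0.2, 0.1]

exact_label, exact_scores = classify(weight_rows, inputs)
noisy_label, noisy_scores = classify(weight_rows, inputs, rel_error=0.05)

print("exact:", exact_label, exact_scores)
print("noisy:", noisy_label, noisy_scores)
```

The individual scores drift, but the winning class doesn't, because the margin between classes here is much larger than the injected error. Real networks have much smaller margins in places, so treat this as the intuition only.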
Here's an Asianometry breakdown of silicon photonics:
https://www.youtube.com/watch?v=t0yj4hBDUsc
Silicon photonics is the use of light between transistors. It's been in research for decades and is already seeing limited use in some networking applications. IBM in particular has been researching this for a very long time in hopes of solving some chip communication issues, but there are still a lot of technical problems to solve before you can put billions of these things in a CPU.
AI changed the equation because it allows analog compute. A multiply generally takes 4-5 cycles, with each cycle doing a bunch of shift-then-add operations in series. With silicon photonics, this is as simple as turning on two emitters, merging the light, then recording the output. If you want to multiply 10 numbers together, you can do it in ONE cycle instead of roughly 40-50 on a normal chip (not including all the setup instructions likely needed by that normal multiplier circuit).
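For anyone who hasn't seen what "shift then add" means, here's a rough sketch of the textbook serial version in Python. The function name `shift_add_multiply` and the step counter are mine, and this is not how any particular chip does it: real multipliers are pipelined and use tricks like Booth encoding and adder trees, which is how they get down to a handful of cycles. It only shows the digital work that the optical approach is claimed to avoid.

```python
def shift_add_multiply(a, b):
    """Multiply two non-negative integers one bit of b at a time.

    Returns (product, steps), where steps counts shift/add iterations.
    """
    if a < 0 or b < 0:
        raise ValueError("this sketch handles non-negative integers only")
    product = 0
    steps = 0
    while b:
        if b & 1:          # lowest bit of b set: add the shifted copy of a
            product += a
        a <<= 1            # shift a left (times 2) for the next bit
        b >>= 1            # move on to the next bit of b
        steps += 1
    return product, steps

product, steps = shift_add_multiply(13, 11)
print(product, steps)  # 143 4
```

The number of iterations grows with the bit width of the operand, and chaining several multiplies repeats the whole thing each time, which is where the serial latency comes from.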
Here's a quick IBM explainer on in-memory compute.
https://www.youtube.com/watch?v=BTnr8z-ePR4
Basically, it takes several times more energy to move two numbers into a CPU than it does to add them together. Ohm's law allows us to do analog multiplication by connecting various resistors and measuring the output.
You can use this to do calculations, and the beauty is that your data hardly has to travel at all and you were already having to spend energy refreshing it frequently anyway. The clock speed is far lower due to the physical limitations of capacitors, but if you can compute on every single cell of a multi-terabyte matrix at the same time, that really doesn't matter: your aggregate compute throughput will be massively higher AND use several times less power.
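Here's a small idealized sketch of the Ohm's law trick, assuming a resistor crossbar layout (my assumption for illustration; the IBM video may describe a different cell design). Each stored weight is a conductance G, each input is a voltage V, each cell passes a current I = V * G, and the currents on a shared column wire simply add up (Kirchhoff's current law). The function name `crossbar_currents` and all the values are hypothetical, and the model ignores noise, wire resistance, and ADC limits.

```python
def crossbar_currents(conductances, voltages):
    """Idealized analog matrix-vector multiply.

    conductances[i][j]: conductance (siemens) linking input row i to output column j.
    voltages[i]: voltage (volts) applied to input row i.
    Returns the current (amperes) collected on each output column.
    """
    if len(conductances) != len(voltages):
        raise ValueError("need one voltage per row")
    n_cols = len(conductances[0])
    currents = [0.0] * n_cols
    for row, v in zip(conductances, voltages):
        if len(row) != n_cols:
            raise ValueError("all rows must have the same number of columns")
        for j, g in enumerate(row):
            # Ohm's law per cell (I = V * G); the shared wire does the summing.
            currents[j] += v * g
    return currents

# Illustrative values: 3 inputs, 2 outputs.
weights = [
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
]
inputs = [0.5, 1.0, 2.0]

outputs = crossbar_currents(weights, inputs)
print(outputs)  # [13.5, 17.0]
```

In software this is a nested loop, but in the physical array every cell "computes" at once and the weights never leave the memory, which is the whole point about not moving data.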
Of course, all these analog alternatives have absolutely nothing in common with modern GPUs, but simple operations are massively more power efficient with in-memory compute and complex operations are massively more power efficient with silicon photonics.