to echo others, it's not what, but how.
cpus do execution reordering and speculation to run one thread really fast. gpus have mostly avoided that and execute threads in large groups called "warps" (analogous to lanes of a SIMD unit).
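to make the warp/SIMD-lane analogy concrete, here's a toy Python sketch of lockstep ("SIMT") execution with branch divergence handled by masking. the warp size of 32 matches NVIDIA hardware; everything else (the branch, the data) is invented purely for illustration:

```python
# Toy model of lockstep execution: all 32 threads in a warp step through
# the same instruction stream, and a divergent branch is handled by
# masking lanes off rather than by per-thread control flow.
WARP_SIZE = 32

def run_warp(data):
    """Execute 'if x is odd: x*3 else: x+1' across one warp, SIMT-style."""
    assert len(data) == WARP_SIZE
    mask = [x % 2 == 1 for x in data]        # which lanes take the branch
    # pass 1: only the odd lanes are active; the even lanes sit idle
    out = [x * 3 if m else x for x, m in zip(data, mask)]
    # pass 2: the other lanes run -- both sides of the branch cost time,
    # which is why divergence within a warp is expensive
    out = [x + 1 if not m else x for x, m in zip(out, mask)]
    return out

print(run_warp(list(range(32)))[:4])  # [1, 3, 3, 9]
```

the point of the sketch: a single warp can't speculate or reorder its way around a branch the way a big OoO CPU core can; it just runs both paths with lanes masked.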
this has been my take, it's an obvious case of the 80-20 rule. During the times of breakthrough/flux, NVIDIA benefits from having both the research community onboard as well as a full set of functionality and great tooling etc. when things slow back down you'll see google come out with a new TPU and amazon will have a new graviton etc.
it's not that hard in principle to staple an accelerator to an ARM core, actually that's kind of a major marketing point for ARM. And nowadays you'd want an interconnect too. There are a decently large number of companies who can sustain such a thing at reasonably market-competitive prices. So once the market settles, the margins will decline.
On the other hand, if you are building large, training-focused accelerators etc... it is also going to be a case of convergent evolution. In the abstract, we are talking about massively parallel accelerator units with some large memory subsystem to keep them fed, and some type of local command processor to handle the low-level scheduling and latency-hiding. Which, gosh, sounds like a GPGPU.
If you are giving it any degree of general programmability then it just starts to look very much like a GPU. If you aren't, then you risk falling off the innovation curve the next time someone has a clever idea, just like previous generations of "ASICs". And you are doing your tooling and infrastructure and debugging all from scratch too, with much less support and resources. GPGPU is turnkey at this stage, do you want your engineers building CUDA or do you want them building your product?
that's what I said, the memory bandwidth is already baked into the numbers you see. the cache increases mean that you don't need as much actual memory bandwidth - it's the same thing AMD did with RDNA2.
AMD reduced the memory bus by 25% on the 6700XT relative to its predecessor and 33% on the 6600XT relative to its predecessor, so, if you think that will cause those cards to age more poorly...
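the bus-width cuts above work out like this; the bus widths are the real spec numbers (256-bit 5700 XT vs 192-bit 6700 XT), but the cache hit rate and the simple "effective bandwidth" model are my own illustrative assumptions, not AMD's figures:

```python
# back-of-envelope: a narrower bus can still feed the GPU if a large
# on-die cache absorbs enough of the traffic (AMD's Infinity Cache idea).
def effective_bw(dram_bw_gbs, cache_hit_rate):
    # only cache misses actually touch DRAM, so the DRAM bus can sustain
    # roughly dram_bw / miss_rate worth of total memory requests
    return dram_bw_gbs / (1.0 - cache_hit_rate)

bus_5700xt, bus_6700xt = 256, 192             # bits
print(1 - bus_6700xt / bus_5700xt)            # 0.25 -> the 25% cut
# 384 GB/s is the 6700 XT's raw bandwidth; the 50% hit rate is made up
print(effective_bw(384.0, 0.5))               # 768.0 GB/s "effective"
```

which is why the raw bandwidth number alone undersells these cards, and also why the benchmark figures (which include the cache) are the fair comparison.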
4060 Ti is objectively a bad product.
3060 Ti is literally better.
it literally is not, 4060 Ti is 11% faster at 1080p, 9% faster at 1440p, and 6% faster at 2160p.
the reduced memory bandwidth is already baked into these performance figures, and apart from some edge-cases like emulated PS3/wii at 16K resolution the 4060 Ti is still generally a faster card. not that much faster, but, it's not slower either.
Also no idea why Steve said the 6800XT is faster than the 7800XT? It's clearly not.
i'd prefer to look at the meta-reviews rather than any one reviewer or any one set of games, but, yea, you're right, 7800XT is ~5% faster than 6800XT, it is factually incorrect to say it's slower.
I don't know why everyone seems to have collectively decided that it's slower, same for the 4060 and 4060 Ti which are both faster than the 3060 and 3060 Ti (respectively) at relevant resolutions. Maybe not as much faster as people would like to see, but they literally perform faster in spite of the memory bandwidth reductions etc.
ironically COD did the "cold war gone hot" thing not too long ago, lol
I actually think smaller-scale conflicts would be a good fit for battlefield gameplay. the series has eternally struggled to balance aircraft: having jet fighters boom-and-zoom and then go repair in the endzone where they're untouchable is no good, they just don't couple to the battlefield very well. even tanks and helicopters carry risk, but planes just fly away and go repair. and if you make them weaker then they're not any good.
(it's very similar to sniper rifles in the sense that sniper rifles either 1-hit you and then they're not fun for anyone else, or they require multiple hits and then that's just not good compared to DMR/etc which allow you to spam shots and achieve generally lower TTKs on average if you assume one or two misses.)
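the sniper-vs-DMR point can be sketched with a rough expected time-to-kill model. all the numbers here (hits to kill, accuracy, fire intervals) are invented for illustration, not taken from any actual battlefield title:

```python
# rough expected-TTK model: if each shot hits with probability p, the
# expected number of shots to land h hits is h/p (geometric expectation),
# and the first shot fires at t=0.
def expected_ttk(hits_to_kill, accuracy, shot_interval_s):
    expected_shots = hits_to_kill / accuracy
    return (expected_shots - 1) * shot_interval_s

# bolt-action that needs 2 hits vs. a DMR that needs 3 but fires fast
sniper = expected_ttk(2, 0.5, 1.3)   # slow follow-up shots hurt a lot
dmr    = expected_ttk(3, 0.7, 0.25)  # misses are cheap to make up
print(round(sniper, 2), round(dmr, 2))  # 3.9 0.82
```

i.e. once you assume a miss or two, the weapon that can spam shots wins on average TTK, so a multi-hit sniper rifle ends up strictly worse than the DMR.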
but if you do smaller-scale conflicts, then airplanes can be older slower stuff like harriers or a-6 intruders, or propeller aircraft, and helicopters, etc. if planes can't just disappear over the battlefield in 5 seconds flat, then that's more of a chance for people on the ground to actually coordinate against them and gets you away from the "sniper-rifle problem".
we've seen some people modding cards to clamshell mode, apparently there is nothing burned into the core itself that determines whether it's a quadro or a 4090, or whether a 4090 should have 24GB or 48GB, just a resistor array on the PCB itself. So if you resolder it onto a new PCB with twice the RAM chips, you can make it a "4090 48GB", or even make it into a quadro.
this has been around for a while, I remember people doing this to turn 780s into titans/780 tis into titan black, but normally they weren't adding more RAM capacity, just trying to get the ECC working and stuff, they'd just mod a couple resistors and boom it reports as quadro.
In some senses you end up with convergent design, it's not a GPU, it's just a control system that commands a bunch of accelerator units with a high-bandwidth memory subsystem. But that could be ARM and an accelerator unit etc. Probably need fast networking.
But it's overall a crazy proposition to me. Like first off goog and amazon are gonna beat you to market on anything that looks good, and you have no real moat other than "I'm sam altman", and really there's no market penetration of the thing (or support in execution let alone actual research) etc. Training is a really hard problem to solve because right now it's absolutely firmly rooted in the CUDA ecosystem. Supposedly there may be a GPU Ocelot thing once again at some point but like, everyone just works with nvidia because they're the gpgpu ecosystem that matters.
Like, if you wanted to do this, you'd do what Tesla did and have Jim Keller design you a big fancy architecture for training fast at scale (Dojo). I guess they walked away from it or something and just didn't care anymore? Oops.
But, that's the problem, it's expensive to stay at the cutting edge. It's expensive to get the first chip, and you'll be going against competitors who have the scale to make their own in-house anyway. it's a crazy business decision to be throwing yourself on the silicon treadmill against intense competition just to give nvidia the finger. wack, hemad.
I was waiting for this to resurface in my recommendations
It's not just about what you need today, it's also about what you need in a couple years.
I think this is a real tough argument even in the high-end monitor market. isn't your $700 or $1200 or $2500 or $3500 going to get you more in 2 years if you wait?
why not wait to see what the monitor market has to offer when nvidia has cards to drive them?
it literally is the ironic mirror image of AMD's tech holding back the consoles. just a funny coincidence of fate, funny reversal.
What is different about AMD’s W7000 such that they can offer higher DP standards support than the RX 7000 consumer cards?
Talk about burying the lede in the last segment. Asus isn’t using the official connector and every other vendor thinks their connector is risky and probably defective. That’s not on nvidia, other than allowing it (and this is the reason why they ride partners’ asses sometimes on approval/etc).
The rest of the stuff is Igor still grinding the same old axe (pretty sure astron knows how to make a connector, if the connector is so delicate it would be broken by GN’s physical testing, etc) but if asus isn’t using the official connector and they’re disproportionately making up a huge number of the failures, that’s really an asus problem.