Just wondering, what AMD would need to do... to at least MATCH Nvidia's offering in AI/DLSS/ray tracing tech

[–] [email protected] 0 points 10 months ago (30 children)

Just wondering, what AMD would need to do...

They'd have to actually spend chip real estate on DL/RT.

So far they've been talking about using DL for gameplay instead of graphics. So no dedicated tensor units.

And their RT has mostly been there just to keep up with NV feature-wise. They did apparently enhance it somewhat in RDNA3. But NV isn't waiting for them either.

[–] [email protected] 0 points 10 months ago (13 children)

Second-gen AMD hardware ray tracing still has a worse performance impact than Intel's first-gen hardware ray tracing. No need to even bring up Nvidia here; they are miles ahead. Either AMD isn't willing to spend more resources on RT, or they aren't able to improve its performance.

[–] [email protected] 1 points 10 months ago (9 children)

AMD's RT hardware is intrinsically tied to the texture unit, which was probably a good decision at the start, since Nvidia kinda caught them with their pants down and they needed something fast to implement (especially with consoles looming overhead; they wouldn't want an entire console generation to lack any form of RT).

Now, though, I think it's giving them a lot of problems because it's really not a scalable design. I hope they eventually implement a proper dedicated unit like Nvidia and Intel have.

[–] [email protected] 1 points 10 months ago

There's not really anything intrinsically /wrong/ with tying it to the same latency-hiding mechanisms as the texture unit (nothing in the ISA /requires/ it to be implemented in the texture unit; more likely the texture unit already has the biggest read-bandwidth connection to the memory bus, so they may as well piggyback off it). I honestly wouldn't be surprised if Nvidia's units were implemented in a similar place, as the unit needs to be heavily integrated with the shader units while also having a direct fast path to memory reads.

One big difference is that Nvidia's unit can do a whole tree traversal with no shader interaction, while the AMD one just does a single node test and expansion, then needs the shader to queue the next level. This makes AMD's implementation great for hardware simplicity, and if there's always a shader scheduled that's doing a good mix of RT and non-RT instructions, it's not really much slower.
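
To make that concrete, here's a toy single-file C++ sketch of the two models. Everything in it is invented for illustration (it's host code, not real GPU shader code or either vendor's actual scheme, and a real traversal would sort children, do triangle tests, and so on):

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <vector>

struct Ray  { float org[3], inv_dir[3], t_max; };  // inv_dir = 1/direction
struct Node { float lo[3], hi[3]; std::int32_t left, right; };
// right < 0 marks a leaf; a leaf's 'left' holds its primitive id
struct Hit  { std::int32_t prim = -1; };

// One "node test": a plain ray/AABB slab test. Roughly the work a single
// hardware box-test instruction does before handing control back.
static bool test_box(const Node& n, const Ray& r) {
    float t0 = 0.0f, t1 = r.t_max;
    for (int a = 0; a < 3; ++a) {
        float tn = (n.lo[a] - r.org[a]) * r.inv_dir[a];
        float tf = (n.hi[a] - r.org[a]) * r.inv_dir[a];
        if (tn > tf) std::swap(tn, tf);
        t0 = std::max(t0, tn);
        t1 = std::min(t1, tf);
    }
    return t0 <= t1;
}

// "AMD-style": the shader owns the loop and the stack. Hardware does one
// box test per step, then the shader itself queues the next level.
Hit shader_driven_traverse(const std::vector<Node>& bvh, const Ray& r) {
    Hit hit;
    std::int32_t stack[64];
    int sp = 0;
    stack[sp++] = 0;                      // push the root
    while (sp > 0) {
        const Node& n = bvh[stack[--sp]];
        if (!test_box(n, r)) continue;    // one "node test" instruction...
        if (n.right < 0) {
            hit.prim = n.left;            // leaf (real code would run a
        } else {                          // ray/triangle test here)
            stack[sp++] = n.left;         // ...then the shader queues the
            stack[sp++] = n.right;        // children for the next step
        }
    }
    return hit;
}

// "Nvidia-style": one opaque request. The same loop runs, but inside a
// dedicated unit, so the shader issues a single call and waits for the hit.
Hit hw_traverse(const std::vector<Node>& bvh, const Ray& r) {
    return shader_driven_traverse(bvh, r);  // same algorithm, different owner
}

int main() {
    // Tiny 3-node tree: a root box split into two leaf children.
    std::vector<Node> bvh = {
        { {0, 0, 0}, {4, 4, 4},  1,  2 },  // root
        { {0, 0, 0}, {2, 4, 4}, 10, -1 },  // leaf, primitive 10
        { {2, 0, 0}, {4, 4, 4}, 11, -1 },  // leaf, primitive 11
    };
    // Ray from (3,2,2) along +x (1e9 stands in for 1/0 on the flat axes).
    Ray r = { {3, 2, 2}, {1.0f, 1e9f, 1e9f}, 100.0f };
    std::printf("hit prim %d\n", shader_driven_traverse(bvh, r).prim);  // 11
}
```

Both functions return the same answer; the difference is who runs the loop. When the shader owns it, every level of the tree costs an extra instruction issue and a trip through the shader's scheduler.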

But that doesn't really happen in the real world: the BVH lookups are normally all concentrated in an RT pass rather than spread across all the shaders in a frame. And that pass tends not to have enough other work on hand to fill the pipeline while waiting on the BVH lookups. If you're just sitting in a tight loop of BVH lookups, the round trip back to the shader merely to submit the next lookup breaks whatever pipelining or prefetching you might otherwise be able to do.
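
One way to picture that, continuing the toy sketch above (still invented code): interleaving several rays in one loop keeps an independent node test ready to issue while another ray's node fetch is outstanding, which is exactly the kind of extra work a lone RT pass tends not to have:

```cpp
// Appended to the sketch above (same toy types): round-robin over a batch of
// rays, one traversal step per ray per pass. While one ray's node data is
// conceptually "in flight", the other rays still have independent box tests
// to issue. A real GPU hides latency the same way by switching wavefronts,
// but an RT pass that is nothing but a serial chain of dependent BVH pops
// has no such independent work to switch to.
void traverse_batch(const std::vector<Node>& bvh,
                    const std::vector<Ray>& rays, std::vector<Hit>& hits) {
    struct State { std::int32_t stack[64]; int sp; };
    const int n_rays = (int)rays.size();
    std::vector<State> st(n_rays);
    for (auto& s : st) { s.sp = 0; s.stack[s.sp++] = 0; }  // push each root
    hits.assign(n_rays, Hit{});
    bool pending = true;
    while (pending) {
        pending = false;
        for (int i = 0; i < n_rays; ++i) {   // one step per ray, round robin
            if (st[i].sp == 0) continue;     // this ray has finished
            pending = true;
            const Node& n = bvh[st[i].stack[--st[i].sp]];
            if (!test_box(n, rays[i])) continue;
            if (n.right < 0) {
                hits[i].prim = n.left;       // leaf
            } else {
                st[i].stack[st[i].sp++] = n.left;
                st[i].stack[st[i].sp++] = n.right;
            }
        }
    }
}
```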

But it might also be more flexible: anything that looks a bit like a BVH could do fun things with the BVH-lookup/triangle-ray-intersection instructions beyond ray tracing (something like the overlap query sketched below). There simply doesn't seem to be a glut of use cases for that as-is, though, and unused flexibility is just inefficiency, after all.
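
A toy example of that flexibility, again just appended to the sketch above (invented code, not any real API):

```cpp
// A broad-phase overlap query that walks the identical tree but tests a box
// instead of a ray. This only works because the shader owns the loop; a
// fixed-function "trace a ray, return the closest triangle" unit couldn't
// be repurposed like this.
static bool boxes_overlap(const Node& n, const float lo[3], const float hi[3]) {
    for (int a = 0; a < 3; ++a)
        if (n.hi[a] < lo[a] || n.lo[a] > hi[a]) return false;
    return true;
}

std::vector<std::int32_t> overlap_query(const std::vector<Node>& bvh,
                                        const float lo[3], const float hi[3]) {
    std::vector<std::int32_t> prims;
    std::int32_t stack[64];
    int sp = 0;
    stack[sp++] = 0;                       // push the root
    while (sp > 0) {
        const Node& n = bvh[stack[--sp]];
        if (!boxes_overlap(n, lo, hi)) continue;
        if (n.right < 0) {
            prims.push_back(n.left);       // leaf: record the primitive
        } else {
            stack[sp++] = n.left;
            stack[sp++] = n.right;
        }
    }
    return prims;
}
```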
