this post was submitted on 14 Nov 2023

[–] [email protected] 1 points 10 months ago (2 children)

What exactly did the US export rules limit? FP64 performance?

[–] [email protected] 1 points 10 months ago (1 children)

FP8 performance per unit of die area, apparently.

The 4090 has 660 FP8 TFLOPS (which is insane when you think about it) and got the ban.

The H100 is 1930 FP8 TFLOPS.

With sparsity, both can do up to 2x that.

The 4090 has more FP32 performance than the H100, though.

[–] [email protected] 1 points 10 months ago

It's not specifically FP8, but TOPS × data size (bit width). The absolute limit is 4800, or 5.8 per mm² of die area. Above either is an outright ban. Above half of either needs a license.
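
As a rough sketch of the thresholds described in this comment: the limits below are the commenter's numbers and have not been checked against the regulation text, and `export_status` and the constant names are hypothetical, chosen only for illustration.

```python
# Thresholds as quoted in the comment; NOT verified against the actual rule.
TPP_LIMIT = 4800.0      # TOPS x data size (bit width)
DENSITY_LIMIT = 5.8     # the same product per mm^2 of die area


def export_status(tpp: float, density: float) -> str:
    """Classify a chip using the comment's simplified reading of the rule."""
    # Above either absolute limit: outright ban.
    if tpp > TPP_LIMIT or density > DENSITY_LIMIT:
        return "banned"
    # Above half of either limit: a license is needed.
    if tpp > TPP_LIMIT / 2 or density > DENSITY_LIMIT / 2:
        return "license required"
    return "unrestricted"
```

For example, a part at 660 FP8 TFLOPS gives 660 × 8 = 5280, which is over 4800, so `export_status(5280, 1.0)` returns `"banned"` regardless of the density value.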

[–] [email protected] 1 points 10 months ago

It's slightly complex, as there are two metrics a chip needs to stay under.

See this chart:

https://cdn.mos.cms.futurecdn.net/dHjnhPMk93HuDPBYnXBzLV.png

Total Processing Performance (TPP) is essentially the listed processing power multiplied by the bit length of the operation (e.g., FLOPS or TOPS × 8/16/32/64), without sparsity.

Performance Density is calculated by dividing TPP by the die area, measured in square millimeters.
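
To make the two definitions concrete, here is a minimal sketch that applies them to the FP8 figures quoted earlier in the thread. The function names are hypothetical, and the die areas (roughly 608 mm² for the 4090 and 814 mm² for the H100) are assumptions taken from commonly reported figures, not from this thread.

```python
def total_processing_performance(dense_tops: float, bit_length: int) -> float:
    """TPP: dense (non-sparse) TOPS/TFLOPS multiplied by the operation's bit length."""
    return dense_tops * bit_length


def performance_density(tpp: float, die_area_mm2: float) -> float:
    """Performance density: TPP divided by die area in square millimeters."""
    return tpp / die_area_mm2


# (name, dense FP8 TFLOPS from the thread, bit length, ASSUMED die area in mm^2)
chips = [
    ("RTX 4090", 660.0, 8, 608.0),
    ("H100", 1930.0, 8, 814.0),
]

for name, tflops, bits, area in chips:
    tpp = total_processing_performance(tflops, bits)
    density = performance_density(tpp, area)
    print(f"{name}: TPP = {tpp:.0f}, density = {density:.2f} per mm^2")
```

Under these assumptions both chips land well above the 4800 TPP figure mentioned in the thread, which is consistent with the 4090 being caught by the rule despite being a consumer card.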