[–] [email protected] 1 points 11 months ago (1 children)

XMX extensions are designed for matrix multiplication operations in FP64, FP32, FP16, and bfloat16 formats, which are crucial in many AI algorithms, including neural networks

From the article.

[–] [email protected] 1 points 11 months ago (1 children)

bfloat16

I have no idea why this is important or really what it even is, but Apple had a pretty video about it in their WWDC catalog, so I guess it's a trend now.

[–] [email protected] 1 points 11 months ago (1 children)

Standard float16 uses a 1-bit sign + 5-bit exponent + 10-bit fraction.

bfloat16 uses a 1-bit sign + 8-bit exponent + 7-bit fraction.

bfloat16 basically gives you the same 8-bit exponent as a standard float32, so it covers the same dynamic range of values. But most neural networks don't need much fraction precision. So bfloat16 lets you execute 2x FLOPs in the same register width and memory bandwidth where a float32 would only give you 1x.
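
To make that concrete, here's a quick Python sketch (just an illustration, not how the hardware actually does it) showing that a bfloat16 is literally the top 16 bits of a float32, so the sign and 8-bit exponent survive and only fraction bits get dropped:

```python
import struct

def float32_to_bfloat16_bits(x: float) -> int:
    # Reinterpret the float32 as a 32-bit integer, then keep the top
    # 16 bits: 1 sign + 8 exponent + 7 fraction (truncation, no rounding).
    bits32 = struct.unpack("<I", struct.pack("<f", x))[0]
    return bits32 >> 16

def bfloat16_bits_to_float32(b: int) -> float:
    # Zero-fill the low 16 fraction bits to get a float32 back.
    return struct.unpack("<f", struct.pack("<I", b << 16))[0]

x = 3.14159265
b = float32_to_bfloat16_bits(x)
print(hex(b), bfloat16_bits_to_float32(b))
# The sign and exponent (i.e. the magnitude) are preserved exactly;
# only low-order fraction precision is lost (3.14159... -> 3.140625).
```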

Having the ALU support this format lets the scheduler pack 4x bfloat16 values into a standard 64-bit ALU and execute them in parallel. So you basically double the per-cycle FLOPs compared to float32 (and quadruple them compared to FP64), while keeping far more dynamic range than a traditional float16 would give you.
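
And a rough sketch of the packing idea, again just illustrative Python rather than anything the hardware or scheduler literally runs: four bfloat16 lanes fit in one 64-bit word where only two float32 lanes would:

```python
import struct

def to_bf16(x: float) -> int:
    # float32 -> bfloat16 bit pattern (truncate the low 16 bits).
    return struct.unpack("<I", struct.pack("<f", x))[0] >> 16

def from_bf16(b: int) -> float:
    # bfloat16 bit pattern -> float32 (zero-fill the low 16 bits).
    return struct.unpack("<f", struct.pack("<I", b << 16))[0]

def pack4_bf16(values) -> int:
    # Pack four 16-bit bfloat16 lanes into a single 64-bit word.
    assert len(values) == 4
    word = 0
    for i, x in enumerate(values):
        word |= to_bf16(x) << (16 * i)
    return word

def unpack4_bf16(word: int):
    # Split the 64-bit word back into its four bfloat16 lanes.
    return [from_bf16((word >> (16 * i)) & 0xFFFF) for i in range(4)]

w = pack4_bf16([1.0, -2.5, 3.14159, 65504.0])
print(f"0x{w:016x}", unpack4_bf16(w))
# One 64-bit word carries 4 bfloat16 lanes vs only 2 float32 lanes,
# which is where the throughput win comes from.
```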

[–] [email protected] 1 points 11 months ago

Makes sense. Thanks for that!