Hardware

A place for quality hardware news, reviews, and intelligent discussion.

founded 1 year ago

Tom's Hardware: "Intel's next-gen Arrow Lake GPU will have new Xe-LPG Plus Architecture with XMX" (www.tomshardware.com)

submitted 11 months ago by [email protected] to c/[email protected]

26 comments

[–] [email protected] 1 points 11 months ago (1 children)

bfloat16

I have no idea why this is important or really what it even is but Apple had a pretty video about it in their WWDC catalog so I guess it's a trend now.

[–] [email protected] 1 points 11 months ago (1 children)

Standard float16 uses 1 sign bit + 5 exponent bits + 10 fraction bits.

bfloat16 uses 1 sign bit + 8 exponent bits + 7 fraction bits.

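bfloat16 basically gives you the same exponent range as a standard float32 (both use 8 exponent bits), so it covers the same dynamic range. What it gives up is fraction precision: 7 bits instead of float32's 23. Most neural networks don't need that much fraction precision, so the trade is usually fine. And since a bfloat16 is half the size of a float32, you can fit 2 bfloat16 values in the space of 1 float32, which means up to 2x the FLOPs for the same datapath width.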
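If it helps to see the layout, here is a rough Python sketch of the idea: a bfloat16 is just the top 16 bits of a float32. The function names are made up for illustration, and the conversion truncates instead of rounding to nearest, which real hardware and libraries may handle differently.

```python
import struct

def float32_bits(x: float) -> int:
    """Return the 32-bit IEEE 754 single-precision pattern of x."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def to_bfloat16_bits(x: float) -> int:
    """Keep the top 16 bits of the float32 pattern (truncation, an assumption)."""
    return float32_bits(x) >> 16

def from_bfloat16_bits(b: int) -> float:
    """Widen a bfloat16 pattern back to float32 by zero-filling the low 16 bits."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]

def split_fields(b: int) -> tuple:
    """Split a bfloat16 pattern into (sign, exponent, fraction) = 1 + 8 + 7 bits."""
    return (b >> 15) & 0x1, (b >> 7) & 0xFF, b & 0x7F

if __name__ == "__main__":
    for value in (1.0, 3.14159, 1e38):
        bits = to_bfloat16_bits(value)
        print(value, hex(bits), split_fields(bits), from_bfloat16_bits(bits))
```

The 1e38 case is the interesting one: it survives the round trip with only a small relative error, while float16 tops out at 65504.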

Having the ALU support this format lets the hardware pack 4 bfloat16 values into a standard 64-bit ALU and operate on them in parallel. So basically you get up to double the FLOPs of float32 (2 values per 64 bits) and quadruple those of float64 (1 value per 64 bits). float16 packs just as densely, so against float16 the gain is exponent range rather than throughput.
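To make the packing concrete, here is a small Python sketch that stores four bfloat16 values in one 64-bit integer, one per 16-bit lane. It only illustrates storage density. It says nothing about how Arrow Lake or any real ALU schedules the lanes, and the helper names and lane order are assumptions for the example.

```python
import struct

def bf16_bits(x: float) -> int:
    """bfloat16 pattern of x, taken as the top 16 bits of its float32 pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0] >> 16

def bf16_value(b: int) -> float:
    """Widen a bfloat16 pattern back to a float by zero-filling the low 16 bits."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]

def pack4(values) -> int:
    """Pack four floats as bfloat16 lanes into one 64-bit word (lane 0 = low bits)."""
    if len(values) != 4:
        raise ValueError("expected exactly four values")
    word = 0
    for lane, v in enumerate(values):
        word |= bf16_bits(v) << (16 * lane)
    return word

def unpack4(word: int) -> list:
    """Recover the four lanes of a 64-bit word as floats."""
    return [bf16_value(word >> (16 * lane)) for lane in range(4)]

if __name__ == "__main__":
    word = pack4([1.0, 2.0, -1.0, 0.5])
    print(hex(word), unpack4(word))
```

The same 64 bits would hold only two float32 values or a single float64.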

[–] [email protected] 1 points 11 months ago

Makes sense. Thanks for that!