RapidInference9001

[email protected] · 1 point · 1 year ago

Memory bandwidth is down. M2 Pro had 200GB/s, M3 Pro only has 150GB/s. M3 Max only has 400GB/s on the higher binned part.

This really puzzles me. One of the impressive things about the M2 Max and Ultra was how good they were at running local LLMs and other AI models (for a part not made by Nvidia and only costing a few grand), mostly because of their high memory bandwidth, since bandwidth tends to be the limiting factor for LLM inference rather than raw GPU TFLOPS. So for LLM use, this is *really* shooting themselves in the foot. Guess I'd better buy an M2 Ultra Mac Studio before they get around to downgrading it to an M3 Ultra.
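To make the "bandwidth is the limit" point concrete: during single-stream decoding, every generated token has to stream essentially all the model weights through memory once, so tokens/sec is capped at roughly bandwidth divided by model size in bytes. A rough back-of-the-envelope sketch (the model size and quantization figures below are illustrative assumptions, not benchmarks):

```python
def max_tokens_per_sec(bandwidth_gb_s: float, params_b: float, bytes_per_param: float) -> float:
    """Bandwidth-bound ceiling on decode throughput: all weights are
    read from memory once per generated token."""
    model_bytes = params_b * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / model_bytes

# Hypothetical example: a 34B-parameter model quantized to ~4 bits
# (~0.5 bytes/param), on the bandwidth figures from the comment above.
for label, bw in [("M2 Pro (200 GB/s)", 200),
                  ("M3 Pro (150 GB/s)", 150),
                  ("M3 Max (400 GB/s)", 400)]:
    print(f"{label}: ~{max_tokens_per_sec(bw, 34, 0.5):.0f} tok/s ceiling")
```

By this estimate the M3 Pro's ceiling drops by the same 25% as its bandwidth, which is why the cut stings for local LLM use regardless of how the GPU cores improved.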