this post was submitted on 31 Jul 2023
3 points (100.0% liked)

The AI Community On Kbin


Welcome to m/ArtificialIntelligence, the place to discuss all things related to artificial intelligence, machine learning, deep learning, natural language processing, computer vision, robotics, and more. Whether you are a researcher, a developer, a student, or just a curious person, you can find here the latest news, articles, projects, tutorials, and resources on AI and its applications. You can also ask questions, share your ideas, showcase your work, or join the debates and challenges. Please follow the rules and be respectful to each other. Enjoy your stay!

founded 1 year ago

"We are about to train models that are 10 times larger than the cutting edge GPT-4 and then 100 times larger than GPT-4. That’s what things look like over the next 18 months."

1 comment
[–] [email protected] 2 points 1 year ago* (last edited 1 year ago)

Apparently Inflection AI has bought 22,000 H100 GPUs. The H100 has roughly 4x the transformer compute of the A100. GPT-4 is rumored to be 10x larger than GPT-3, and GPT-3 takes approximately 34 days to train on 1024 A100 GPUs.

So with 22,000*4/1024 = 85.9375x more compute than that 1024-A100 cluster, and assuming training FLOPs scale roughly linearly with model size at a fixed token budget, they could easily train a model 10x GPT-4's size (i.e. ~100x GPT-3) in 1-2 months. Getting to 100x GPT-4's size would also be feasible, but that likely banks on the claimed 3x speedup from FlashAttention-2, which would put the run at about 6 months of training.
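
A minimal sketch of that back-of-envelope arithmetic (my own illustration, not from the comment): it takes the comment's figures at face value (22,000 H100s, ~4x A100 throughput, ~34 days for GPT-3 on 1024 A100s) and assumes training FLOPs grow roughly linearly with parameter count at a fixed token budget. The `train_days` helper is made up for illustration.

```python
# Back-of-envelope sketch of the comment's scaling arithmetic.
# All inputs are the rumored/approximate figures from the comment; the added
# assumption is that training FLOPs scale ~linearly with parameter count
# at a fixed token budget.

A100_CLUSTER = 1024        # A100s behind the ~34-day GPT-3 estimate
GPT3_TRAIN_DAYS = 34       # approximate GPT-3 training time on that cluster
H100_COUNT = 22_000        # H100s Inflection AI reportedly bought
H100_VS_A100 = 4           # rough per-GPU speedup on transformer workloads

# Effective compute relative to the 1024-A100 reference cluster (~85.94x)
compute_ratio = H100_COUNT * H100_VS_A100 / A100_CLUSTER

def train_days(size_vs_gpt3: float, software_speedup: float = 1.0) -> float:
    """Days to train a model `size_vs_gpt3` times GPT-3's size on the H100 cluster."""
    return GPT3_TRAIN_DAYS * size_vs_gpt3 / (compute_ratio * software_speedup)

# GPT-4 is taken as ~10x GPT-3, so 10x GPT-4 ~ 100x GPT-3 and 100x GPT-4 ~ 1000x GPT-3.
print(f"10x GPT-4:  {train_days(100):.0f} days")        # ~40 days ("1-2 months")
print(f"100x GPT-4: {train_days(1000):.0f} days")       # ~396 days without speedups
print(f"100x GPT-4 with FlashAttention-2 (3x): {train_days(1000, 3):.0f} days")  # ~132 days
```

The last figure comes out somewhat under the comment's "about 6 months", which presumably leaves room for real-world overheads on top of the idealized 3x speedup.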

It's crazy that these scales and timelines seem plausible.
