this post was submitted on 30 Jan 2025
8 points (90.0% liked)

LocalLLaMA

Community for discussing LLaMA, the large language model created by Meta AI.

This is intended to be a replacement for r/LocalLLaMA on Reddit.

Generate 5 thoughts, prune 3, branch from the survivors, repeat. I think that’s what o1 pro and o3 do.
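
To make the loop concrete, here is a minimal toy sketch of one possible reading of it: every node proposes 5 candidates, the 3 lowest-scoring ones are dropped, and each of the 2 survivors branches again. `generate_thoughts` and `score` are hypothetical placeholders (random numbers stand in for LLM-generated thoughts), and nothing here is confirmed to match how o1 pro or o3 actually work.

```python
import random

def generate_thoughts(path, k, rng):
    """Hypothetical stand-in for an LLM call: extend `path` with k candidates.
    A 'thought' is just a random float here so the sketch runs without a model."""
    return [path + [rng.random()] for _ in range(k)]

def score(path):
    """Hypothetical scorer: a higher sum of toy values means a 'better' chain."""
    return sum(path)

def expand(path, rng, k=5, prune=3):
    """Generate k candidate continuations and keep the best k - prune of them."""
    children = generate_thoughts(path, k, rng)
    children.sort(key=score, reverse=True)
    return children[:k - prune]

def search(depth=3, k=5, prune=3, seed=0):
    """Repeat generate -> prune -> branch for `depth` rounds, return the best chain."""
    rng = random.Random(seed)
    frontier = [[]]  # start from a single empty chain of thoughts
    for _ in range(depth):
        frontier = [child for path in frontier
                    for child in expand(path, rng, k, prune)]
    return max(frontier, key=score)

if __name__ == "__main__":
    print(search())
```

Note that keeping 2 survivors per node makes the frontier double every round, so a real setup would probably also need a global cap on the frontier size.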

[–] [email protected] 1 points 3 hours ago

Well, I think you actually need to train a "discriminator" model on rationality tests. Probably an encoder-only model like BERT, used just to assign a score to each thought. Then you run Monte Carlo tree search, using those scores to decide which branches to expand.
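
A rough sketch of what that could look like, under loud assumptions: `propose` is a hypothetical stand-in for the LLM that generates thoughts, and `discriminator_score` is a toy deterministic function standing in for the trained BERT-style scorer (no real model is loaded). It is also a simplified MCTS variant: leaves are scored directly by the discriminator instead of by random rollouts.

```python
import math
import random

class Node:
    def __init__(self, thoughts, parent=None):
        self.thoughts = thoughts  # chain of thoughts leading to this node
        self.parent = parent
        self.children = []
        self.visits = 0
        self.total = 0.0          # sum of scores backed up through this node

def propose(thoughts, k, rng):
    """Hypothetical stand-in for the LLM: k candidate next thoughts."""
    return [f"t{len(thoughts)}-{rng.randint(0, 99)}" for _ in range(k)]

def discriminator_score(thoughts):
    """Hypothetical stand-in for a trained scorer (e.g. an encoder-only model).
    Toy rule: a deterministic value in [0, 1] derived from the text."""
    return (sum(map(ord, " ".join(thoughts))) % 101) / 100

def ucb(child, c=1.4):
    """UCB1: average score plus an exploration bonus; unvisited nodes go first."""
    if child.visits == 0:
        return math.inf
    explore = math.sqrt(math.log(child.parent.visits) / child.visits)
    return child.total / child.visits + c * explore

def mcts(iterations=200, k=5, max_depth=3, seed=0):
    rng = random.Random(seed)
    root = Node([])
    for _ in range(iterations):
        # Selection: walk down by UCB until reaching a node with no children.
        node = root
        while node.children:
            node = max(node.children, key=ucb)
        # Expansion: only expand a leaf that has already been evaluated once.
        if node.visits > 0 and len(node.thoughts) < max_depth:
            node.children = [Node(node.thoughts + [t], node)
                             for t in propose(node.thoughts, k, rng)]
            node = node.children[0]
        # Evaluation: score the chain directly (no random rollout).
        value = discriminator_score(node.thoughts)
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.total += value
            node = node.parent
    return root

def best_path(root):
    """Follow the most-visited child at each level, skipping unvisited nodes."""
    node = root
    while True:
        visited = [ch for ch in node.children if ch.visits]
        if not visited:
            return node.thoughts
        node = max(visited, key=lambda ch: ch.visits)

if __name__ == "__main__":
    print(best_path(mcts()))
```

Swapping in a real scorer would only mean replacing `discriminator_score` with a call into the trained model; the search loop itself would stay the same.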