Apparently it implements chain-of-thought, which either means they changed the RHFL dataset to force it to explain its 'reasoning' when answering or to do self questioning loops, or that it reprompts itsefl multiple times behind the scenes according to some heuristic until it synthesize a best result, it's not really clear.
Can't wait to waste five pools of drinkable water to be told to use C# features that don't exist, but at least it got like 25.2452323760909304593095% better at solving math olympiads as long as you allow it a few tens of tries for each question.
Strong they cut all my deadlines in half and gave me an OpenAI API key, so fuck it energy.
You don't say.