Congrats on being that guy
You’re aware that there’s an official OpenAI Python library, right? https://github.com/openai/openai-python
It’s really nothing fancy, especially on Lemmy where like 99% of people are software engineers…
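For anyone curious, a chat completion is basically this whole snippet (a minimal sketch with the v1 client; the model name is just an example):

```python
# Minimal sketch with the openai-python v1 client; model name is an example.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Say hi in one sentence."}],
)
print(resp.choices[0].message.content)
```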
Are you drunk?
Yeah, I found some stats now, and indeed you’re gonna wait like an hour for prompt processing if you throw 80-100k tokens at a powerful local model. With APIs that kinda works instantly; not surprising, but just to give a comparison. Bummer.
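If you want to sanity-check that, here’s the back-of-the-envelope math (the throughput number is an assumption for a big model off CUDA, not a benchmark):

```python
# Back-of-the-envelope check; the throughput figure is assumed, not measured.
prompt_tokens = 90_000   # middle of the 80-100k range
tokens_per_sec = 25      # assumed prompt-processing speed for a large local model

hours = prompt_tokens / tokens_per_sec / 3600
print(f"~{hours:.1f} h of prompt processing")  # -> ~1.0 h
```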
Thanks! Hadn’t thought of YouTube at all, but it’s super helpful. I guess that’ll help me decide if the extra RAM is worth it, considering that inference will be much slower if I don’t go NVIDIA.
Yeah, I was thinking about running something like Code Qwen 72B, which apparently requires 145GB of RAM to run the full model. But if it’s super slow, especially with large contexts, and I can only run small models at acceptable speed anyway, it may be worth going NVIDIA for CUDA alone.
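That 145GB figure lines up with plain fp16 weights; rough sketch of where it comes from (the 4-bit number is a loose estimate, ignoring KV cache and runtime overhead):

```python
# 72B params at 2 bytes each (fp16/bf16), before KV cache and overhead.
params = 72e9
fp16_gb = params * 2 / 1e9
q4_gb = params * 0.5 / 1e9   # rough 4-bit quant estimate (~0.5 byte/param)

print(f"fp16:  {fp16_gb:.0f} GB")  # -> 144 GB, matching the cited figure
print(f"4-bit: {q4_gb:.0f} GB")    # -> 36 GB, plus some overhead in practice
```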
Proud of you. Did it a long time ago. Would do it again too.
Seems like that extra $150 million in hasbara money is already being put to good use, judging from your genocide denial and post history.
There are examples of people who would rather go to jail than participate in genocide. It’s possible; the majority of society just doesn’t want to resist the system. Most people are more okay with participating in genocide than with facing the consequences of doing the right thing. In Nazi Germany they would have been the ones who were “just following orders”.
Meh, ofc I don’t.
Thanks, that’s very helpful! Will look into that type of build.
Thanks for the reply, still reading here. Yeah, thanks to the comments and reading some benchmarks I abandoned the idea of getting an Apple; it’s just too slow.
I was hoping to test Qwen 32B or Llama 70B for running longer contexts, hence the Apple seemed appealing.
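Worth noting that long contexts eat RAM on top of the weights. A rough KV-cache sketch, assuming a Llama-3-70B-like config (80 layers, GQA with 8 KV heads, head_dim 128, fp16 cache):

```python
# KV-cache estimate; config values assume a Llama-3-70B-like architecture.
layers, kv_heads, head_dim, dtype_bytes = 80, 8, 128, 2
ctx = 32_768  # example long context

kv_gib = 2 * layers * kv_heads * head_dim * dtype_bytes * ctx / 2**30  # K and V
print(f"~{kv_gib:.0f} GiB of KV cache")  # -> ~10 GiB on top of the weights
```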