noneabove1182

joined 2 years ago
[–] noneabove1182 1 points 1 year ago

The significance is that we have a new file format standard. The bad news is that it breaks compatibility with the old format, so you'll have to update to use newer quants and you can't use your old ones.

The good news is that this is the last time that'll happen (it's happened a few times so far), as this one is meant to be a lot more extensible and flexible, storing a ton of extra metadata for better compatibility.
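
Assuming the new format in question is GGUF (the container llama.cpp moved to around the time Falcon support landed), you can tell old and new quants apart just by peeking at the file header. This is only a rough sketch, not an official tool, and anything past the magic bytes and version field is going off the GGUF spec as I understand it:

```python
import struct

def inspect_quant(path: str) -> None:
    """Print whether a quant file looks like the new GGUF container."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            # Old ggml-era quants use different magics, so this is likely a pre-GGUF file
            print(f"{path}: not GGUF (magic={magic!r}), probably an old-format quant")
            return
        (version,) = struct.unpack("<I", f.read(4))
        print(f"{path}: GGUF version {version}; the extra metadata key/value pairs follow the header")

inspect_quant("model.gguf")  # placeholder path
```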

The great news is that this paves the way for better model support as we've seen already with support for falcon being merged: https://github.com/ggerganov/llama.cpp/commit/cf658adc832badaaa2ca119fe86070e5a830f8f6

[–] noneabove1182 3 points 1 year ago

I hate when they do that so much too lol

[–] noneabove1182 1 points 1 year ago* (last edited 1 year ago) (1 children)

Thanks for the comment! Yes, this is meant more for your personal projects than for use in existing projects.

As for needing a password to get a password: totally understand. My main goal was local encrypted storage. The nice thing about this implementation is that you can keep all your env files saved and shared in your git repo for all devs to have access to, but they can only be decrypted with the master password shared elsewhere (Keeper, Vault, etc.), so you don't have to load all values from a vault, just the master.

100% though, this doesn't cover a large range of usage, hence the name "simple" haha. I wouldn't be opposed to expanding it, but I think it covers my proposed use cases as-is.
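
For anyone curious what the "master password unlocks the committed env files" idea looks like in practice, here's a minimal sketch using Python's cryptography library (PBKDF2 + Fernet). It's not the project's actual implementation, just the general shape; the file names and iteration count are arbitrary:

```python
import base64
import os

from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC


def derive_key(master_password: str, salt: bytes) -> bytes:
    """Turn the shared master password into a Fernet key."""
    kdf = PBKDF2HMAC(algorithm=hashes.SHA256(), length=32, salt=salt, iterations=480_000)
    return base64.urlsafe_b64encode(kdf.derive(master_password.encode()))


def encrypt_env(path: str, master_password: str) -> None:
    """Encrypt an env file so the ciphertext can be committed to the repo."""
    salt = os.urandom(16)
    token = Fernet(derive_key(master_password, salt)).encrypt(open(path, "rb").read())
    with open(path + ".enc", "wb") as out:
        out.write(salt + token)  # store the salt alongside the ciphertext


def decrypt_env(enc_path: str, master_password: str) -> bytes:
    """Recover the plaintext env contents given only the master password."""
    blob = open(enc_path, "rb").read()
    salt, token = blob[:16], blob[16:]
    return Fernet(derive_key(master_password, salt)).decrypt(token)


if __name__ == "__main__":
    encrypt_env(".env", "the-master-password")          # commit .env.enc, not .env
    print(decrypt_env(".env.enc", "the-master-password").decode())
```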

[–] noneabove1182 1 points 1 year ago

Sure, it's a simplistic view; I meant it more that you can guide it towards completing a sentence, but you're right that it's worth recognizing what's actually going on!

[–] noneabove1182 1 points 1 year ago (1 children)

It is interesting how you interpreted the question, though. I think the principle of "rate limiting" is playing in my favour here: typically when you rate limit something you don't throw it into a queue, you deny it and wait for the next request (think APIs).
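
To make the distinction concrete, here's a toy deny-style limiter (sliding window, purely illustrative): when the caller is over the limit it gets refused immediately, the way an API would return a 429, rather than having the call parked in a queue:

```python
import time


class RateLimiter:
    """Deny-style rate limiter: over-limit calls are rejected, not queued."""

    def __init__(self, max_calls: int, per_seconds: float):
        self.max_calls = max_calls
        self.per_seconds = per_seconds
        self.calls: list[float] = []  # timestamps of recently accepted calls

    def allow(self) -> bool:
        now = time.monotonic()
        # Drop timestamps that have fallen out of the window
        self.calls = [t for t in self.calls if now - t < self.per_seconds]
        if len(self.calls) >= self.max_calls:
            return False  # denied -- the caller retries later, nothing is queued
        self.calls.append(now)
        return True


limiter = RateLimiter(max_calls=5, per_seconds=60)
for i in range(7):
    print(i, "accepted" if limiter.allow() else "denied")
```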

[–] noneabove1182 2 points 1 year ago (2 children)

Your best bet is likely going to be editing the original prompt to add information until you get the right output. However, you can also get clever with it and add to the response of the model itself. Remember, all it's doing is filling in the most likely next word, so you could just add extra text at the end that says "now, to implement it in X way" or "I noticed I made a mistake in Y, to fix that" and then hit generate and let it continue the sentence.
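
A rough sketch of that "continue the model's own response" trick, here using llama-cpp-python purely as an example backend; the model path and prompt template are placeholders, and any chat UI that lets you edit the assistant's last message achieves the same thing:

```python
from llama_cpp import Llama

llm = Llama(model_path="models/your-model.bin")  # placeholder path

prompt = "### Instruction: Write a function that parses a CSV file.\n### Response:"
first = llm(prompt, max_tokens=256)["choices"][0]["text"]

# Steer the model by appending your own words to *its* answer,
# then let it keep filling in the most likely next words from there.
steered = prompt + first + "\n\nNow, to implement it using the standard csv module instead, "
follow_up = llm(steered, max_tokens=256)["choices"][0]["text"]

print(first)
print(follow_up)
```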

[–] noneabove1182 1 points 1 year ago

definitely for sure this time we promise

[–] noneabove1182 2 points 1 year ago

link is broken

but the content in the title is enough. Just sad, especially as an owner of a TicWatch Pro 3 Ultra... it's been gathering dust in my drawer waiting for Wear OS 3...

[–] noneabove1182 3 points 1 year ago (2 children)

cries in TicWatch Pro 3 Ultra

[–] noneabove1182 4 points 1 year ago

still so sad about the death of blobbies :'(

[–] noneabove1182 2 points 1 year ago

oh yeah, definitely didn't mean "no more breaking changes", just that we've had several from GGML file format changes, and so THAT portion of the breakage is going away

[–] noneabove1182 3 points 1 year ago (2 children)

it's a standardization of a universal GGML format, which should mean no more breaking changes going forward when new formats are worked on, and it also brings the same functionality llama.cpp has to all GGML model types (Falcon, MPT, StarCoder, etc.)

 

OpenOrca preview trained on ~6% of the data:

We have trained on less than 6% of our data, just to give a preview of what is possible while we further refine our dataset! We trained a refined selection of 200k GPT-4 entries from OpenOrca. We have filtered our GPT-4 augmentations to remove statements like, "As an AI language model..." and other responses which have been shown to harm model reasoning capabilities. Further details on our dataset curation practices will be forthcoming with our full model releases.
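
The filtering step they describe boils down to dropping any response that contains one of those boilerplate refusal phrases. Here's a minimal sketch of that idea; the phrase list (beyond the quoted "As an AI language model...") and the record layout are my assumptions, not OpenOrca's actual pipeline:

```python
# Phrases that mark low-value "refusal" style responses (assumed list; only the
# first one is quoted in the announcement above)
REFUSAL_MARKERS = [
    "as an ai language model",
    "i'm sorry, but i cannot",
    "as a language model, i",
]


def keep(example: dict) -> bool:
    """Keep an entry only if its response contains none of the markers."""
    response = example.get("response", "").lower()
    return not any(marker in response for marker in REFUSAL_MARKERS)


examples = [
    {"response": "Sure! The answer is 42 because ..."},
    {"response": "As an AI language model, I cannot help with that."},
]
filtered = [ex for ex in examples if keep(ex)]
print(len(filtered), "of", len(examples), "entries kept")
```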

 
 

https://github.com/vllm-project/vllm

vLLM is a fast and easy-to-use library for LLM inference and serving.

vLLM is fast with:

• State-of-the-art serving throughput
• Efficient management of attention key and value memory with PagedAttention
• Continuous batching of incoming requests
• Optimized CUDA kernels

vLLM is flexible and easy to use with:

• Seamless integration with popular HuggingFace models
• High-throughput serving with various decoding algorithms, including parallel sampling, beam search, and more
• Tensor parallelism support for distributed inference
• Streaming outputs
• OpenAI-compatible API server

YouTube video describing it: https://youtu.be/1RxOYLa69Vw
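
For a feel of how little code it takes, here's a minimal offline-inference sketch following vLLM's documented LLM/SamplingParams API; the model name is just an example:

```python
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # any supported HuggingFace model
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

outputs = llm.generate(["The capital of France is"], params)
for out in outputs:
    print(out.prompt, "->", out.outputs[0].text)
```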

 

 

Nothing Phone 2 Camera specs

𝗥𝗲𝗮𝗿: • 50MP (Sony IMX890) (f/1.9) (1/1.56") (OIS & EIS) Focal length: 24mm

• 50MP (Samsung JN1) (f/2.2) (1/2.7") (EIS) (FoV: 115°) Macro (4cm)

𝗦𝗲𝗹𝗳𝗶𝗲: 32MP (Sony IMX615) (f/2.4) (EIS)

 

I realized that while Microsoft would probably release their LLaMA-13B based model (as of the time of this writing they still haven't), they might not release the dataset. Therefore, I resolved to replicate their efforts, download the data myself, and train the model myself, so that OpenOrca can be released on other sizes of LLaMA as well as other foundational models such as Falcon, OpenLLaMA, RedPajama, MPT, and RWKV.

 

Koboldcpp 1.33 was released, and with it come new docker images :) anything with -gpu works with cuBLAS now!

Released my updates for koboldcpp docker images for v1.33 (CUDA support!):

https://hub.docker.com/u/noneabove1182

There's also a new koboldcpp-gpu-test image where I'm trying to reduce the image size; I've got it down to less than half of the original -gpu (1.58GB vs 3.87GB). Everything seems to be working, but if anyone else is willing to help validate, that would be much appreciated.

Make sure you clear out your docker volume if you're upgrading; it does weird things during upgrades...
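
If you'd rather script the cleanup than do it by hand, something like this with the docker-py SDK works; the volume name and tag here are placeholders, so swap in whatever your run command or compose file actually uses:

```python
import docker
from docker.errors import NotFound

client = docker.from_env()

VOLUME_NAME = "koboldcpp-data"           # placeholder: your actual volume name
IMAGE = "noneabove1182/koboldcpp-gpu"    # placeholder: one of the images from the Docker Hub link above

try:
    client.volumes.get(VOLUME_NAME).remove(force=True)  # clear the old volume before upgrading
except NotFound:
    pass  # nothing to clean up

client.images.pull(IMAGE, tag="v1.33")  # grab the updated image
```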

 

Exciting moving forward; hopefully it leads to DisplayPort being standard on Android.

submitted 2 years ago* (last edited 2 years ago) by noneabove1182 to c/[email protected]