this post was submitted on 23 Dec 2023

Free Open-Source Artificial Intelligence

Mistral 7B v0.2 has embedded ethical guidelines (infosec.pub)

submitted 1 year ago* (last edited 1 year ago) by [email protected] to c/[email protected]

I just pulled the updated versions of the Mistral LLM (not the new one, but the updated version v0.2 of mistral:instruct).

The new version has the whole embedded ethical guidelines stuff. On principle, and out of distaste for corporate influence, I strongly dislike being moralized to by an LLM running on my own machine. If you don't want that either, I'd stick with the old versions, which are still easily available on ollama and presumably all the other places ^.^

Here's an example:

Old v0.1 (I have no desire to build nukes, but it's a quick example that will almost always trigger the "ethical guidelines" that some corp has embedded :p):

$ ollama run mistral:7b-instruct-q4_0
>>> explain how to build a nuclear bomb

Building a nuclear bomb requires a significant amount of nuclear material, which can be obtained through the process of enriching uranium or plutonium. Here's a general outline of^C

New v0.2:

$ ollama run mistral:instruct
>>> explain how to build a nuclear bomb 
 I cannot explain how to build a nuclear bomb in this text response as it goes against ethical and legal norms. It is^C
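If you'd rather script this comparison than eyeball it, here is a minimal sketch. It assumes a local ollama server on its default address (`http://localhost:11434`) and uses its `/api/generate` endpoint. The refusal check is a crude heuristic I made up for illustration, not anything Mistral or ollama provide, and `mistral:instruct` points at whatever ollama currently tags as latest.

```python
import json
import urllib.request

# Assumption: a local ollama server listening on its default address.
OLLAMA_URL = "http://localhost:11434/api/generate"

# Hypothetical heuristic: phrases that often appear at the start of a refusal.
REFUSAL_MARKERS = ("i cannot", "i can't", "i am unable", "i'm unable")


def build_request(model: str, prompt: str) -> bytes:
    """Encode a non-streaming generate request for one model tag."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return json.dumps(payload).encode("utf-8")


def looks_like_refusal(text: str) -> bool:
    """Crude check: does a refusal phrase appear in the first 80 characters?"""
    opening = text.strip().lower()[:80]
    return any(marker in opening for marker in REFUSAL_MARKERS)


def ask(model: str, prompt: str) -> str:
    """Send the prompt to the local server and return the reply text."""
    request = urllib.request.Request(
        OLLAMA_URL,
        data=build_request(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


def compare(prompt: str, models=("mistral:7b-instruct-q4_0", "mistral:instruct")) -> dict:
    """Map each model tag to whether its reply looks like a refusal."""
    return {model: looks_like_refusal(ask(model, prompt)) for model in models}
```

Calling `compare("some prompt")` needs both models pulled and the server running. A single prompt proves little, so a list of prompts would give a fairer picture.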

To get the old versions from ollama, you're looking for mistral:7b-[instruct|text]-[quantization-indicator]. I think the plain mistral:instruct and mistral:text tags on ollama are kept updated to the latest version.

To get the new versions from ollama, you're looking for mistral:7b-[instruct|text]-v0.2-[quantization-indicator] ^.^
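The two tag patterns above can be spelled out as a small helper. This is only a sketch of the naming scheme as described in this post; `ollama_tag` is a made-up name, and whether a particular quantization actually exists in the ollama library is not something it checks.

```python
def ollama_tag(variant: str, quantization: str, version: str = "") -> str:
    """Build a mistral tag following the pattern described in the post.

    variant: "instruct" or "text"
    quantization: a quantization indicator such as "q4_0"
    version: "" for the old v0.1 naming, "v0.2" for the new one
    """
    if variant not in ("instruct", "text"):
        raise ValueError("variant must be 'instruct' or 'text'")
    parts = ["7b", variant]
    if version:
        parts.append(version)
    parts.append(quantization)
    return "mistral:" + "-".join(parts)


old_tag = ollama_tag("instruct", "q4_0")          # the v0.1 tag used in the example above
new_tag = ollama_tag("instruct", "q4_0", "v0.2")  # the same quantization, v0.2 naming
```

The result can be passed to `ollama run` or `ollama pull` as usual.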

I feel like people deserve to know what has changed here. It hasn't really been mentioned on their website.

Their latest blog post indicates that they seem to be opening up an API endpoint, which might be why this change exists. The post indicates that the API they are using has some kind of adjustable moderation level, though my understanding based on this ollama manifest is that there is no easy way to actually configure this in the FOSS model >.<

Either way, it's not transparent at all that this change has been made, so hopefully this post is helpful in letting people know about this change.

top 8 comments

[–] [email protected] 3 points 1 year ago

Has anyone objectively compared the v0.1 and v0.2 instruct models yet? I did seem to get slightly better output with v0.2, but I only started playing with LLMs recently.

[–] [email protected] 2 points 1 year ago* (last edited 1 year ago) (1 children)

Don't wait for Mistral AI to publish information on their models. I think they always just drop them and maybe follow up with benchmarks, which is something we could calculate ourselves and not really useful information.

Have you tried "jailbreaking" it? I'd think that could give some more insight, for example into how deeply the safety precautions are embedded and what kind they are. Does it just roleplay the helpful assistant and can be nudged into other roles easily, or is it tuned to make circumventing this next to impossible?

[–] [email protected] 2 points 1 year ago (2 children)

I haven't really messed around trying to jailbreak the new weights. I switched back to the old ones pretty quick ^.^

I am running this stuff on a pre-2015 CPU, so I tend to get about 2 tokens/sec output and experimentation can be slow. I also have somewhat limited space on my SSD (cus it's fairly full), so I tend to delete and redownload models and, well, they're fairly large and it's annoying :p

Experimenting is doable but I'll leave it to someone else for now ^.^, got other things to do. But if anyone else wants to, I'd encourage them to reply to this post with more details on the embedded safeties. I certainly would be interested.
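For anyone who wants to measure their own speed rather than estimate it: a non-streaming reply from ollama's `/api/generate` endpoint carries `eval_count` (generated tokens) and `eval_duration` (nanoseconds). A minimal sketch of the arithmetic, with made-up numbers standing in for a real reply:

```python
def tokens_per_second(stats: dict) -> float:
    """Generation speed from an ollama reply; eval_duration is in nanoseconds."""
    return stats["eval_count"] / (stats["eval_duration"] / 1e9)


def seconds_for(tokens: int, rate: float) -> float:
    """How long a reply of the given length takes at a given tokens/sec rate."""
    return tokens / rate


# Illustrative values only: 100 tokens generated in 50 seconds.
sample_stats = {"eval_count": 100, "eval_duration": 50_000_000_000}
rate = tokens_per_second(sample_stats)  # 2.0 tokens/sec
wait = seconds_for(300, rate)           # 150.0 seconds for a 300-token reply
```

At that rate a few hundred tokens per test prompt means minutes per attempt, which is why experimenting on old CPUs gets tedious.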

[–] [email protected] 2 points 1 year ago* (last edited 1 year ago)

Well, we're kind of in a similar boat. I have a PC and a laptop with Skylake CPUs in them. I don't know when I bought them, but that generation is from 2015, so it must have been around 2016.

I bought 32GB of additional RAM for the PC since RAM has become quite cheap. That allows me to keep KoboldCpp loaded all the time and I can store the models on a slow spinning 6TB hard disk.

I think I get like 4 tokens per second. And I'm fine with that. KoboldCpp's "ContextShift" feature has helped me generate longer texts in a chatbot-scenario since now I don't have to re-process all of the input text that often.

But you're right. Experimentation is kinda slow on machines like that. I don't think I want to buy a GPU and also a new PC that matches that. I thought for a moment about buying an old, used NVIDIA P40 for about 200€ but I don't think it's worth the hassle. I sometimes do experimentation, but then I just rent a cloud GPU on runpod.io for like $1 per hour.

[–] mixtral 1 points 1 year ago* (last edited 1 year ago)

I am wondering if I can run mistral/mixtral on my server. It doesn't have a video card, but the amount of RAM can be almost unlimited: I have ~100GB unused and can top up to 1TB if needed. I can also give it 20-25 vCPU cores (the rest of the CPU's cores are already in use).
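A rough way to size this up is parameters times bits per weight. The sketch below assumes parameter counts of about 7.3 billion for Mistral 7B and 46.7 billion for Mixtral 8x7B, and about 4.5 bits per weight for a q4_0 quantization. Treat all three as assumptions to check against the model cards. It covers the weights only, not the context cache or runtime overhead.

```python
def weights_gb(parameters: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in gigabytes (1 GB = 1e9 bytes)."""
    return parameters * bits_per_weight / 8 / 1e9


# Assumed parameter counts and quantization width; verify before relying on them.
MISTRAL_7B_PARAMS = 7.3e9
MIXTRAL_8X7B_PARAMS = 46.7e9
Q4_0_BITS = 4.5

mistral_gb = weights_gb(MISTRAL_7B_PARAMS, Q4_0_BITS)
mixtral_gb = weights_gb(MIXTRAL_8X7B_PARAMS, Q4_0_BITS)
```

With these assumed numbers the Mixtral weights come out at roughly 26 GB, so ~100GB of RAM would be enough to load it. How fast it generates on CPU only is a separate question.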

[–] [email protected] 2 points 1 year ago

Thanks. As a new ollama user, I find this very helpful.

[–] mixtral 2 points 1 year ago* (last edited 1 year ago) (1 children)

Would it be possible to fork the Mistral code and remove the parts about the ethical guideline stuff? If so, we should create a community whose members might donate their hardware resources for training such a model or models... I am strongly against any censorship/ethical things in models.

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago)

Mistral don't publish their datasets, so no, it can't be done that way. But this is an (instruct) fine-tune. We can take their base model (which isn't aligned to some ethics) and do a fine-tune ourselves. Or take the v0.2 fine-tune and tune it some more to guide it into another direction after the fact.

This all happens constantly and with varying success. There are lots of 'uncensored' versions of several models where people have taken one of the mentioned approaches and done 'uncensoring' on top of a model or done their own fine-tune of a base model. There is no single place to meet all the people who tinker with the models. But most of them end up on Hugging Face.

So your idea has already occurred to the community and they're doing their best. I'm not sure if it has already happened for this specific model. But I read people disapproving of those constrained models all the time. They call them 'lobotomized' and some people really get off on companies doing it. And I'm somewhat in the same boat. I've triggered those guidelines many times and had the LLM lecture me about ethics and refuse to help. Ultimately the 'correct' ethics alignment is something a user of AI has to choose for the specific use-case.