this post was submitted on 01 Aug 2023

(Deleted for not relevant anymore) (piped.video)

submitted 2 years ago* (last edited 7 months ago) by [email protected] to c/localllama

(Deleted for not relevant anymore)

[–] [email protected] 6 points 2 years ago (1 children)

Thanks for giving us the highlights. I just hope, if AI has that big of an impact on our lives as some of us think... It somehow gets democratized and isn't just something under tight control of the big corporations that have 50M$ to spare.

[–] [email protected] 1 points 2 years ago

Of course! Please share this senate hearing around if you want to help. We need to bring awareness to what they are trying to do. Advocating for universal backdoors is insane...

[–] [email protected] 4 points 2 years ago (1 children)

I definitely don't agree with their opinions, and I think it would be unconstitutional as hell to implement the measures that they're advocating for, but it should be noted that if they are successful, we'll see such a spectacular torrenting and dark web LLM scene. I don't think there's any stopping this. They can try and they will lose.

[–] [email protected] 3 points 2 years ago* (last edited 2 years ago)

I hope so, but from what I can tell, we are going to have a repeat of the Patriot Act and the horrors it caused, as shown by Edward Snowden.

The politicians are only getting one side of the argument about AI from CEOs and those in positions of power. It is important that the politicians recognize the good AI is doing as well. This is why I made this post to try to get some voice out there.

[–] PeterPoopshit 4 points 2 years ago* (last edited 2 years ago) (1 children)

So what I'm reading is we should download those open source non drm ai bot projects now while we still can and hoard all the data. Thanks for the warning.

[–] [email protected] 4 points 2 years ago

Of course. I know some open source devs that advise backing up raw training data, LoRAs, and essentially the original base models for fine tuning.

Politicians sent an open letter out in protest when Meta released their LLaMA 2. It is not unreasonable to assume they will intervene for the next one unless we speak out against this.

[–] Meowoem 3 points 2 years ago (1 children)

Yeah they're eager to gain a monopoly on AI even (or maybe especially) if it results in average people missing out on all the potential advantages. I could list positive outcomes of AI for hours, enabling small and independent business to compete with current monopolies is one of the key advantages and that scares the rich, enabling people to access things like healthcare and key services easily scares them even more...

Imagine a world where a poor person such as you and a rich person like Elon both go to court on the same charge - your low paid legal team has about a dozen billable hours to research and it's done by one not especially motivated lawyer, Elon's team have the top minds and the ability to have a vast team use the most expensive tools to dig up information and hire in experts - who is more likely to get off? But a good legal ai able to guide your tired lawyer to the right objections and filings to make, and which can in an instant pull up all the relevant information on the charge, will mean that rich people no longer have such an advantage...

And take work and housing, you're working long hours and need somewhere to live so it's likely you'll feel pressured into clinging to whatever you can get because researching moving is hard, getting finance acceptance and everything is hard too especially when every attempt requires endless fees and forms and hoops to leap through - so when your landlord does something you don't like you just eat it. The legal ai mentioned above would totally change the landscape, as you'll be able to simply say to your computer 'is my landlord allowed to let himself in to watch me sleep?' and it doesn't matter how poor you are or how little you know the culture or legal system, the system will tell you your rights and how to enforce them. An ai able to ask you basic questions on your needs and search all available rental options then cross reference with other sources of information would be a game changer too, for regular people it could vastly reduce the stress and fear of housing -- that sucks for rich people who use the threat of destitution as a way of keeping people locked into bad situations.

These two AI implementations also result in another benefit to average people in allowing them to learn about and collect any help or resources they're entitled to, if an ai knew the law and my situation then it could tell me 'you're entitled to a carer subsidy as you look after your parents' or 'you can claim these items against your tax, I've added them to the records and we can choose the best option when filing' or 'your car insurance is higher than other equal options, do you want me to switch it and save you $87?'

Again rich people have accountants that reduce their tax to zero by knowing exactly what to do while we have to struggle through purposely obtuse and opaque systems so we end up paying more than we should and missing out on everything we're entitled to - all that extra money we pay because we simply don't know we have other options goes into rich people's pockets, this is even more true for people with physical and educational handicaps, people in difficult situations, immigrants, victims of domestic abuse or poor home situations...

Something that really scares them too is the possibility of the ai telling you 'according to our personal records kept privately on your secure home system you purchased a product that has been discovered to have been manufactured using illegal and dangerous materials, do you want to join the class action lawsuit against Dupont?' then 18 months later it says 'the Dupont lawsuit was successful, do you want me to send them the information required to claim $1530?'

The law is actually often pretty fair and reasonable in many cases, long debates go into making sure it all makes sense and is even handed - the problem is access to the law is entirely dependent on how much money you have, being able to file a case with all the right boxes ticked would ruin their little game.

And that's only one strand of not especially complex AI, all tasks an LLM matched with verifier networks and task structuring could do - probably only one or two generations beyond the current models. Enabling collaborative design and community projects is another thing that could hugely benefit regular people, DRM ink for printers exists because most people don't know they have other options and they feel brand names are more supported but an ai able to code printer drivers on the fly gets rid of that issue as everything becomes true plug and play but also it allows consumers to select better options - I obsessively search tech stuff when I'm going to make a purchase but still often discover better options I would have selected, being able to say 'what are my best options for buying a printer?' and it asks a couple of questions then gives me a short list of ones I can actually get and includes comparisons and test data in its thinking which would have taken me an evening in pandas just to get a ballpark understanding of - and it's looked at all the help forums telling me things like 'people have reported issues with this printer and your current hardware, though we could upgrade your firmware to circumvent problems'

The possibilities for AI making our lives better are huge, when I hear people who have nothing good to say about it all I can think is that they must consider everyday people being able to live better lives as a bad thing, they must see it as something that threatens their existence as a modern day robber baron.

[–] paysrenttobirds 3 points 2 years ago (2 children)

What do they mean by watermarks? Why is it a bad idea to know which, if any, ai has produced something?

Thanks for the post

[–] [email protected] 3 points 2 years ago* (last edited 2 years ago) (2 children)

They are requesting something beyond watermarking. Yes, it is good to have a robot tell you when it is making a film. What is particularly concerning is that the witnesses want the government to keep track of every prompt and output ever made to eventually be able to trace its origin. So all open source models must somehow encode some form of signature, much like the hidden yellow dots printers produce on every sheet.

There is a huge difference between a watermark stating that "this is ai generated" and having hidden encodings, much like a backdoor, where they can trace any publicly released ai image, video, and perhaps even text output, to some specific model, or worse DRM required "yellow dot" injection.

I know researchers have already looked into encoding hidden undetectable patterns in text output, so an extension to everything else is not unjustified.

Also, if the encodings are not detectable by humans, then they have failed the original purpose of making ai generated content known.

[–] paysrenttobirds 2 points 2 years ago (1 children)

Thanks for the details. I guess the next step is to contact my congresspeople :)

[–] [email protected] 1 points 2 years ago

I will do the same. No problem! I'm very happy that my post was heard! Thank you!!!

[–] [email protected] 2 points 2 years ago* (last edited 2 years ago) (1 children)

I think the argumentation is several logical fallacies at once. And it's not either / or.

I don't see a reason why OpenAI and the other big companies shouldn't have incorporated watermarks from the beginning and voluntarily. The science is out there and it's really simple to do. And it solves a few valid problems.

I think valid uses are to find out if your pupils did their homework themselves, to fight spam and misinformation. There is no need to incorporate all kinds of data into the watermark to establish your surveillance fantasies and on the other hand it's stupid to say: "but it can be circumvented" or doesn't work in edge-cases and then don't do it at all. That's not a valid argument. You could say it disadvantages me if I have to do it but my competitors don't... But that's hardly the case if you're advertising to other people than criminals.

On a broader level, transparency is a good thing, if done right. I wouldn't like some AI driven dystopian future with intransparent social scores, credit scores and my CV being declined before some human reads it. However, we need to be able to use AI as a tool. Even for use cases like that. Transparency is the first step.

[–] Meowoem -1 points 2 years ago (1 children)

There's no technology that can embed a watermark into a paragraph of text without being obviously removable and degrading the quality of the text - it's also pointless.

The solution for schools worried about essays being written by chat GPT is to teach the kids how to use chat GPT and to integrate its use in writing essays - have you ever heard a teacher complain that their students might be getting answers from an encyclopedia? Or that they used a calculator to help solve their algebra homework? Do art teachers complain about rulers and reference images? No they teach the skills using available technology, you'd laugh and get angry if someone wanted to install spyware to make sure you haven't used spell check when writing your thesis and this is the same thing.

Yes this means teachers can't set the same essay question and mark scheme they've used for a decade or four and will have to come up with something that takes account for the new technology, that allows the student to show understanding of the subject and tools available, and which are able to be marked based on the students ability rather than their choice of tool.

[–] [email protected] 2 points 2 years ago* (last edited 2 years ago) (1 children)

There’s no technology that can embed a watermark into a paragraph of text without being obviously removable

My point was: Exactly that is not a valid argument. This should not stop us doing the right thing in 95% of the cases and in the large commercial deployments that most people use.

and degrading the quality of the text

The paper A Watermark for Large Language Models says it has "negligible impact on text quality".
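For anyone wondering what such a statistical watermark looks like in practice, here is a toy sketch of the detection side only. It is not the paper's code: it works on whitespace-split words instead of model tokens, and it hashes each (previous word, word) pair to decide "green" membership rather than partitioning a real vocabulary. The names `is_green`, `watermark_z_score` and the value `GAMMA = 0.5` are made up for illustration. The idea it shows is that a watermarking generator favours "green" words, and a detector counts how many more green words appear than chance would predict.

```python
import hashlib
import math

GAMMA = 0.5  # assumed fraction of words that count as "green" at each step


def is_green(prev_token: str, token: str, gamma: float = GAMMA) -> bool:
    """Pseudo-randomly mark `token` as green, seeded by the previous token."""
    digest = hashlib.sha256(f"{prev_token}\x00{token}".encode("utf-8")).digest()
    # Map the first 8 bytes to a number in [0, 1) and compare with gamma.
    return int.from_bytes(digest[:8], "big") / 2**64 < gamma


def watermark_z_score(tokens: list[str], gamma: float = GAMMA) -> float:
    """z-score of the green count against what unwatermarked text would give."""
    scored = len(tokens) - 1  # the first token has no predecessor
    if scored <= 0:
        raise ValueError("need at least two tokens")
    green = sum(is_green(p, t, gamma) for p, t in zip(tokens, tokens[1:]))
    return (green - gamma * scored) / math.sqrt(scored * gamma * (1 - gamma))


if __name__ == "__main__":
    # Toy "watermarking generator": always pick a green word from a tiny vocabulary.
    vocab = [f"w{i}" for i in range(50)]
    text = ["w0"]
    for _ in range(200):
        text.append(next(w for w in vocab if is_green(text[-1], w)))
    print("z-score of watermarked toy text:", round(watermark_z_score(text), 2))
```

A large positive z-score means far more green words than chance, which is the signal a detector would flag. Note that nothing beyond "this probably came from a watermarking generator" is encoded here, so a scheme like this does not need to carry user or prompt data.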

have you ever heard a teacher complain that their students might be getting answers from an encyclopedia?

That was the time I went to school. For a while we could just print wikipedia articles and be done with our presentations. It worked for a while especially with the older teachers that weren't yet aware of wikipedia. Fun times, homework oftentimes done in 5 minutes.

Or that they used a calculator

I'm starting to believe we grew up in different times/cultures. We were allowed to buy a calculator -I think- in grade 11. But our teacher did not allow us to use it during tests for -I think- another year. And during that time you'd better keep up practicing calculating with your brain only or you'd be fucked getting everything done in time during the exams. I think I was able to use that calculator for about 1.5 to 2 years in school. And then of course in uni in most courses.

The thing is... When learning things: You need to learn the basics first. You need to grow an understanding of why something works. What happens in the background. What your tools actually do. If you give people powerful tools too early, they won't learn the concepts behind what they're doing. The tool will do that for them and they will only learn how to operate that specific tool.

Edit: And that's the right thing to do. It's the difference between a monkey pushing buttons and someone with a profound understanding of a topic. You want a proper education, otherwise you're obsolete at the point where someone invents a new tool that works not like the tool you're used to. Or you want to explore something new and no-one wrote an encyclopedia-article for you.

[–] Meowoem -1 points 2 years ago (1 children)

Well Wikipedia wasn't invented when I went to school and our teachers were very keen on us using encyclopedias and calculators - maybe your tech regressive attitude is cultural.

If you think using a calculator is just pressing buttons and cheating then you'd have to have stopped at a very basic level not much more advanced than basic multiplication and division - likewise gpt, if you think it's just pressing a button and getting the answer then you're using it for very simple tasks, certainly not scratching the surface of its or your own capabilities. A skilled and engaged user doing extra steps is getting very different results to how you say it's used, this is the difference between a low graded homework and an A+ piece of work.

[–] [email protected] 1 points 2 years ago* (last edited 2 years ago) (1 children)

I think we're talking past each other here.

Prompt engineering language models and knowing how the arc of suspense is supposed to work in a novella are two entirely different things and skill sets. It kind of depends what you're trying to teach.

Are you able to calculate if a large pizza is more expensive or cheaper than two small pizzas just with a calculator, without storing basic concepts about how circles work inside of your brain?

Having knowledge about concepts, being literate and able to connect thoughts is what makes you smart. And things add up once problems start to become more difficult than mere examples. Try and be a philosopher without reading anything about Adorno, Kant and the ancient greeks because you "can look it up".... Using a calculator or encyclopedia and modern computer tools is the 5% on top that makes you fast and excel at things. 95% is hard work. And that is why I think focusing on teaching it that way is the right thing to do. And then add the 5% on top. Just don't skip that like my teachers sometimes did. Background knowledge is important to have. So are applied skills and to know how to use your tool kit.

[–] Meowoem 0 points 2 years ago

Exactly, and using modern tools like calculators is what allows people to focus on learning background knowledge and theory - you can easily use a computer to determine the area of two circles and the price of each per square centimetre without having to remember formulas or do mental arithmetic, someone who does it by hand is going to take longer thus giving the other the advantage of being able to do far more complex questions in the same amount of time - like comparing the calorific intake from various sizes and topped pizzas and constructing a nice graph or table to show results.
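To make the pizza comparison concrete, here is a minimal sketch. The diameters and prices are invented for the example, and `price_per_cm2` is just a hypothetical helper name. It only applies the circle area formula, area = pi * (diameter / 2)^2, and divides the price by the total area.

```python
import math


def price_per_cm2(diameter_cm: float, price: float, count: int = 1) -> float:
    """Price per square centimetre for `count` round pizzas of one diameter."""
    area = count * math.pi * (diameter_cm / 2) ** 2
    return price / area


# Hypothetical menu: one 40 cm large for 14.00 versus two 28 cm smalls at 8.00 each.
large = price_per_cm2(40, 14.00)
two_small = price_per_cm2(28, 2 * 8.00, count=2)

print(f"large:     {large:.4f} per cm^2")
print(f"two small: {two_small:.4f} per cm^2")
print("large is cheaper" if large < two_small else "two small are cheaper")
```

Because area grows with the square of the diameter, one large pizza can hold more than two noticeably smaller ones, which is exactly the piece of background knowledge the question is probing.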

The truth is we currently accept a very low quality in everything right from kids' homework to media reporting on politics, when we adapt to using AI tools to help construct articles we'll start seeing much better made arguments and much better analysis - things like actual fact checking will become the norm instead of a six month project that blows everyone's minds but then gets forgotten.

Imagine a world where a journalist's job isn't to string pretty words together but to get stories and give them context, where experts' opinions get included rather than overlooked because the person writing the article simply has no idea there's a whole scientific body that studies the field and instead blindly trusts some corporate spokesperson's press release, where a journalist doesn't have to spend thirty hours reading through archives trying to determine if the subject of his story has related history but can simply say 'give a detailed breakdown of accusations of safety violations from Dupont'

Of course there will still be a lot of writing to do, not in the key pressing way where you waste an hour trying to think of a good word to describe butter beans but looking at paragraphs and saying 'that's a bit dense, split the bit about Shell poisoning the Niger Delta into its own paragraph then add a short summary of the economic cost from the data we were looking at in section 1'

Being able to focus on the important things will make us able to produce better stuff - schools that teach how to use AI are going to make students who are able to compete and contribute in the modern world, schools that try to force their students to live in 1990 are going to produce kids that've already been left behind.

[–] [email protected] 1 points 2 years ago

Also no problem! I feel like I had to share this one.

[–] MrNobody 1 points 2 years ago (1 children)

It's impossible to regulate open source, AI or not. Trying to do so would be another brick in the wall, helping to cripple whatever region tries to regulate it. Some countries want to regulate it to try to control its power, but it's already too late. Bad actors aren't waiting for anything, they have a head start. They don't work on ethics or morals so have no problem doing what they want, but even discounting that you have other countries with their own citizens who can work on AI. The cat's out of the bag, the elite see the power and danger AI can bring them, and think that by restricting who can utilise it they can be safe. But for how long?

[–] [email protected] 2 points 2 years ago

It would be difficult indeed, but without a doubt they will still try and cause massive damage to our basic freedoms. For example, imagine if one day all chips require DRMs at the hardware level that cannot be disabled. This is just one example of the damage they could do. There isn't much any consumer can do if they do this since developing your own GPU is nearly impossible.

[–] [email protected] 1 points 7 months ago

What are you doing?

[–] Saledovil 0 points 2 years ago

DRM on the chip seems not really feasible to me. In the end, the chip doesn't know what it is doing. It just does math. So how can any DRM on that level realize that it is running a forbidden model, or that a jailbreak prompt is being executed? Finding out what a program does is already non-trivial if you have the source code, and the DRM on the chip would only have the machine code.
