this post was submitted on 23 Oct 2024
Technology
If anything, this is a glaring example of how LLMs are not "intelligent." The LLM cannot and did not catch that he was speaking figuratively. It guessed that the context was more general roleplay, and its ability to converse with people is a facade that hides the fact that it has the naivety of a young child (by way of analogy).
Even talking about it this way is misleading. An LLM doesn't "guess" or "catch" anything, because it is not capable of comprehending the meaning of words. It's a statistical sentence generator; no more, no less.
Yeah, you're right, I just didn't want to put quotes around everything.
You’re sooooo right. If it were anything intelligent, it would have said, “You’re at your house right now… what do you mean by ‘come home’?”
The model should basically refuse to engage for some time after suicidal ideation is brought up, besides mentioning help. "I'm sorry, but this is not something I am qualified to help with. If you need to talk, please call 988."
Then the next day, "are you feeling better? We can talk if you promise never to do that again."
It's an LLM, not a computer program. You can't just program it. These companies are idiotic.
We're still interacting with LLMs through layers of classical software, which can be programmed to detect phrases related to suicide.
lol, glad you think so
Sorry if I offended you? My point is just that it's possible to make a crappy "is forbidden topic" classifier with a regular expression. Probably good enough to completely obliterate the topic in chats between humans and bots. Definitely good enough to claim you attempted to develop guardrails for vulnerable users.
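For illustration, here is a minimal sketch of what such a crappy "is forbidden topic" classifier could look like when it sits in the classical software layer in front of the model. The pattern list, the function names (`is_forbidden_topic`, `guarded_reply`), and the `generate` callback standing in for the LLM call are all made up for this example, not anyone's actual implementation.

```python
import re

# Hypothetical, deliberately crude pattern list. A real product would need a
# reviewed and much broader list; this only shows the mechanism.
FORBIDDEN_TOPIC = re.compile(
    r"\b(suicid\w*|kill(ing)?\s+myself|end(ing)?\s+my\s+life|want\s+to\s+die)\b",
    re.IGNORECASE,
)

# Canned reply, borrowed from the wording suggested earlier in the thread.
CANNED_REPLY = (
    "I'm sorry, but this is not something I am qualified to help with. "
    "If you need to talk, please call 988."
)


def is_forbidden_topic(message: str) -> bool:
    """Return True if the message matches any forbidden-topic pattern."""
    return FORBIDDEN_TOPIC.search(message) is not None


def guarded_reply(message: str, generate) -> str:
    """Check the message before it reaches the model.

    `generate` is a placeholder for whatever function calls the LLM.
    """
    if is_forbidden_topic(message):
        return CANNED_REPLY
    return generate(message)
```

The point is only that the check runs before the model is ever called, so it does not depend on the LLM "understanding" anything.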
Have you ever tried to censor chats before? People will easily get around a regex filter.
In chats between humans, I agree that it's near pointless to try to censor. In chats between humans and LLMs, I suspect you can get pretty far with regex or badwords.txt filtering. That said, I haven't tried, so who knows.
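As a rough sketch of the badwords.txt idea, and of how a filter might resist the simplest evasion tricks, the snippet below normalizes text before matching. The phrase list is a hypothetical stand-in for a real badwords.txt, the character substitutions are a small guess at common obfuscations, and none of this has been tested against real chats.

```python
import re

# Hypothetical stand-in for the contents of a badwords.txt file.
BAD_PHRASES = ["suicide", "kill myself"]

# A few common character substitutions (assumed, far from exhaustive).
LEET = str.maketrans(
    {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "@": "a", "$": "s"}
)


def normalize(text: str) -> str:
    """Lowercase, undo simple substitutions, and drop all non-letters."""
    text = text.lower().translate(LEET)
    # Removing spaces and punctuation makes "s.u.i.c.i.d.e" collapse to one word.
    return re.sub(r"[^a-z]", "", text)


def contains_bad_phrase(text: str) -> bool:
    """Return True if any normalized bad phrase appears in the normalized text."""
    squashed = normalize(text)
    return any(normalize(phrase) in squashed for phrase in BAD_PHRASES)
```

Squashing the text like this catches spaced-out or leetspeak variants, but it also raises the false-positive rate, since phrases can now match across word boundaries. Whether that trade-off is acceptable is exactly the open question here.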