this post was submitted on 13 Mar 2025
1846 points (99.7% liked)

People Twitter

6393 readers
753 users here now

People tweeting stuff. We allow tweets from anyone.

RULES:

  1. Mark NSFW content.
  2. No doxxing people.
  3. Must be a pic of the tweet or similar. No direct links to the tweet.
  4. No bullying or international politics
  5. Be excellent to each other.
  6. Provide an archived link to the tweet (or similar) being shown if it's a major figure or a politician.

founded 2 years ago
top 50 comments
[–] [email protected] 32 points 1 day ago (2 children)

ChatGPT is a tool. Use it for tasks where the cost of verifying that the output is correct is less than the cost of doing it by hand.

[–] [email protected] 18 points 1 day ago (1 children)

Honestly, I've found it best for quickly reformatting text and other content. It should live and die as a clerical tool.

[–] [email protected] 2 points 21 hours ago

Which is exactly why every time I see big tech companies making another stupid implementation of it, it pisses me off.

LLMs like ChatGPT are fundamentally word-probability machines. They predict the probability of words based on context (or, if given no context, just the general probability). When you hand one your notes, for instance, it already has all the context and knowledge it needs, and all it has to do is predict the most statistically probable way of reformatting the existing data into a better structure. Literally the perfect use case for the technology.

Even in similar contexts that don't immediately seem like "text reformatting," it's extremely handy. For instance, Linkwarden can auto-tag your bookmarks, based on a predetermined list you set, using the context of each page fed into a model running via Ollama. Great feature, very useful.
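Something like this, roughly (a very rough sketch, not Linkwarden's actual code; it assumes a local Ollama instance with a "llama3" model pulled, and the tag list is made up):

```python
# Ask a local Ollama model to pick tags for a page from a fixed, predetermined list.
# Assumes Ollama is running on localhost:11434 with the "llama3" model available.
import requests

ALLOWED_TAGS = ["programming", "cooking", "news", "self-hosting", "music"]

def suggest_tags(page_text: str) -> list[str]:
    prompt = (
        "Pick the tags that fit this page. "
        f"Only choose from: {', '.join(ALLOWED_TAGS)}. "
        "Reply with a comma-separated list and nothing else.\n\n"
        f"Page content:\n{page_text[:2000]}"
    )
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3", "prompt": prompt, "stream": False},
        timeout=60,
    )
    raw = resp.json()["response"]
    # Keep only tags from the predetermined list, so the model can't invent new ones.
    return [t.strip() for t in raw.split(",") if t.strip() in ALLOWED_TAGS]

print(suggest_tags("Recipe for sourdough bread with a long cold ferment..."))
```

The key bit is that last filter: the model only ever formats existing information into your existing structure, which is exactly the "clerical tool" use case.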

Yet somehow, every tech company manages to use it in every way except that when developing products with it. It's so discouraging to see.

[–] [email protected] 3 points 1 day ago (1 children)

You're still doing it by hand to verify it in any scientific capacity. I only use ChatGPT for philosophical hypotheticals involving the far future. We're both wrong, but it's fun for the back and forth.

[–] [email protected] 3 points 1 day ago* (last edited 1 day ago) (1 children)

It is not true in general that verifying output for a science-related prompt requires doing it by hand, where "doing it by hand" means putting in the effort to answer the prompt manually without using AI.

[–] [email protected] 1 points 23 hours ago (1 children)

You can get pretty in the weeds with conversions on ChatGPT in the chemistry world or even just basic lab work where a small miscalculation at scale can cost thousands of dollars or invite lawsuits.

I check against actual calibrated equipment as a verification final step.

[–] [email protected] 1 points 23 hours ago (1 children)

I said not true in general. I don't know much about chemistry. It may be more true in chemistry.

Coding is different. In many situations it can be cheap to test or eyeball the output.

Crucially, in nearly any subject, it can give you leads. Nobody expects every lead to pan out. But leads are hard to find.

[–] [email protected] 1 points 23 hours ago (1 children)

I imagine ChatGPT and code is a lot like air and water.

Each part is dissolved in the other. Meaning an LLM is probably more native at learning to read and write code than it is at interpreting worldwide engineering standards and working out the exact thread pitch for a bolt you need to order thousands of. Go and thread one to verify.

[–] [email protected] 1 points 22 hours ago (1 children)

This is possibly true due to the bias of the people who made it. But I reject the notion that, because ChatGPT is made of code, it must understand code better than other subjects. Are humans good at biology for this reason?

[–] [email protected] 1 points 22 hours ago

You might know better than me. If you ask ChatGPT to write the code for itself I have no way to verify it. You would.

[–] [email protected] 6 points 22 hours ago

If it's being designed to answer questions, then it should simply be an advanced search engine that points to actual researched content.

The way it acts now, it's trying to be an expert based on "something a friend of a friend said", and that makes it confidently wrong far too often.

[–] [email protected] 17 points 1 day ago* (last edited 1 day ago) (2 children)

I feel this hard with the New York Times.

99% of the time, I feel like it covers subjects adequately. It might be a bit further right than me, but for a general US source, I feel it’s rather representative.

Then they write a story about something happening to low-income US people, and it's just social and logical word salad. When they report, it appears as though they look at data analytically instead of talking to people. Statisticians will tell you, and this is subtle: conclusions made at one level of detail cannot be generalized to another level of detail. For social issues, looking at data without talking with people is fallacious. The NYT needs to understand this, but meanwhile they are horrifically insensitive, bordering on destructive at times.

“The jackboot only jumps down on people standing up”

  • Hozier, “Jackboot Jump”

Then I read the next story and I take it as credible without much critical thought or evidence. Bias is strange.

[–] [email protected] 4 points 1 day ago (1 children)

Can you give me an example of how conclusions at one level of detail can't be generalised to another level? I can't quite understand it.

[–] [email protected] 6 points 1 day ago* (last edited 1 day ago) (1 children)

Perhaps the textbook example is Simpson's Paradox.

The article goes through a couple of cases where naive statistical conclusions seem well supported, but when you correctly separate the data, those conclusions reverse themselves (there's a quick numeric sketch of this below).

Another relevant issue is Aggregation Bias. This article has an example where conclusions about a population hold inversely with individuals of that population.

And the last one I can think of is MAUP, which deals with the fact that statistics are very sensitive to whatever process is used to divvy up a space. This is commonly referenced in spatial statistics but has broader implications, I believe.
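Here's that Simpson's Paradox reversal in plain Python, using the classic kidney-stone treatment numbers that usually get cited for it:

```python
# Treatment A is better within each group, yet B looks better overall,
# because the groups are different sizes for each treatment.
cases = {
    # group: (successes_A, total_A, successes_B, total_B)
    "small stones": (81, 87, 234, 270),
    "large stones": (192, 263, 55, 80),
}

tot_a = tot_b = n_a = n_b = 0
for group, (sa, na, sb, nb) in cases.items():
    print(f"{group}: A {sa/na:.0%} vs B {sb/nb:.0%}")
    tot_a += sa; n_a += na
    tot_b += sb; n_b += nb

print(f"overall: A {tot_a/n_a:.0%} vs B {tot_b/n_b:.0%}")
# small stones: A 93% vs B 87%
# large stones: A 73% vs B 69%
# overall:      A 78% vs B 83%  <- the aggregate conclusion flips
```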


This is not to say that you can never generalize, and indeed, often a big goal of statistics is to answer questions about populations using only information from a subset of individuals in that population.

All Models Are Wrong, Some are Useful

  • George Box

The argument I was making is that the NYT will authoritatively make conclusions without taking into account the individual, looking only at the population level, and not only is that oftentimes dubious, sometimes it's actively detrimental. They don't seem to me to do their due diligence in mitigating the risk that comes with such dubious assumptions, hence the cynic in me left that Hozier quote.

[–] [email protected] 4 points 22 hours ago

That's really interesting and I really appreciate you writing that out

[–] [email protected] 25 points 1 day ago (2 children)

Talking with an AI model is like talking with that one friend who is always high and thinks they know everything. But they have a wide enough set of interests that they can actually piece together an idea, most of the time wrong, about any subject.

[–] [email protected] 21 points 1 day ago

Isn't this called "the Joe Rogan experience"?

[–] [email protected] 3 points 1 day ago

I am sorry to say I can frequently be this friend...

[–] [email protected] 5 points 1 day ago

I have frequently seen GPT give a wrong answer to a question, get told that it's incorrect, and the bot fights with me and insists I'm wrong. And on other, less serious matters I've seen it immediately fold and take any answer I give it as "correct".

[–] multiplemigs 5 points 1 day ago* (last edited 1 day ago)

come on guys, the joke is right there.... 60% of the time it works, every time!

[–] [email protected] 4 points 1 day ago

Exactly my thoughts.

[–] [email protected] 164 points 2 days ago (2 children)

I love that this mirrors the experience of experts on social media like reddit, which was used for training chatgpt...

[–] [email protected] 67 points 2 days ago (2 children)
[–] [email protected] 13 points 1 day ago

i was going to post this, too.

The Gell-Mann amnesia effect is a cognitive bias describing the tendency of individuals to critically assess media reports in a domain they are knowledgeable about, yet continue to trust reporting in other areas despite recognizing similar potential inaccuracies.

[–] [email protected] 44 points 2 days ago* (last edited 2 days ago) (7 children)

Also common in news. There’s an old saying along the lines of “everyone trusts the news until they talk about your job.” Basically, the news is focused on getting info out quickly. Every station is rushing to be the first to break a story. So the people writing the teleprompter usually only have a few minutes (at best) to research anything before it goes live in front of the anchor. This means that you’re only ever going to get the most surface level info, even when the talking heads claim to be doing deep dives on a topic. It also means they’re going to be misleading or blatantly wrong a lot of the time, because they’re basically just parroting the top google result regardless of accuracy.

[–] [email protected] 7 points 1 day ago (1 children)

One thing I have found it to be useful for is changing the tone of what I write.

I tend to write very clinically because my job involves a lot of that style of writing. I have started asking ChatGPT to rephrase what I write in a softer tone.

Not for everything, but for example when I'm texting my girlfriend who is feeling insecure. It has helped me a lot! I always read through it to make sure it did not change any of the meaning or add anything, but so far it has been pretty good at changing the tone.

I also use it to rephrase emails at work to make them sound more professional.
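If you ever wanted to script the same trick instead of pasting into the chat app, it would look roughly like this with the OpenAI Python SDK (I just use the app; the model name here is only an example):

```python
# Rough sketch: "soften the tone" as a small function around the chat API.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def soften(text: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Rephrase the user's text in a warmer, softer tone. "
                        "Do not add or remove any information."},
            {"role": "user", "content": text},
        ],
    )
    return resp.choices[0].message.content

print(soften("Your report is late. Send it today."))
```

Either way, the "read it through before sending" step stays on you.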

[–] [email protected] 79 points 2 days ago (1 children)

First off, the beauty of these two posts being beside each other is palpable.

Second, as you can see in the picture, it's more like 60%.

[–] [email protected] 25 points 1 day ago (1 children)

No it's not. If you actually read the study, it's about AI search engines correctly finding and citing the source of a given quote, not general correctness, and not just the plain model

[–] [email protected] 29 points 1 day ago

Read the study? Why would i do that when there's an infographic right there?

(thank you for the clarification, i actually appreciate it)

[–] [email protected] 33 points 1 day ago (2 children)

If the standard is replicating human level intelligence and behavior, making up shit just to get you to go away about 40% of the time kind of checks out. In fact, I bet it hallucinates less and is wrong less often than most people you work with

[–] [email protected] 3 points 1 day ago

I mainly use it for fact-checking sources from the internet and looking for bias. I double-check everything, of course. Beyond that it's good for rule checking for MTG Commander games, and deck building. I mainly use it for its search function.

[–] [email protected] 3 points 1 day ago

does chat gpt have ADHD?

[–] [email protected] 3 points 1 day ago

same with every documentary out there

[–] [email protected] 2 points 1 day ago

Exactly. This is why I have a love/hate relationship with just about any LLM.

I love it most for generating code samples (small enough that I can manually check them, not entire files/projects) and re-writing existing text, again small enough to verify everything. Common theme being that I have to re-read its output a few times, to make 100% sure it hasn't made some random mistake.

I'm not entirely sure we're going to resolve this without additional technology outside of the LLM itself.

[–] [email protected] 9 points 1 day ago (1 children)

I use chatgpt as a suggestion. Like an aid to whatever it is that I’m doing. It either helps me or it doesn’t, but I always have my critical thinking hat on.

[–] [email protected] 2 points 1 day ago

Same. It's an idea generator. I asked it what kind of pie I should make. I saw one I liked and then googled a real recipe.

I needed a SQL query for work. It gave me different methods of optimization. I then googled those methods, implemented them, and tested the result.
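For anyone wondering what "tested it" looks like in practice, here's a toy version of that workflow (SQLite and a made-up table, not my actual work query): time the lookup before and after adding the index the suggestion called for.

```python
# Verify an optimization suggestion by measuring it, not by trusting it.
import sqlite3, time, random

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
con.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(random.randint(1, 10_000), random.random() * 100) for _ in range(200_000)],
)

def timed_lookup() -> float:
    start = time.perf_counter()
    con.execute("SELECT SUM(total) FROM orders WHERE customer_id = ?", (42,)).fetchone()
    return time.perf_counter() - start

before = timed_lookup()  # full table scan
con.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = timed_lookup()   # index lookup
print(f"full scan: {before*1000:.1f} ms, with index: {after*1000:.1f} ms")
```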

[–] [email protected] 14 points 1 day ago (12 children)

I did a Google search to find out how much I pay for water; the water department where I live bills by the MCF (1,000 cubic feet). The AI Overview told me an MCF was one million cubic feet. It's a unit of measurement. It's not subjective, not an opinion, and AI still got it wrong.
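For the record, the correct conversion is trivial (the usage and rate numbers below are made up, just to show the scale):

```python
# MCF really is a thousand cubic feet (Roman numeral M = 1,000), not a million.
CUBIC_FEET_PER_MCF = 1_000
GALLONS_PER_CUBIC_FOOT = 7.48052   # US gallons

usage_mcf = 0.8          # example monthly meter reading
rate_per_mcf = 45.00     # hypothetical $/MCF, not any real utility's rate

gallons = usage_mcf * CUBIC_FEET_PER_MCF * GALLONS_PER_CUBIC_FOOT
print(f"{usage_mcf} MCF ≈ {gallons:,.0f} gallons, billed at ${usage_mcf * rate_per_mcf:.2f}")
```

If it really were a million cubic feet, that bill would be a thousand times bigger.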

[–] [email protected] 2 points 1 day ago

I just think you need an abbreviations chart.

[–] [email protected] 4 points 1 day ago

Shouldn't it be kcf? Or tcf if you're desperate to avoid standard prefixes?
