this post was submitted on 05 Sep 2023

324 points (98.8% liked)

Professor Caught Using ChatGPT When Scientific Paper Was Full of Errors (futurism.com)

submitted 1 year ago by [email protected] to c/[email protected]

A biologist was shocked to find his name was mentioned several times in a scientific paper, which references papers that simply don't exist.

[–] [email protected] 94 points 1 year ago (1 children)

Brandolini's law, aka the "bullshit asymmetry principle": the amount of energy needed to refute bullshit is an order of magnitude bigger than that needed to produce it.

Unfortunately, with the advent of large language models like ChatGPT, the quantity of bullshit being produced is accelerating and is already outpacing the ability to refute it.

[–] [email protected] 14 points 1 year ago (3 children)

I'm curious to see if AI tech can actually help fight some of the bullshit out there someday. I agree that current AI is only making it easier to produce bullshit, but I think with some advances it could be used to parse a long-winded batch of bullshit and summarize it, maybe with bullet points about how the source material is wrong. If they can make an AI as confident as ChatGPT, but without as much of the "makes stuff up left and right" tendency, it could be useful.

THEN we just have to worry about who owns the AI that parses and summarizes the info we take in, and what kind of biases they've baked into the tech...

[–] [email protected] 10 points 1 year ago

I'm curious to see if AI tech can actually help fight some of the bullshit out there someday.

It is one of the most difficult problems on earth: telling lies from truth.

And then think about the fine line when detecting irony, half-irony or other forms of humorous non-truth.

[–] [email protected] 3 points 1 year ago

I have high hopes for concepts like Toolformer where the model has to learn to use external APIs and resources like Wikipedia or Wolfram to get answers, rather than relying on the inscrutable and garbled soup of knowledge absorbed from the text training corpus directly. Systems plugged into knowledge graphs could have the best of both worlds - able to generate well-written novel text outputs AND the added rigor of "classical AI" style interpretability.

[–] [email protected] 1 points 1 year ago

I’m curious to see if AI tech can actually help fight some of the bullshit out there

Those AI are the best ones to produce fake scientific papers. It's a cat and mouse game again. Those who can detect bullshit can produce the best bullshit.

[–] [email protected] 46 points 1 year ago* (last edited 1 year ago) (2 children)

As someone who just submitted an article for review I am gobsmacked by how brazenly the authors have done this. The absolute disregard for integrity and the knowledge production process is astounding. But also the balls to just submit a paper like this without fear of the consequences says something more profound about the state of academia.

[–] [email protected] 2 points 1 year ago

I recently watched an interview with an academic who trolled academia by putting completely wrong things in her papers, and yet they were approved. Academia and scientific data are not as objective as they may seem from the outside.

[–] [email protected] 32 points 1 year ago* (last edited 1 year ago) (3 children)

Stupid question: Why can't journals just mandate an actual URL link to a study on the last page, or the exact issue something was printed in? Surely both of those would be easily confirmable, and both would be easy for a scientist using "real" sources to source (since they must have access to it themselves already).

Like, it feels silly to me that high school teachers require this sort of thing, yet scientific journals do not?

[–] [email protected] 43 points 1 year ago (1 children)

Because scientific journals exist to profit off science, not bolster it. Fact checking costs money so they do the bare minimum they deem necessary to preserve their reputation.

[–] [email protected] 12 points 1 year ago

It's always greed

[–] [email protected] 23 points 1 year ago

Many of the journals I've published in do require a link, usually a PMID or DOI, but checking those links isn't usually part of the review process. That is, one doesn't expect academic content reviewers to validate each of the citations, but it's not unreasonable to imagine a journal having an automated validator. The review process really isn't structured to detect fraud. It looks like the article in question was in the preprint stage (i.e., not even reviewed yet) and I didn't notice any mention of where it was submitted.

Message here should be that the process works and the fake article never got published. Very different than the periodic stories about someone who submits a blatantly fake, but hand written, article to a bullshit journal and gets published.
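For what it's worth, the first pass of an automated validator like the one imagined above could be quite small. Here is a rough Python sketch, under loud assumptions: the reference format (a dict with optional `title`, `doi` and `pmid` keys), the function names and the two patterns are all made up for illustration, and it only checks that an identifier is present and has a plausible shape. It does not prove the cited paper exists; a real validator would also have to look each identifier up in a DOI or PubMed registry, and someone would still need to check that the source says what the authors claim.

```python
import re

# Shape-only patterns. These are assumptions for this sketch,
# not any journal's actual submission rules.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")
PMID_PATTERN = re.compile(r"^\d{1,8}$")


def identifier_problem(reference):
    """Return None if the reference has a plausible DOI or PMID, else a reason."""
    doi = (reference.get("doi") or "").strip()
    pmid = (reference.get("pmid") or "").strip()
    if not doi and not pmid:
        return "no DOI or PMID given"
    if doi and not DOI_PATTERN.match(doi):
        return "malformed DOI"
    if pmid and not PMID_PATTERN.match(pmid):
        return "malformed PMID"
    return None


def flag_references(references):
    """Return (title, reason) pairs for references that need a human look."""
    flagged = []
    for ref in references:
        reason = identifier_problem(ref)
        if reason is not None:
            flagged.append((ref.get("title", "<untitled>"), reason))
    return flagged
```

Anything it flags would go to a human editor rather than being rejected automatically, since older or obscure but perfectly real sources may have no DOI or PMID at all.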

[–] [email protected] 7 points 1 year ago (1 children)

Well, that used to be a thing called a bibliography, but it appears that these journals don't require one. Funny when even my old 7th-grade essays required those.

[–] [email protected] 4 points 1 year ago (1 children)

Of course they do. How do you think fake references were included if references were not needed?

[–] [email protected] 1 points 1 year ago (1 children)

Citing sources by name rather than providing full links/ISBNs/etc?

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago) (1 children)

Ah! "Bibliography" is an ambiguous term.

As the linked article says, one measure that journals are starting to adopt is requiring DOI or PMID links for each reference. It ought to be standard anyway, it's much less work for reviewers to check the references if they're easy to find. Even if they exist, they often don't say what the authors cite them as saying. But journals don't pay anyone for checking these things so it often doesn't get done. Peer review needs to be paid for. For-profit journals need to die.

[–] [email protected] 2 points 1 year ago

Yeah, that's fair. Since Covid I've noticed that a bunch of the more vocal opponents online like to pick actual scientific articles and quote small sections way out of context in order to support their "view". It's like using scientific articles for anti-science. They pull that shit repeatedly and piss people off, then report anyone who gets a bit too loud in their response. Seems to be a whole playbook these days.

[–] [email protected] 11 points 1 year ago

This is the best summary I could come up with:

As Retraction Watch reports, Natural History Museum of Denmark myriapodologist Henrik Enghoff suspected the authors of the paper from China and Africa used OpenAI's ChatGPT to dig up academic references — and as it turns out, his hunch was right.

The offending paper was initially taken down by Preprints.org, a preprint archive run by the academic publisher MDPI, in June after Enghoff's colleague, the University of Copenhagen's David Richard Nash, notified editors of the errors.

Earlier this year, reporters at The Guardian noticed that the AI chatbot even made up entire articles with bylines of journalists who had never written these non-existent pieces.

"We will withdraw it immediately and add the authors of this preprint to our blacklist," Preprints.org's editor Lloyd Shu told Nash in an email back in June.

Kahsay Tadesse Mawcha of Aksum University in Ethiopia, who was originally listed as a corresponding author on the offending preprint, admitted to Danish newspaper Weekendavisen back in July that he indeed used ChatGPT, adding that he only realized later that the tool was "not recommended" for the task.

Powerful but flawed AI tools like ChatGPT are a bull in a china shop of almost every knowledge domain, academia included — and it'll be fascinating to watch everybody involved try to find a new sense of equilibrium.

The original article contains 547 words, the summary contains 214 words. Saved 61%. I'm a bot and I'm open source!

[–] [email protected] 6 points 1 year ago (1 children)

Aren't papers peer reviewed? Or are they getting ChatGPT to do that too?

Impose harsher consequences for falsified information?

[–] [email protected] 1 points 1 year ago

The offending paper was initially taken down by Preprints.org, a preprint archive run by the academic publisher MDPI

[–] [email protected] 2 points 1 year ago

This all sounds very House of Leaves.

[–] [email protected] -4 points 1 year ago (2 children)

Assuming this is carelessness, this just goes to show that working in academia isn't an indicator of critical thinking skills IMO

[–] [email protected] 8 points 1 year ago (1 children)

Your assumption is wrong. This was not carelessness. Academic dishonesty and lack of integrity are ongoing issues in research. China is one of the biggest culprits for blatant plagiarism and IP theft, although recently even academics from Ivy league universities have been implicated in fraudulent publications. The simple fact is that the number of publications is the main metric used in academia for hiring and promotion. This leads to a perverse incentive model where academics prioritise publishing over conducting good science, so all we get is a shitload of noise (poor articles) that obscures the signal (good articles).

[–] [email protected] -3 points 1 year ago* (last edited 1 year ago) (1 children)

China is one of the biggest culprits for blatant plagiarism and IP theft, although recently even academics from Ivy league universities have been implicated in fraudulent publications.

Sure, let's make this about China when 4 out of 5 of the authors credited for the original article are from Africa and only one is from China.

This doesn't even address the fact that the republished paper, which describes a study on millipedes in... Africa, came from Mawcha. Guess what, Wenxiang Yang wasn't even credited in this version. Was your reply carelessness, or dishonesty and lack of integrity? I don't care where the misinformation and carelessness come from as long as we're making efforts to stop them, but this is highly ironic.

[–] [email protected] 2 points 1 year ago (1 children)

In academic publishing you look at the order of authors and the author contribution statement to determine the hierarchy of the research group. In this case the Chinese author is the most senior, and was the member who approved the submission. In niche areas such as this, most senior academics will know most of the relevant authors and literature. Thus carelessness is too kind a word where negligence and lack of integrity would be more fitting.

Further, with regard to the primary author, my assertion still stands: it was not carelessness but rather brazen academic misconduct, as demonstrated by the resubmission (not republication as you suggest).

[–] [email protected] 2 points 1 year ago* (last edited 1 year ago)

FWIW, last author is not automatically most senior. That is the way some fields do it, but others do it strictly by amount contributed to the paper. I have been both first and last author on different papers during my first post-doc.

[–] [email protected] 4 points 1 year ago (2 children)

Honestly, I bet he has the skills, he just didn't use them because he didn't care, or is overworked, or for whatever reason.

[–] Kerfuffle 6 points 1 year ago (1 children)

A lot of people don't understand the limitations/weaknesses of AI. The carelessness was probably more in not actually learning about the tool he was relying on (and just assuming it was reliable information).

[–] [email protected] 2 points 1 year ago

It's like the aeroplane lawyer case some time ago. People treat the computer as an arbiter of truth, and/or think checking is just asking the chatbot "Did you use a real citation for this?".

[–] [email protected] 4 points 1 year ago

You make a valid point, and there are certainly more considerations than my original reply would lead one to believe. Cheers.