The ability of large language models (LLMs) to create text and images almost indistinguishable from those created by humans is disrupting, if not revolutionizing, countless fields of human activity. Yet the potential for misuse is already manifest, from academic plagiarism to the mass generation of misinformation.
This week, Sumanth Dathathri at DeepMind, Google’s AI research lab in London, and his colleagues report their test of a new approach to ‘watermarking’ AI-generated text by embedding a ‘statistical signature’, a form of digital identifier, that can be used to certify the text’s origin. The word watermark comes from the era of paper and print, and describes a variation in paper thickness, not usually immediately obvious to the naked eye, that does not change the printed text. A watermark in digitally generated text or images should be similarly invisible to the user — but immediately evident to specialized software.
Dathathri and his colleagues’ work represents an important milestone for digital-text watermarking. But there is still some way to go before companies and regulators will be able to confidently state whether a piece of text is the product of a human or a machine. Given the imperatives to reduce harm from AI, more researchers need to step up to ensure that watermarking technology fulfils its promise.
This is the official paper and the Nature News report related to the SynthID project. Basically, it's "watermarking" so that model outputs can be easily recognized computationally... but it's not foolproof. Especially relevant in light of AI regulations. The news article itself contains some interesting opinions.
Paper: https://doi.org/10.1038/s41586-024-08025-4
GitHub repo: https://github.com/google-deepmind/synthid-text
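For anyone wondering what a "statistical signature" looks like in practice, here is a toy sketch. To be clear, this is NOT the method from the paper or the code in the repo. It is a much simpler "green list" style scheme, and every name in it (`is_green`, `generate`, `z_score`, `rewrite`, the key, the fake vocabulary) is made up for illustration. The idea: a secret key marks roughly half of all (previous token, next token) pairs as "green", the generator quietly prefers green pairs, and the detector checks whether a text has more green pairs than chance would give.

```python
import hashlib
import math
import random


def is_green(key: str, prev_token: str, token: str) -> bool:
    """Keyed hash that puts roughly half of all (prev, token) pairs in a 'green' set."""
    digest = hashlib.sha256(f"{key}|{prev_token}|{token}".encode()).digest()
    return digest[0] % 2 == 0


def generate(key, vocab, length, rng, watermark=True):
    """Stand-in for a language model: picks each token from 4 random candidates."""
    tokens = ["<s>"]
    for _ in range(length):
        candidates = rng.sample(vocab, 4)
        if watermark:
            # Prefer green candidates; fall back to any candidate if none is green.
            green = [t for t in candidates if is_green(key, tokens[-1], t)]
            candidates = green or candidates
        tokens.append(candidates[0])
    return tokens[1:]


def z_score(key, tokens):
    """How many standard deviations the green count sits above the 50% chance level."""
    prev, hits = "<s>", 0
    for t in tokens:
        hits += is_green(key, prev, t)
        prev = t
    n = len(tokens)
    return (hits - 0.5 * n) / math.sqrt(0.25 * n)


def rewrite(tokens, vocab, fraction, rng):
    """Crude stand-in for paraphrasing: randomly replace a fraction of the tokens."""
    return [rng.choice(vocab) if rng.random() < fraction else t for t in tokens]


if __name__ == "__main__":
    rng = random.Random(0)
    vocab = [f"w{i}" for i in range(50)]  # hypothetical 50-word vocabulary
    marked = generate("secret", vocab, 200, rng, watermark=True)
    plain = generate("secret", vocab, 200, rng, watermark=False)
    print("watermarked:", z_score("secret", marked))
    print("unwatermarked:", z_score("secret", plain))
    print("after rewriting half:", z_score("secret", rewrite(marked, vocab, 0.5, rng)))
```

In this toy setup the watermarked text should score far above chance and the unwatermarked text should not, while replacing half the tokens pulls the score back down a long way. That last part is the "not foolproof" problem in miniature: heavy editing or paraphrasing erodes the signal.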
And that's assuming people are using a model specifically designed with watermarking in the first place. In practice, this will only affect the absolute dumbest adversaries. It won't apply at all to open source or custom-built tools. Any additional step in a workflow is going to wash this right out either way.
My fear is that regulators will try to ban open models because they can't possibly control them. That wouldn't actually work, of course, but it might sound good enough for an election campaign, and I'm sure Microsoft and Google would dump a pile of cash on their doorstep for it.
I would be in favor of open models watermarking by default if they could make it unobtrusive. An opt-out watermarking scheme would honestly handle most bad actors, in my view.
Bad guys are normally not the brightest. Yes, even big, expensive bad actors.
Making it the norm also makes it a red flag if content has other markers but seems intentionally obfuscated.