OpenAI undecided on text watermarking release

Has been sitting for months on a highly accurate solution.

Paul Mah

10 Aug 2024 — 1 min read

Photo Credit: Unsplash/Veronica Benavides

OpenAI has created a highly accurate text watermarking solution. However, it's hesitating to release it.

Following an exposé by the Wall Street Journal, OpenAI has confirmed the development of a text watermarking system that's 99.9% accurate.

Despite the system being ready for months, OpenAI is still of two minds about releasing it.

Did you write it yourself?

The technology works by adjusting how the model predicts the most likely words and phrases, creating a detectable pattern that OpenAI can validate.

What we know about it:

Resistant to paraphrasing.
Unnoticeable to humans.
Low false positive rate.
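OpenAI has not published how its system works, so the following is only a toy sketch of the general idea of biasing word choice to leave a statistical pattern. It borrows the "green list" approach described in academic watermarking research, not OpenAI's method. The vocabulary, the `is_green`, `generate` and `z_score` helpers, and the random "model" are all made up for illustration.

```python
import hashlib
import math
import random

# Hypothetical toy vocabulary; a real model has tens of thousands of tokens.
VOCAB = [f"w{i}" for i in range(1000)]


def is_green(prev_token: str, token: str) -> bool:
    """Pseudo-randomly put about half of all tokens on a 'green list',
    with the split keyed on the previous token."""
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return digest[0] % 2 == 0


def generate(length: int, watermark: bool, seed: int = 0) -> list[str]:
    """Stand-in for a language model: picks random words.
    With watermark=True it only ever emits green-listed words."""
    rng = random.Random(seed)
    tokens = ["<s>"]
    while len(tokens) <= length:
        candidate = rng.choice(VOCAB)
        if watermark and not is_green(tokens[-1], candidate):
            continue  # reject red-listed words
        tokens.append(candidate)
    return tokens[1:]


def z_score(tokens: list[str]) -> float:
    """How far the green-word count is from the 50% expected by chance,
    measured in standard deviations."""
    prev, green = "<s>", 0
    for tok in tokens:
        green += is_green(prev, tok)
        prev = tok
    n = len(tokens)
    return (green - 0.5 * n) / math.sqrt(0.25 * n)


if __name__ == "__main__":
    print("watermarked z-score:", z_score(generate(200, watermark=True)))
    print("plain z-score:      ", z_score(generate(200, watermark=False)))
```

In this sketch every watermarked word is green, so 200 words give a z-score of about 14, while unwatermarked text should hover near zero. A detector would flag text only above a high threshold, which is how such schemes keep false positives low. A real system would nudge probabilities rather than ban words outright, so the text still reads naturally.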

Trivial to overcome

The ability to detect AI-generated text could be a boon for teachers trying to identify AI-written assignments.

But despite being resistant to manual tampering, it is trivial to overcome with techniques such as:

Using another LLM to reword it.
Putting it through a translation system.
Giving the model special instructions.

Still debating

OpenAI says it is "still debating" whether to release its text watermarking system due to several concerns.

It could stigmatise the use of AI by non-native users.
It can be trivial to overcome.

I think the key reason is probably more mundane: Users might switch to a rival LLM so they won't get caught using AI.

Humans required

A new study published two weeks ago once again concluded that AI models will eventually collapse and spew gibberish when trained recursively on AI-generated data.

That's likely why OpenAI continued developing its text watermarking solution after initially scrapping it, and it indicates that AI firms have a vested interest in detecting AI-generated works.

On a happy note, this means that human-created content is still needed to train better AI models. So, we might all get to keep our jobs – for the time being.