OpenAI has had a system for watermarking text created with ChatGPT, along with a tool for detecting the watermark, ready for about a year, The Wall Street Journal reports. But the company is internally divided over whether to release it: on the one hand, it seems like the responsible thing to do; on the other, it could hurt the company's bottom line.
OpenAI's watermarking is described as adjusting how the model predicts which words or phrases are most likely to follow the previous ones, creating a detectable pattern (that's a simplification, but for more detail, see Google's more in-depth explanation of Gemini text watermarking).
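OpenAI hasn't published the details of its scheme, but the general idea behind statistical text watermarking can be sketched in a few lines. The toy example below is a hypothetical, simplified scheme, not OpenAI's actual method: it seeds a hash with the previous token, marks a pseudo-random "green" subset of the vocabulary, nudges the model's next-token scores toward green tokens, and detects the watermark by checking whether green tokens show up far more often than chance.

```python
import hashlib
import random

# Toy vocabulary standing in for the model's full token set.
VOCAB = ["the", "cat", "sat", "on", "a", "mat", "dog", "ran", "fast", "slowly"]

def green_list(prev_token, fraction=0.5):
    """Seed a PRNG with the previous token and mark a pseudo-random
    subset of the vocabulary as 'green' (watermark-favored)."""
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16)
    rng = random.Random(seed)
    shuffled = VOCAB[:]
    rng.shuffle(shuffled)
    return set(shuffled[: int(len(shuffled) * fraction)])

def pick_next(prev_token, candidate_scores, bias=2.0):
    """Generation: add a small bias to green tokens before picking
    the highest-scoring candidate, creating a detectable pattern."""
    greens = green_list(prev_token)
    return max(candidate_scores,
               key=lambda tok: candidate_scores[tok] + (bias if tok in greens else 0.0))

def green_fraction(tokens):
    """Detection: the share of tokens that fall in the green list of
    their predecessor. Watermarked text lands well above the roughly
    0.5 expected from unwatermarked text."""
    hits = sum(tok in green_list(prev) for prev, tok in zip(tokens, tokens[1:]))
    return hits / max(len(tokens) - 1, 1)

# Example: made-up model scores slightly favor "dog", but the watermark
# bias may tip the choice toward a green token instead.
print(pick_next("the", {"cat": 1.0, "dog": 1.2, "mat": 0.8}))
print(green_fraction(["the", "cat", "sat", "on", "the", "mat"]))
```

Because detection only needs the hashing rule rather than the model itself, a detector can be shipped as a separate tool; the flip side is that rewording the text with a different model scrambles the pattern, which is the circumvention concern OpenAI raises below.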
Providing any way to detect AI-written material is a potential boon for teachers trying to deter students from handing writing assignments over to AI. Per the Journal, the company found that watermarking did not affect the quality of the chatbot's text output, and a survey it commissioned found that "people around the world supported the idea of AI detection tools by a 4-to-1 margin."
After the Journal's story ran, OpenAI confirmed in a blog post update, spotted by TechCrunch, that it has been working on text watermarking. In it, the company says its method is highly accurate ("99.9% effective," according to documents the Journal saw) and resistant to "tampering, such as paraphrasing." But it says bad actors can easily circumvent it using techniques such as rewording the text with a different model. The company also says it's concerned about bias, such as stigmatizing the use of AI as a writing tool for non-native English speakers.
But OpenAI also seems concerned that watermarking could turn off ChatGPT users: roughly 30 percent of those it surveyed told the company they would use the software less if watermarking were implemented.
Nonetheless, some employees reportedly still feel the watermarking is effective. In light of lingering user sentiment, though, the Journal says some have suggested trying methods that are "less controversial among users, but unproven." In its blog post update, the company said it is in the "early stages" of exploring embedding metadata instead. It's still "too early" to know how well that will work, but because the metadata is cryptographically signed, there would be no false positives.
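The blog post doesn't explain how the metadata signing would work, but the "no false positives" claim follows from how cryptographic verification behaves in general: a signature either checks out against the exact text or it doesn't, with no statistical threshold involved. Here is a minimal sketch assuming a simple HMAC-based tag; the key and function names are illustrative assumptions, not OpenAI's design.

```python
import hmac
import hashlib

SECRET_KEY = b"provider-held-secret"  # hypothetical key known only to the text's provider

def sign_text(text):
    """Produce a metadata tag (an HMAC here) bound to the exact text."""
    return hmac.new(SECRET_KEY, text.encode(), hashlib.sha256).hexdigest()

def verify_text(text, tag):
    """Verification either matches or it doesn't -- no statistical
    threshold, hence no false positives. An edited text (or a stripped
    tag) simply comes back as 'not verified'."""
    return hmac.compare_digest(sign_text(text), tag)

tag = sign_text("This paragraph was generated by a chatbot.")
print(verify_text("This paragraph was generated by a chatbot.", tag))  # True
print(verify_text("This paragraph was edited afterward.", tag))        # False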