A few months ago, my doctor showed off the AI transcription tool he uses to record and summarize patient interviews. In my case, the summary was fine, but researchers cited by ABC News have found that's not always the case with OpenAI’s Whisper, which powers tools used in many hospitals. Sometimes it just makes things up entirely.
Whisper is used by a company called Nabla for a medical transcription tool that it estimates has transcribed 7 million medical conversations, according to ABC News. More than 30,000 clinicians and 40 health systems use it, the outlet writes. Nabla is reportedly aware that Whisper can hallucinate and is “working on the issue.”
A group of researchers from Cornell University, the University of Washington, and other institutions found in a study that Whisper hallucinated in about 1 percent of transcriptions, at times making up entire sentences with violent sentiments or gibberish phrases during silences in the recordings. The researchers, who collected audio samples from TalkBank’s AphasiaBank as part of their study, noted that silences are especially common when people with a language disorder called aphasia speak.
One of the researchers, Allison Koenecke of Cornell University, posted an example in a thread about the research.
Researchers found that the hallucinations also included fabricated medical conditions and phrases you might expect from a YouTube video, such as “Thank you for watching!” (OpenAI reportedly transcribed over 1 million hours of YouTube videos to train GPT-4.)
The research was presented in June at the Association for Computing Machinery FAccT conference in Brazil. It is unclear whether it has been peer-reviewed.
OpenAI spokesperson Taya Christianson emailed a statement to The Verge:
We take this issue seriously and are continually working on improvements, including reducing hallucinations. Regarding the use of Whisper in our API platform, our usage policy prohibits its use in certain high-risk decision-making situations, and our model card for open-source use includes recommendations against use in high-risk domains. We would like to thank the researchers for sharing their findings.