Complete silence is always hallucinated as "ترجمة نانسي قنقر" in Arabic which translates as "Translation by Nancy Qunqar" #2608

puthre · 2025-06-13T14:44:28Z

If you generate a WAV file of complete silence and run Whisper on it, it always hallucinates the same thing:

ffmpeg -f lavfi -i anullsrc=r=44100:cl=stereo -t 30 silence.wav

whisper ./silence.wav --language Arabic --model large-v3
[00:00.000 --> 00:29.980] ترجمة نانسي قنقر

It seems that the model learned to interpret silence as ترجمة نانسي قنقر in Arabic.
Is there any way to fix or circumvent this?
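
For readers without ffmpeg, here is a minimal sketch that writes an equivalent silent file using only the Python standard library. The function name `write_silence` and its defaults are illustrative assumptions chosen to match the ffmpeg command above (30 seconds, 44100 Hz, stereo); it is not part of Whisper.

```python
import wave


def write_silence(path, seconds=30, sample_rate=44100, channels=2):
    """Write `seconds` of digital silence as a 16-bit PCM WAV file.

    Hypothetical helper for reproducing the report; returns the frame count.
    """
    n_frames = int(seconds * sample_rate)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        # Every sample is zero, so the payload is just zero bytes.
        wav.writeframes(bytes(n_frames * channels * 2))
    return n_frames
```

Calling `write_silence("silence.wav")` should produce a file that can be passed to the `whisper` command shown above.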

misutoneko · 2025-06-14T10:58:17Z

VAD (voice activity detection), probably.
I've only tried the turbo one, but what I can say is that v3 is different from the earlier models.
It looks like it doesn't have the audio descriptions to fall back on and produces hallucinations instead.
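
To illustrate the VAD idea, here is a very crude energy gate that skips a file before it ever reaches Whisper. This is a sketch under stated assumptions, not a real VAD: `is_silent` and the `rms_threshold` value are hypothetical, it only handles 16-bit PCM WAV input, and it judges the whole file at once rather than per window. A dedicated VAD model would be far more robust on real recordings.

```python
import array
import sys
import wave


def is_silent(path, rms_threshold=100.0):
    """Return True if the overall RMS of a 16-bit PCM WAV is below the threshold.

    Hypothetical helper; the threshold is an arbitrary illustrative value.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = wav.readframes(wav.getnframes())

    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    if not samples:
        return True  # an empty file contains no speech
    mean_square = sum(s * s for s in samples) / len(samples)
    return mean_square ** 0.5 < rms_threshold
```

A caller could check `is_silent(path)` first and return an empty transcript instead of running the model.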

The earlier models will also produce some miscellaneous crap when they encounter silence
(they do this regardless of language), but there are more options for how to deal with that.

For example, these things can be effective for the small model (but not for v3):

the suppress_tokens trick
setting initial prompt to something like "."
adjusting logprob_threshold to -0.4 (works for this empty audio, probably not good for general use)
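
To show how the `logprob_threshold` adjustment interacts with the no-speech probability, here is a rough sketch that applies the same kind of rule as a post-filter on the segments Whisper returns. It assumes each segment is a dict with `no_speech_prob` and `avg_logprob` keys, as in the `segments` list of a transcription result; the function name `drop_probable_silence` is hypothetical. Whisper applies its own check per decoding window, so this only approximates that behavior, and as noted above it may not help with v3.

```python
def drop_probable_silence(segments, no_speech_threshold=0.6, logprob_threshold=-1.0):
    """Drop segments that look like silence AND were decoded with low confidence.

    Hypothetical post-filter. A segment is dropped only when both hold:
      - no_speech_prob is above no_speech_threshold
      - avg_logprob is at or below logprob_threshold
    """
    kept = []
    for seg in segments:
        looks_silent = seg["no_speech_prob"] > no_speech_threshold
        low_confidence = seg["avg_logprob"] <= logprob_threshold
        if looks_silent and low_confidence:
            continue
        kept.append(seg)
    return kept
```

Raising `logprob_threshold` to -0.4, as suggested above, makes the filter stricter: a segment with a high no-speech probability now needs a higher average log-probability to survive, which is also why it risks discarding real speech in general use.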

Navanit-git · 2025-07-08T04:51:39Z

Is there any good Arabic model you have found that is better than large-v3?
@misutoneko @puthre

1 reply

moadel321 Jul 22, 2025

Voxtral was released a few days ago and looks promising

rjb729951 · 2025-07-17T12:08:21Z

I found a similar thing happens in German where it says
"Untertitelung des ZDF für funk, 2017."

For both German and Arabic I found that this pretty much only happens at the very end of videos / when there is sustained silence.

KillerX · 2025-07-22T06:52:14Z

Essentially this seems to be an artifact of the fact that Whisper was trained on (amongst other things) YouTube audio plus the available subtitles. Subtitlers often add their copyright notice at the end of the subtitles, and the ends of videos are often credits with music, applause, or silence. Thus Whisper learned that silence == "copyright notice".

See some research for the Norwegian example here:

https://medium.com/@lehandreassen/who-is-nicolai-winther-985409568201

qpwo · 2025-07-22T06:56:22Z

In English there is always applause

iodize6399 · 2025-07-22T07:21:46Z

This also happens when you don't speak into the voice mode: the transcript usually comes out as the same Arabic phrase.

dharmab · 2025-07-22T07:24:06Z

I've also seen this happen a lot in English with Skyeye:

It also happens a lot with hallucinations saying stuff like "This is the end of the video, remember to like and subscribe"

AhmedGMurtaza · 2025-07-22T07:27:47Z

1 reply

nyxiereal Jul 22, 2025

Ok? This doesn't have anything to do with the topic of this discussion

ei23fxg · 2025-07-22T09:07:05Z

In German it's "Vielen Dank" (Thank you very much)

lloydjatkinson · 2025-07-22T09:40:10Z

This has been a problem since at least February 2024: https://x.com/SheriefFYI/status/1756694995241951398

alentodorov · 2025-07-22T11:11:13Z

In Romanian, I've noticed multiple instances where the transcript ends with “nu uitati sa da-ti like si subscribe”, which, as you might easily infer, translates to “don’t forget to like and subscribe”.

1 reply

andrewmccafferty Jul 22, 2025

taf2 · 2025-07-22T13:57:08Z

Interestingly, Google Translate renders this as "Translated by Nancy Kangar"

1 reply

rany2 Jul 22, 2025

It gets it right if you set the source language to Arabic.

abdussamadbello · 2025-07-22T15:08:44Z

You can either fine-tune the model or filter the response from Whisper:

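# Example transcript text returned by Whisper
text = "helo helo hello ."
target_phrase = "ترجمة نانسي قنقر"
replacement = ""

# str.replace removes every occurrence of the hallucinated phrase
updated_text = text.replace(target_phrase, replacement)

print(updated_text)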
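
A slightly more defensive variant of the filtering idea drops whole segments whose text is exactly a known hallucination, so that a legitimate sentence containing the same words is left alone. This is only a sketch: the blocklist holds two phrases reported in this thread, the names `KNOWN_HALLUCINATIONS` and `drop_known_hallucinations` are hypothetical, and segments are assumed to be dicts with a `text` key.

```python
import unicodedata

# Phrases reported in this thread; extend per language as needed.
KNOWN_HALLUCINATIONS = {
    "ترجمة نانسي قنقر",
    "Untertitelung des ZDF für funk, 2017.",
}


def _normalize(text):
    """Normalize Unicode form, surrounding whitespace, and case for comparison."""
    return unicodedata.normalize("NFKC", text).strip().casefold()


_BLOCKLIST = {_normalize(phrase) for phrase in KNOWN_HALLUCINATIONS}


def drop_known_hallucinations(segments):
    """Keep only segments whose entire text is not a blocklisted phrase."""
    return [seg for seg in segments if _normalize(seg["text"]) not in _BLOCKLIST]
```

Matching the whole segment is a deliberate choice here: it is stricter than substring replacement, so it misses hallucinations embedded in longer segments but is less likely to delete real speech.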

sherief · 2025-07-22T15:38:07Z

ChatGPT voice mode is also affected by this fwiw: https://x.com/SheriefFYI/status/1929129956153377144

1 reply

abdussamadbello Jul 22, 2025

Other languages don't get as much support as English during the data annotation and fine-tuning stages of most models
