ASR incorrectly received the TTS voice and repeating ... #379

Open
phoenixdna opened this issue Sep 27, 2024 · 0 comments

I am finding it quite frustrating to combine TTS with Azure's ASR. For some reason, the TTS output is still picked up by ASR, even when I mute the microphone with PyAudio. Could someone please help?

Code snippet

# Imports assumed by this snippet.
import time

import azure.cognitiveservices.speech as speechsdk

# mute_microphone(), unmute_microphone() and speech_recognizer are defined elsewhere in the program.

def text_to_speech(text):
    global asr_active

    speech_config2 = speechsdk.SpeechConfig(subscription='xxx', region='eastasia')
    audio_config2 = speechsdk.audio.AudioOutputConfig(use_default_speaker=True)

    # Synthesize with the zh-CN-XiaoyiNeural voice and play through the default speaker.
    speech_config2.speech_synthesis_voice_name = 'zh-CN-XiaoyiNeural'
    speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config2, audio_config=audio_config2)

    # Flag ASR as inactive and mute the microphone before speaking.
    asr_active = False
    mute_microphone()
    #speech_recognizer.stop_continuous_recognition()
    time.sleep(1)
    print("《《《《《《《《《《《tts=>>>",text)
    # .get() blocks until the synthesis result is available.
    speech_synthesis_result = speech_synthesizer.speak_text_async(text).get()
    print("tts=>》》》》》》》》》》》》》》》》》>>")

    # Wait a moment, then re-enable ASR and unmute the microphone.
    time.sleep(1)
    asr_active = True
    unmute_microphone()

    #speech_recognizer.start_continuous_recognition()

System output

final response recived
model response: 很抱歉,我无法提供实时的日期信息。请您查看您的电子设备或询问您的语音助手来获取今天的日期。
microphone muted
《《《《《《《《《《《tts=>>> 很抱歉,我无法提供实时的日期信息。请您查看您的电子设备或询问您的语音助手来获取今天的日期。
tts=>》》》》》》》》》》》》》》》》》>>
Speech synthesized for text [很抱歉,我无法提供实时的日期信息。请您查看您的电子设备或询问您的语音助手来获取今天的日期。]
microphone UNmuted
RECOGNIZED: SpeechRecognitionEventArgs(session_id=cf14ed11f79a444faa988fab687687c0, result=SpeechRecognitionResult(result_id=96877e5119b74d2e9f7b394072741b87, text="很抱歉,我无法生气。", reason=ResultReason.RecognizedSpeech))
sent to model reached

Question:

As you can see in the code, even though I mute the microphone during speech_synthesizer.speak_text_async(text).get(), some of the spoken words are still incorrectly picked up by speech_recognizer. Here the recognizer returned a garbled version of the opening of the TTS sentence:

RECOGNIZED: SpeechRecognitionEventArgs(session_id=cf14ed11f79a444faa988fab687687c0, result=SpeechRecognitionResult(result_id=96877e5119b74d2e9f7b394072741b87, text="很抱歉,我无法生气。",

I don't know why ASR can recognize the TTS voice even though I have muted the microphone. I also added a check of the active flag in the SpeechRecognizer.recognized callback, but that did not help either. Any help would be appreciated. Thanks in advance.
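
For reference, here is a minimal sketch of the kind of flag check mentioned above, extended with a hold-off window after playback. It is not the exact code from the project: the class name RecognitionGate, its methods, and the hold-off value are all hypothetical, and it rests on the assumption (not confirmed) that the stray result is delivered shortly after the flag has already been set back to True.

```python
import threading
import time


class RecognitionGate:
    """Hypothetical helper: rejects recognizer results that arrive
    during TTS playback or within a hold-off window after it ends."""

    def __init__(self, holdoff_seconds=1.5, clock=time.monotonic):
        self._holdoff = holdoff_seconds
        self._clock = clock              # injectable so the gate can be tested
        self._lock = threading.Lock()    # callbacks may run on another thread
        self._speaking = False
        self._resume_at = 0.0            # earliest time results are accepted again

    def tts_started(self):
        with self._lock:
            self._speaking = True

    def tts_finished(self):
        with self._lock:
            self._speaking = False
            self._resume_at = self._clock() + self._holdoff

    def accepts(self):
        """Return True if a recognition result arriving now should be kept."""
        with self._lock:
            return not self._speaking and self._clock() >= self._resume_at
```

The idea would be to call gate.tts_started() before speak_text_async(text).get(), gate.tts_finished() right after it, and to return early from the recognized callback whenever gate.accepts() is False. Whether a time-based hold-off is enough here is exactly what I am unsure about.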
