Speech recognition

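Goal:

Receive sound through a microphone and play sound through a speaker, so that a hardware robot can hold a normal conversation with a person.

While the speaker is playing sound, disable the microphone (make the robot deaf).

Installing speech_recognition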


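python3 -m venv venv  # create a Python virtual environment
source venv/bin/activate  # activate the virtual environment
pip install SpeechRecognition  # install the library used to work with the microphone
pip install PyAudio  # needed by sr.Microphone and by the pyaudio playback code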
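
After installing, it can help to confirm that both packages are visible inside the virtual environment before touching any hardware. The following is a minimal sketch using only the standard library; the helper name `check_install` is made up for this illustration.

```python
import importlib.util


def check_install(modules=("speech_recognition", "pyaudio")):
    """Map each module name to True if it can be found on the import path."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


# Print a short report; nothing is imported and no microphone is opened.
for name, found in check_install().items():
    print(f"{name}: {'ok' if found else 'missing'}")
```

If `pyaudio` is reported as missing, `sr.Microphone()` will not work even though `speech_recognition` itself imports fine.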

Key function: listen()


import speech_recognition as sr
import pyaudio
 
recognizer = sr.Recognizer()
 
while True:
    # Capture sound input from the microphone
    try:
        with sr.Microphone() as source:
            print("Please speak...")
            audio = recognizer.listen(source, timeout=10)
    except sr.WaitTimeoutError:
        print("No speech detected, skipping...")
        continue

    try:
        text = recognizer.recognize_google(audio)
        print("You said: " + text)
    except sr.UnknownValueError:
        print("Could not recognize the speech")
    except sr.RequestError as e:
        print("Request failed; {0}".format(e))

    # Play the recorded audio back with PyAudio
    p = pyaudio.PyAudio()
    stream = p.open(format=p.get_format_from_width(audio.sample_width),
                    channels=1,  # mono
                    rate=audio.sample_rate,
                    output=True)
    stream.start_stream()
    stream.write(audio.frame_data)
    stream.stop_stream()
    stream.close()
    p.terminate()

Parameters explained:

def listen(self, source, timeout=None, phrase_time_limit=None, snowboy_configuration=None):

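timeout: how many seconds to wait for speech to start. If nothing is heard within this time, listen() raises sr.WaitTimeoutError. With the default of None it waits indefinitely.

phrase_time_limit: the maximum length of a phrase in seconds. If the sound continues past this limit, recording is cut off and the audio captured so far is returned.

snowboy_configuration: keyword (hotword) wake-up using Snowboy.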

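Key function: listen_in_background()

The difference between listen_in_background() and listen() is that listen_in_background() starts a separate thread that keeps listening in the background and hands each captured phrase to a callback.

It returns a function; calling that returned function stops the thread.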
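
The start/stop behaviour can be sketched as follows. This is an illustrative example, not the author's code: the names `audio_queue`, `on_phrase` and `run_demo` are made up here, and the `speech_recognition` import is placed inside `run_demo` so the callback can be exercised without a microphone. The callback only queues the audio; recognition happens on the main thread.

```python
import queue
import time

audio_queue = queue.Queue()


def on_phrase(recognizer, audio):
    """Callback run on the background thread once per captured phrase."""
    audio_queue.put(audio)  # keep the callback cheap; recognize elsewhere


def run_demo(seconds=10):
    """Listen in the background for `seconds`, then stop the thread."""
    import speech_recognition as sr  # needs SpeechRecognition and PyAudio

    recognizer = sr.Recognizer()
    mic = sr.Microphone()
    with mic as source:
        recognizer.adjust_for_ambient_noise(source)  # calibrate the threshold

    # The return value is the function that stops the background thread.
    stop_listening = recognizer.listen_in_background(mic, on_phrase, phrase_time_limit=5)
    deadline = time.monotonic() + seconds
    try:
        while time.monotonic() < deadline:
            try:
                audio = audio_queue.get(timeout=0.5)
            except queue.Empty:
                continue
            try:
                print("You said:", recognizer.recognize_google(audio))
            except sr.UnknownValueError:
                print("Could not recognize the speech")
            except sr.RequestError as e:
                print("Request failed:", e)
    finally:
        stop_listening(wait_for_stop=False)
```

Calling `run_demo()` on a machine with a working microphone prints recognized phrases for about ten seconds and then shuts the listener down.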

Key class: sr.Recognizer()

This one matters most: the attributes set in its constructor let us fully control how the microphone behaves.


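    def __init__(self):
        """
        Creates a new ``Recognizer`` instance, which represents a collection of speech recognition functionality.
        """
        self.energy_threshold = 1000  # minimum audio energy to consider for recording (microphone sensitivity; the documentation cites typical values of 50 to 4000, and lower means more sensitive)
        self.dynamic_energy_threshold = True  # adjust the energy threshold automatically to the ambient noise (dynamic microphone sensitivity)
        self.dynamic_energy_adjustment_damping = 0.15  # how slowly the dynamic threshold adapts (dynamic microphone sensitivity)
        self.dynamic_energy_ratio = 1.5  # how much louder than ambient noise speech must be (dynamic microphone sensitivity)
        self.pause_threshold = 0.2  # seconds of non-speaking audio before a phrase is considered complete (0.2 s after the sound stops, the phrase is treated as finished; with listen_in_background this is when the callback is invoked to run our own logic)
        self.operation_timeout = None  # seconds after an internal operation (e.g., an API request) starts before it times out, or ``None`` for no timeout (for example, how long a recognition request may wait on the network)
        self.phrase_threshold = 0.3  # minimum seconds of speaking audio before we consider the speaking audio a phrase - values below this are ignored (for filtering out clicks and pops) (anything shorter than 0.3 s is treated as not heard)
        self.non_speaking_duration = 0.1  # seconds of non-speaking audio to keep on both sides of the recording (0.1 s of silence is kept before and after each recording)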
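
To make the interplay of these thresholds concrete, here is a deliberately simplified model of energy-based phrase detection. It is not the library's implementation: it works on a made-up list of per-chunk energy values, the function names `to_chunks` and `segment_phrase` are invented for this sketch, and it raises the built-in `TimeoutError` where the library would raise `sr.WaitTimeoutError`. It also models the `timeout` and `phrase_time_limit` arguments of listen().

```python
import math


def to_chunks(seconds, chunk_seconds):
    """Convert a duration in seconds to a whole number of chunks, rounded up."""
    return math.ceil(round(seconds / chunk_seconds, 6))


def segment_phrase(energies, chunk_seconds=0.1, energy_threshold=1000,
                   pause_threshold=0.2, phrase_threshold=0.3,
                   timeout=None, phrase_time_limit=None):
    """Return (start, end) chunk indices of the first accepted phrase, or None."""
    pause_chunks = to_chunks(pause_threshold, chunk_seconds)
    phrase_chunks = to_chunks(phrase_threshold, chunk_seconds)
    i, n = 0, len(energies)
    while i < n:
        # Phase 1: skip quiet chunks until one exceeds the energy threshold.
        while i < n and energies[i] <= energy_threshold:
            i += 1
            if timeout is not None and i > to_chunks(timeout, chunk_seconds):
                raise TimeoutError("no speech started within the timeout")
        if i == n:
            return None
        start, quiet = i, 0
        # Phase 2: record until the trailing pause is long enough
        # or the phrase time limit is reached.
        while i < n and quiet <= pause_chunks:
            if (phrase_time_limit is not None
                    and i - start >= to_chunks(phrase_time_limit, chunk_seconds)):
                break
            quiet = quiet + 1 if energies[i] <= energy_threshold else 0
            i += 1
        # Accept only if the loud part lasted at least phrase_threshold;
        # otherwise treat it as a click and go back to waiting.
        if i - start - quiet >= phrase_chunks:
            return start, i
    return None


print(segment_phrase([0, 0, 2000, 2000, 2000, 2000, 0, 0, 0, 0]))
print(segment_phrase([0, 2000, 0, 0, 0, 0]))
```

With the default arguments, the first call finds a phrase starting at chunk 2, while the second input contains only a 0.1 s spike, which is shorter than `phrase_threshold` and is therefore ignored.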

Solving the original problem: while the speaker is playing sound, disable the microphone (make the robot deaf)


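def setup_mic(self):
    recorder = sr.Recognizer()
    # listen_in_background returns a function that stops the background thread; keep it
    self.continuous_listening = recorder.listen_in_background(
        self.source,
        self.listen_callback if self.listen_callback is not None else self.record_callback
    )

def stop_mic(self):
    # Call the stopper returned by listen_in_background to shut the microphone thread down
    self.continuous_listening()
    self.logger.info("Mic stopped listening")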
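
One way to tie the two methods above to playback is a small wrapper that stops the background listener before the robot speaks and restarts it afterwards. This is only a sketch under assumptions: the class name `HalfDuplexMic` and its `muted()` context manager are invented for this example, and the recognizer is any object with a `listen_in_background(source, callback)` method that returns a stopper function, as `sr.Recognizer` does.

```python
from contextlib import contextmanager


class HalfDuplexMic:
    """Keeps the microphone off while the speaker is playing (hypothetical helper)."""

    def __init__(self, recognizer, source, callback):
        self.recognizer = recognizer
        self.source = source
        self.callback = callback
        self._stop = None  # stopper returned by listen_in_background

    @property
    def listening(self):
        return self._stop is not None

    def start(self):
        """Start background listening if it is not already running."""
        if self._stop is None:
            self._stop = self.recognizer.listen_in_background(self.source, self.callback)

    def stop(self):
        """Stop background listening; safe to call when already stopped."""
        if self._stop is not None:
            self._stop(wait_for_stop=True)  # wait until the thread has ended
            self._stop = None

    @contextmanager
    def muted(self):
        """Turn the microphone off for the duration of a `with` block."""
        was_listening = self.listening
        self.stop()
        try:
            yield
        finally:
            if was_listening:
                self.start()
```

Usage would look like `with mic.muted(): play_reply(audio)`, where `play_reply` stands for whatever playback routine the robot uses, so the robot never hears its own voice.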

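Reflection and summary:

In Python packages, some key settings are not exposed as parameters of listen() or listen_in_background(); they are set inside the source code instead. We can go into the source and edit them there, or set the attributes ourselves after creating the instance.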
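
The second option, changing the attributes after construction, could look like the sketch below. The helper `apply_settings` is made up for this illustration; it simply calls `setattr` and rejects names the object does not already have, which catches typos in attribute names.

```python
def apply_settings(recognizer, **settings):
    """Set tuning attributes on an existing recognizer, rejecting unknown names."""
    for name, value in settings.items():
        if not hasattr(recognizer, name):
            raise AttributeError(f"recognizer has no setting named {name!r}")
        setattr(recognizer, name, value)
    return recognizer
```

With the library installed, `apply_settings(sr.Recognizer(), energy_threshold=1000, pause_threshold=0.2, non_speaking_duration=0.1)` would apply the values from the constructor listing above without modifying the installed package.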


Original article: https://blog.csdn.net/sunriseYJP/article/details/133784875