Author: vP

Day 35 - Speech Recognition with Python

Hello Friends! Welcome back to the #PythonForDevOps series! On Day 35, we're going to look into Speech Recognition with Python. If you've ever dreamt of making your applications listen and respond to spoken words, this is the day for you.


Why Speech Recognition?

Speech recognition opens up a whole new realm of possibilities for your projects. Imagine building voice-controlled interfaces, automating tasks with voice commands, or transcribing spoken words into text. Python, with its versatility, makes diving into speech recognition surprisingly straightforward.


Setting Up the Environment

Before we start, make sure you have the necessary libraries installed. We'll be using the SpeechRecognition library, which conveniently wraps various well-known speech recognition engines.

pip install SpeechRecognition

To capture audio from a microphone, SpeechRecognition also needs PyAudio, which in turn depends on the PortAudio system library. On Ubuntu, for instance, you can install both with:

sudo apt-get install portaudio19-dev
pip install pyaudio
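
Before moving on, it can help to confirm that the packages installed above can actually be imported. Here is a minimal sketch using only the standard library. The helper name missing_modules is hypothetical, and the import names in the list (speech_recognition for SpeechRecognition, pyaudio for PyAudio) are assumptions about how those packages are imported.

```python
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in module_names if find_spec(name) is None]


# Import names assumed for the packages installed above.
required = ["speech_recognition", "pyaudio"]
missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are available.")
```

If anything shows up as missing, re-run the matching install command before trying the examples.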

Basic Speech Recognition

Let's kick things off with a simple example. We'll create a Python script that listens to your microphone and prints out what it hears.

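import speech_recognition as sr


def recognize_speech():
    recognizer = sr.Recognizer()
    # Record a single phrase from the default microphone.
    with sr.Microphone() as source:
        print("Say something:")
        audio = recognizer.listen(source)
    try:
        # Send the recording to Google's Web Speech API for transcription.
        text = recognizer.recognize_google(audio)
        print(f"You said: {text}")
    except sr.UnknownValueError:
        # The engine could not make sense of the audio.
        print("Sorry, could not understand audio.")
    except sr.RequestError as e:
        # The service was unreachable or rejected the request.
        print(f"Could not request results; {e}")


if __name__ == "__main__":
    recognize_speech()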

Run this script, speak into your microphone, and witness your words magically transformed into text. It's like having a conversation with your computer!
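
One practical wrinkle: in a noisy room, or when nobody speaks, listen as called above can keep waiting. Below is a sketch of one way to bound the wait, assuming the adjust_for_ambient_noise method and the timeout and phrase_time_limit arguments of listen from the SpeechRecognition library. The helper name capture_phrase and the default values are illustrative choices, not recommendations from the library.

```python
def capture_phrase(recognizer, source, calibrate_seconds=1, timeout=5, phrase_time_limit=10):
    """Calibrate for background noise, then record a single phrase.

    `recognizer` is expected to be an sr.Recognizer and `source` an open
    sr.Microphone. Both are passed in, so this helper needs no hardware itself.
    """
    # Sample the background noise so the energy threshold suits the room.
    recognizer.adjust_for_ambient_noise(source, duration=calibrate_seconds)
    # timeout: maximum seconds to wait for speech to start.
    # phrase_time_limit: maximum seconds to record once speech has started.
    return recognizer.listen(source, timeout=timeout, phrase_time_limit=phrase_time_limit)
```

Inside the with sr.Microphone() as source: block you would call audio = capture_phrase(recognizer, source) in place of recognizer.listen(source). If no speech starts within the timeout, listen raises sr.WaitTimeoutError, so the caller should be ready to catch it.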


Customizing Recognition

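The example above calls recognize_google, which sends your audio to Google's Web Speech API and therefore needs an internet connection. SpeechRecognition supports several other engines as well. Let's modify our script to use the CMU Sphinx engine, which works offline. Note that recognize_sphinx relies on the PocketSphinx package, so install it first with pip install pocketsphinx.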

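import speech_recognition as sr


def recognize_speech():
    recognizer = sr.Recognizer()
    # Record a single phrase from the default microphone.
    with sr.Microphone() as source:
        print("Say something:")
        audio = recognizer.listen(source)
    try:
        # Transcribe locally with CMU Sphinx (requires the pocketsphinx package).
        text = recognizer.recognize_sphinx(audio)
        print(f"You said: {text}")
    except sr.UnknownValueError:
        # The engine could not make sense of the audio.
        print("Sorry, could not understand audio.")
    except sr.RequestError as e:
        # Sphinx is missing or its installation is broken.
        print(f"Sphinx error; {e}")


if __name__ == "__main__":
    recognize_speech()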

This modification allows you to perform speech recognition without an internet connection, offering more flexibility in various scenarios.
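
Since the engines share the same calling pattern (recognize_google, recognize_sphinx, and so on), you can also try them in order and fall back when one fails. The following is a rough sketch under a few assumptions: the helper name transcribe is hypothetical, it looks up methods by the recognize_ prefix, and it catches Exception broadly only to keep the sketch free of library imports. In a real script you would catch sr.UnknownValueError and sr.RequestError instead.

```python
def transcribe(recognizer, audio, engines=("sphinx", "google")):
    """Try each engine in order; return (engine_name, text) for the first success."""
    errors = {}
    for engine in engines:
        # Look up e.g. recognizer.recognize_sphinx or recognizer.recognize_google.
        method = getattr(recognizer, f"recognize_{engine}", None)
        if method is None:
            errors[engine] = "no such recognizer method"
            continue
        try:
            return engine, method(audio)
        except Exception as exc:  # broad on purpose, see the note above
            errors[engine] = repr(exc)
    raise RuntimeError(f"All engines failed: {errors}")
```

With the default order, the offline engine is tried first and the online one is used only if it fails, which keeps audio on your machine whenever possible.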


Advanced Speech Recognition

Beyond simple recognition, you can explore more advanced features like keyword spotting and language customization. For instance, let's create a script that detects specific keywords.

import speech_recognition as sr


def detect_keyword(keyword):
    recognizer = sr.Recognizer()
    # Record a single phrase from the default microphone.
    with sr.Microphone() as source:
        print("Say something:")
        audio = recognizer.listen(source)
    try:
        text = recognizer.recognize_google(audio)
        # Case-insensitive substring check against the transcript.
        if keyword.lower() in text.lower():
            print(f"Detected keyword: {keyword}")
        else:
            print("Keyword not detected.")
    except sr.UnknownValueError:
        # The engine could not make sense of the audio.
        print("Sorry, could not understand audio.")
    except sr.RequestError as e:
        # The service was unreachable or rejected the request.
        print(f"Could not request results; {e}")


if __name__ == "__main__":
    detect_keyword("Python")

This script listens for the word "Python" and notifies you when it's detected. You can tailor this to recognize any specific term relevant to your application.
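
One thing to watch: a plain substring check also fires on longer words, so "Python" would match inside "pythonic". If that matters for your use case, a whole-word match is a small change. Here is a minimal sketch using the standard re module. The helper name contains_keyword is hypothetical, and it works on the transcript text only, so it runs without a microphone.

```python
import re


def contains_keyword(text, keyword):
    """Return True if `keyword` appears in `text` as a whole word, ignoring case."""
    # re.escape keeps special characters in the keyword from being read as regex syntax.
    # \b matches only at word boundaries, so it suits keywords made of letters and digits.
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


print(contains_keyword("I love python scripts", "Python"))  # True
print(contains_keyword("a pythonic approach", "Python"))    # False
```

In the script above, you could swap the if keyword.lower() in text.lower(): line for if contains_keyword(text, keyword): to get the stricter behavior.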


Speech recognition in Python isn't just a futuristic concept; it's something you can integrate into your projects today. Whether you're building voice-controlled assistants or automating tasks with spoken commands, the possibilities are vast.


Take some time to experiment with the examples provided, and don't hesitate to explore the SpeechRecognition documentation for more options. As we move forward in our PythonForDevOps journey, remember that understanding and implementing speech recognition can add a unique and interactive dimension to your applications.


That concludes Day 35 of our series.


Thank you for reading!


*** Explore | Share | Grow ***

