Get voice input with microphone in Python using PyAudio and SpeechRecognition
In this Python tutorial, we will show you how to take voice input from a microphone in Python using PyAudio and SpeechRecognition.
To do this task we require the following things installed on our machine.
- Python
- SpeechRecognition Package
- PyAudio
That’s it.
To learn how to install these packages, see: Install essential packages to work with microphone in Python
One more thing to keep in mind: since we are going to work with a microphone, you need to know the device ID (device index) of your audio input device.
You have to tell your Python program which particular microphone to take the speech or voice input from.
If you don’t know how to find the device ID yet, please read my previous tutorial:
Find all the microphone names and device index in Python using PyAudio
The above tutorial covers everything you need to set up before you start working through this one.
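If you would rather look up the index from within the same script, the list returned by SpeechRecognition's Microphone.list_microphone_names() can be searched by name, since the position of each name in that list is the value to pass as device_index. The sketch below is only an illustration: find_device_index is a hypothetical helper and "usb" is a placeholder keyword, not something taken from this tutorial.

```python
def find_device_index(names, keyword):
    """Return the index of the first device whose name contains keyword.

    Hypothetical helper: `names` is expected to be the list returned by
    s_r.Microphone.list_microphone_names(), where each position is the
    device index of that microphone. Returns None if nothing matches.
    """
    keyword = keyword.lower()
    for index, name in enumerate(names):
        # Skip entries without a name (a device may not report one).
        if name and keyword in name.lower():
            return index
    return None


# Typical use (needs PyAudio and a connected microphone):
#   import speech_recognition as s_r
#   names = s_r.Microphone.list_microphone_names()
#   my_mic = s_r.Microphone(device_index=find_device_index(names, "usb"))
```

If nothing matches, the helper returns None, and passing device_index=None makes the Microphone class fall back to the default input device.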
Now we assume that you are all set.
Take voice input from the user in Python using PyAudio – speech_recognizer
What we are going to do, in simple steps:
- Take input from the mic
- Convert the voice or speech to text
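- Store the text in a variable, or use it directly as user input
There are several APIs available online for speech recognition (voice to text); in this tutorial we use Google speech recognition through the SpeechRecognition package. First, import the package and print its version to confirm that it is installed: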
import speech_recognition as s_r
print(s_r.__version__)
Output:
3.8.1
It will print the current version of your speech recognition package.
If everything is fine then go to the next part.
Set microphone to accept sound
my_mic = s_r.Microphone()
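Here you have to pass the parameter device_index set to the index of your microphone, for example device_index=1. If you leave it out, the system’s default microphone is used.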
To know your device index follow the tutorial: Find all the microphone names and device index in Python using PyAudio
To recognize input from the microphone you have to use the Recognizer class. Let’s create an instance.
r = s_r.Recognizer()
So our program will be like this till now:
import speech_recognition as s_r
print(s_r.__version__)  # just to print the version, not required
r = s_r.Recognizer()
my_mic = s_r.Microphone(device_index=1)  # my device index is 1; put your own device index here
Don’t run this program yet; there is still more to do.
Now we have to capture audio from the microphone. To do that we can use the code below:
with my_mic as source:
    print("Say now!!!!")
    audio = r.listen(source)
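If you want to check what the microphone actually picked up, the captured audio can be written to a WAV file and played back. This is a minimal sketch: save_recording and the file name are made up for illustration, while get_wav_data() is the AudioData method in SpeechRecognition that returns the recording as WAV bytes.

```python
def save_recording(audio, path="captured.wav"):
    """Write captured audio to a WAV file and return the number of bytes written.

    `audio` is the object returned by r.listen(source); the function name
    and default path are illustrative, not part of the tutorial's code.
    """
    wav_bytes = audio.get_wav_data()  # whole recording as WAV-formatted bytes
    with open(path, "wb") as wav_file:
        wav_file.write(wav_bytes)
    return len(wav_bytes)


# Typical use, right after audio = r.listen(source):
#   save_recording(audio, "captured.wav")
```

Listening to the saved file is a quick way to tell a muted or wrong microphone apart from a recognition problem.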
Now for the final step: converting the sound taken from the microphone into text.
Convert the sound or speech into text in Python
To convert using Google speech recognition we can use the following line:
r.recognize_google(audio)
It converts your voice to text and returns the result as a string.
You can simply print it using the below line:
print(r.recognize_google(audio))
Now the full program will look like this:
import speech_recognition as s_r
print(s_r.__version__)  # just to print the version, not required
r = s_r.Recognizer()
my_mic = s_r.Microphone(device_index=1)  # my device index is 1; put your own device index here
with my_mic as source:
    print("Say now!!!!")
    audio = r.listen(source)  # take voice input from the microphone
print(r.recognize_google(audio))  # print the voice input as text
If you run this, it should print what you said.
If you don’t get any output after waiting a few moments, check your internet connection; this program requires an internet connection.
If your internet connection is fine but you still get no output, your microphone is probably picking up background noise, so the program keeps listening and never detects the end of your speech.
Press Ctrl+C and, if needed, hit Enter to stop the current execution.
Now you have to reduce noise from your input.
How to do that?
r.adjust_for_ambient_noise(source)
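Call it inside the with block, before r.listen(source): it samples the background noise for a moment and adjusts the recognizer’s sensitivity accordingly.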
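Both adjust_for_ambient_noise() and listen() accept optional arguments that can help when the program seems to hang. The sketch below is one possible arrangement, not the only one: capture_phrase and its default values are assumptions for illustration. duration controls how long the ambient noise is sampled, timeout is how long listen() waits for speech to start, and phrase_time_limit caps the length of the recording.

```python
def capture_phrase(recognizer, source, calibrate_seconds=1.0,
                   timeout=5, phrase_time_limit=10):
    """Calibrate for background noise, then record one phrase.

    Hypothetical helper; the default values are guesses to tune for your
    room. `recognizer` is an s_r.Recognizer() and `source` is the opened
    microphone (the name bound by `with my_mic as source:`).
    """
    # Sample the room for calibrate_seconds to set the energy threshold.
    recognizer.adjust_for_ambient_noise(source, duration=calibrate_seconds)
    # Wait at most `timeout` seconds for speech to start, and stop recording
    # after `phrase_time_limit` seconds even if the room never goes quiet.
    # listen() raises s_r.WaitTimeoutError if no speech starts in time.
    return recognizer.listen(source, timeout=timeout,
                             phrase_time_limit=phrase_time_limit)


# Typical use:
#   with my_mic as source:
#       audio = capture_phrase(r, source)
```

With a time limit in place, a noisy room ends the recording after a fixed number of seconds instead of leaving the program waiting indefinitely.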
Now the final program looks like this, and it should work successfully:
import speech_recognition as s_r
print(s_r.__version__)  # just to print the version, not required
r = s_r.Recognizer()
my_mic = s_r.Microphone(device_index=1)  # my device index is 1; put your own device index here
with my_mic as source:
    print("Say now!!!!")
    r.adjust_for_ambient_noise(source)  # reduce noise
    audio = r.listen(source)  # take voice input from the microphone
print(r.recognize_google(audio))  # print the voice input as text
Output:
Will print whatever you say!!
You can store the string in any variable if you want. But remember that r.recognize_google(audio) returns a string, so be careful when working with other data types.
my_string = r.recognize_google(audio)
You can use this to store your speech in a variable.
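Note that recognize_google() does not always return a string: it raises UnknownValueError when the speech cannot be understood and RequestError when the service cannot be reached, for example without an internet connection. A minimal sketch of handling both follows; voice_to_text is a hypothetical wrapper, and returning None on failure is just one possible design.

```python
def voice_to_text(recognizer, audio, language="en-US"):
    """Return the recognized text, or None if recognition failed.

    Hypothetical wrapper around r.recognize_google(audio). The import sits
    inside the function only to keep this snippet self-contained.
    """
    import speech_recognition as s_r

    try:
        return recognizer.recognize_google(audio, language=language)
    except s_r.UnknownValueError:
        # The service could not make out any words (silence, noise, muted mic).
        print("Could not understand the audio")
    except s_r.RequestError as error:
        # The service could not be reached, e.g. no internet connection.
        print(f"Could not reach the recognition service: {error}")
    return None


# Typical use:
#   my_string = voice_to_text(r, audio)
#   if my_string is not None:
#       print(my_string)
```

Checking for None before using my_string keeps the rest of the program from crashing when nothing was recognized.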
Leave a comment if you need any further help or have a suggestion to make this tutorial better.
I’m getting this error. I followed the same tutorial.
OSError: [Errno -9998] Invalid number of channels
The problem is not with the code. Pyaudio is not working properly for channel issues. Kindly search the error code on the internet.
this code is just awesome it worked for me and i used it with wikipedia module for voice search
my code is working but it cannot recognize what am i saying
Same thing happening to me…what to do, have you got the solution buddy.
Traceback (most recent call last):
  File "C:/Users/Anonymous/Desktop/lr1.py", line 9, in
    print(r.recognize_google(audio)) #to print voice into text
  File "C:\Users\Anonymous\AppData\Local\Programs\Python\Python38\lib\site-packages\speech_recognition\__init__.py", line 858, in recognize_google
    if not isinstance(actual_result, dict) or len(actual_result.get("alternative", [])) == 0: raise UnknownValueError()
speech_recognition.UnknownValueError
my mic had a button that automatically mutes me, for me i got that error because that button was on mute
Could i make an MP3 out of my audio if so, how ?
How can I recognize audio from a google meet?
Like If a person says hello, speech recognition should return hello.
not from my mic, from the speakers mic.
You will need to import os, then call os.system(f'say {text_var}') (the say command is available on macOS).
How can I know my device index for the microphone ?
Kindly read this tutorial properly, as we have already inserted a tutorial link for finding devices and their index.
How to make addition loop that work continue addition from voice .
Traceback (most recent call last):
  File "D:\Languages\PYTHON\JARVIS\Example.py", line 5, in
    with my_mic as source:
  File "C:\Users\mauli\AppData\Local\Programs\Python\Python39\lib\site-packages\speech_recognition\__init__.py", line 138, in __enter__
    self.audio.open(
  File "C:\Users\mauli\AppData\Local\Programs\Python\Python39\lib\site-packages\pyaudio.py", line 750, in open
    stream = Stream(self, *args, **kwargs)
  File "C:\Users\mauli\AppData\Local\Programs\Python\Python39\lib\site-packages\pyaudio.py", line 441, in __init__
    self._stream = pa.open(**arguments)
OSError: [Errno -9999] Unanticipated host error
I’m getting some errors while following this tutorial.
errors>>
PS C:\Windows\system32> & "C:/Program Files/Python37/python.exe" c:/Windows/system32/jarvis.py
3.8.1
Traceback (most recent call last):
  File "C:\Program Files\Python37\lib\site-packages\speech_recognition\__init__.py", line 108, in get_pyaudio
    import pyaudio
ModuleNotFoundError: No module named 'pyaudio'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "c:/Windows/system32/jarvis.py", line 79, in
    allMethodsCallingFunction()
  File "c:/Windows/system32/jarvis.py", line 17, in allMethodsCallingFunction
    takeCommand()
  File "c:/Windows/system32/jarvis.py", line 70, in takeCommand
    my_mic = s_r.Microphone(device_index=1) #my device index is 1, you have to put your device index
  File "C:\Program Files\Python37\lib\site-packages\speech_recognition\__init__.py", line 79, in __init__
    self.pyaudio_module = self.get_pyaudio()
  File "C:\Program Files\Python37\lib\site-packages\speech_recognition\__init__.py", line 110, in get_pyaudio
    raise AttributeError("Could not find PyAudio; check installation")
AttributeError: Could not find PyAudio; check installation
Please help me with that.
Thankyou
(In Advance)
This code works very well, thanks for this code. Everything is done.
import speech_recognition as s_r
print(s_r.__version__) # just to print the version not required
r = s_r.Recognizer()
my_mic = s_r.Microphone(device_index=1) #my device index is 1, you have to put your device index
with my_mic as source:
    print("Say now!!!!")
    r.adjust_for_ambient_noise(source) #reduce noise
    audio = r.listen(source) #take voice input from the microphone
print(r.recognize_google(audio)) #to print voice into text
How can i use input source as Speaker instead of microphone.
Is it possible to do the same without internet ?
I’m getting this error
Output exceeds the size limit. Open the full output data in a text editor
---------------------------------------------------------------------------
OSError Traceback (most recent call last)
d:\charts\chat_voice.ipynb Cell 9 in ()
1 import speech_recognition as sr
----> 2 from speech_recognition.__main__ import r
3 def ask(text):
4 flag=True
File c:\Users\DELL\AppData\Local\Programs\Python\Python310\lib\site-packages\speech_recognition\__main__.py:8, in
6 try:
7 print("A moment of silence, please…")
----> 8 with m as source: r.adjust_for_ambient_noise(source)
9 print("Set minimum energy threshold to {}".format(r.energy_threshold))
10 while True:
File c:\Users\DELL\AppData\Local\Programs\Python\Python310\lib\site-packages\speech_recognition\__init__.py:138, in Microphone.__enter__(self)
135 self.audio = self.pyaudio_module.PyAudio()
136 try:
137 self.stream = Microphone.MicrophoneStream(
--> 138 self.audio.open(
139 input_device_index=self.device_index, channels=1,
140 format=self.format, rate=self.SAMPLE_RATE, frames_per_buffer=self.CHUNK,
141 input=True, # stream is an input stream
142 )
143 )
…
--> 445 self._stream = pa.open(**arguments)
447 self._input_latency = self._stream.inputLatency
448 self._output_latency = self._stream.outputLatency
OSError: [Errno -9999] Unanticipated host error