Voice Command Calculator in Python using speech recognition and PyAudio
Here we are going to build our own voice command calculator in Python. So what is a voice command calculator? The name answers the question: a calculator applies an operator to two operands, but here we will not take input from the keyboard. We will take input from the user’s voice. For example,
9 + 8 = 17
We can make a calculator using a Python program easily. Just take inputs from the user and print the result.
But here we need to work with speech recognition.
Python Voice Command Calculator
Our goal is like this:
If a user says “nine plus eight” the output will be like this:
9 + 8
17
If a user says “nine divided three” the output will be:
9 divided 3
3.0
Again, if the user says “eight multiplied by seven” the output will be:
8 x 7
56
And so on.
Steps to follow to build a voice command calculator in Python:
Here is the logic:
- At first, we will set our microphone device.
- Accept voice from the user with the mic.
- Remove noise and distortion from the speech.
- Convert the speech or voice to text.
- Now store the text as a string in a variable.
- Print the string if you wish. (Not necessary, but it helps you check whether the text is correct.)
- Split the string into three parts: first operand, operator, and second operand.
- Now convert the operands to integers.
- Finally, do the calculation in your program as you got all the things you need.
Let’s implement it in Python:
Requirements to build speech/voice calculator:
We need the following:
- SpeechRecognition
- PyAudio
Set up things to start our program
You can install those with pip:
pip install SpeechRecognition
pip install pyaudio
If you are using a Mac, you will need to install both portaudio and pyaudio.
brew install portaudio
pip install pyaudio
Linux users can simply download it using:
$ sudo apt-get install python-pyaudio python3-pyaudio
One more thing you need to know:
- Your mic device index.
To learn how to find mic device index follow: Find all the microphone names and device index in Python using PyAudio
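As a rough sketch of looking up the device index: SpeechRecognition exposes Microphone.list_microphone_names(), which returns the device names in index order. The helper pick_device_index and the sample device names below are made up for illustration, so adjust the keyword to whatever your own microphone is called.

```python
def pick_device_index(names, keyword):
    """Return the index of the first device whose name contains keyword, or None."""
    keyword = keyword.lower()
    for index, name in enumerate(names):
        # Some entries can be empty, so guard before matching.
        if name and keyword in name.lower():
            return index
    return None


def list_microphones():
    """Ask SpeechRecognition for the device names (needs SpeechRecognition and PyAudio)."""
    # Imported here so pick_device_index can be tried without the packages installed.
    import speech_recognition as sr
    return sr.Microphone.list_microphone_names()


# Made-up device list standing in for the result of list_microphones().
sample_names = ["Built-in Output", "USB Audio Device", "HDA Intel PCH: Analog"]
print(pick_device_index(sample_names, "usb"))
```

On a real machine you would call pick_device_index(list_microphones(), "usb") and pass the result as device_index.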
Now you are ready to jump into the coding part.
To check that you are all set and your packages are installed successfully, try the code below:
import speech_recognition as sr
print("Your speech_recognition version is: "+sr.__version__)
Output:
Your speech_recognition version is: 3.8.1
If this runs with no errors then go to the next part.
In my previous tutorial, I explained how to get voice input: Get voice input with microphone in Python using PyAudio and SpeechRecognition
So in this tutorial, I will not explain those things again. I will only focus on our voice calculator. If you need to know the full explanation just follow my previous tutorial. Here I will provide the code.
Python code to get the voice command from the user:
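import speech_recognition as s_r

print("Your speech_recognition version is: "+s_r.__version__)
r = s_r.Recognizer()
# Replace 1 with the device index of your own microphone
my_mic_device = s_r.Microphone(device_index=1)
with my_mic_device as source:
    print("Say what you want to calculate, example: 3 plus 3")
    r.adjust_for_ambient_noise(source)  # calibrate to the background noise
    audio = r.listen(source)            # record until the speaker pauses
# Send the audio to Google's speech recognition and get the text back
my_string=r.recognize_google(audio)
print(my_string)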
Run the program and it will print whatever you say.
The fun part: if you say “nine plus ten”, it will return the string “9 + 10”.
Note that:
r.adjust_for_ambient_noise(source)
The above line calibrates the recognizer to the ambient noise level, which reduces the effect of background noise.
r.recognize_google(audio) – This returns the text converted from the voice as a string.
You will need an active internet connection to run this program.
(I am using Google speech recognition because right now it is free and we can send unlimited requests.)
But if you are going to create a project or do something bigger with it, you should use Google Cloud Speech. Google speech recognition is currently free of cost, but Google does not guarantee that the service will keep running.
If everything is fine till now you can go for the next step.
Split the string and make operation:
Here we face the main difficulty. We have a string, for example “103 - 15”. Because it is a string, we can’t simply do arithmetic on it. We need to split the string into three parts, which gives us three separate strings:
“103”, “-”, “15”
We need to convert “103” and “15” to int. Those are our operands. And the “-” is our operator.
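The split-and-convert step just described can be sketched like this. The name split_expression is a hypothetical helper, not part of the tutorial's code; it also checks that there are exactly three parts, so a badly recognized sentence gives a readable message instead of a TypeError later on.

```python
def split_expression(text):
    """Split 'number operator number' into (int, str, int)."""
    parts = text.split()
    if len(parts) != 3:
        raise ValueError(
            f"expected 'number operator number', got {len(parts)} part(s): {text!r}"
        )
    left, symbol, right = parts
    # int() raises ValueError itself if an operand is not a whole number.
    return int(left), symbol, int(right)


print(split_expression("103 - 15"))
```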
Use the operator module. This will make our task easy.
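import operator

def get_operator_fn(op):
    # Map the recognized operator text to the matching function
    return {
        '+' : operator.add,
        '-' : operator.sub,
        'x' : operator.mul,
        'divided' :operator.__truediv__,
        'Mod' : operator.mod,
        'mod' : operator.mod,
        '^' : operator.xor,  # note: bitwise XOR, not exponentiation
        }[op]

def eval_binary_expr(op1, oper, op2):
    # Convert both operands to integers, then apply the operator
    op1,op2 = int(op1), int(op2)
    return get_operator_fn(oper)(op1, op2)

# Split the recognized text into three parts and calculate
print(eval_binary_expr(*(my_string.split())))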
The signs we wrote in our program:
+, -, x, divided, etc. are the operators.
For each operator, we have assigned a particular function. As you can see, “divided” => operator.__truediv__,
and Mod or mod (speech-to-text conversion sometimes capitalizes the first letter) => operator.mod
You can set your own commands too if you wish.
return get_operator_fn(oper)(op1, op2)
This will calculate your result.
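As an illustration of setting your own commands, here is a small variant of the lookup table. The extra words "plus", "minus", "times" and "power" are assumptions about what you might want to support, not something Google's recognizer is known to return, and calculate is a hypothetical name. Lower-casing the operator word means one entry covers both “Mod” and “mod”.

```python
import operator

OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "x": operator.mul,
    "divided": operator.truediv,
    "mod": operator.mod,
    # Hypothetical extra commands, added for illustration only.
    "plus": operator.add,
    "minus": operator.sub,
    "times": operator.mul,
    "power": operator.pow,
}


def calculate(left, word, right):
    """Look up the operator word (case-insensitively) and apply it to two integers."""
    try:
        fn = OPERATORS[word.lower()]
    except KeyError:
        raise ValueError(f"unsupported operator: {word!r}") from None
    return fn(int(left), int(right))


print(calculate("2", "power", "5"))  # 32
print(calculate("17", "Mod", "9"))   # 8
```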
So here is the full code of this voice command calculator in Python:
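import operator
import speech_recognition as s_r

print("Your speech_recognition version is: "+s_r.__version__)
r = s_r.Recognizer()
# Replace 1 with the device index of your own microphone
my_mic_device = s_r.Microphone(device_index=1)
with my_mic_device as source:
    print("Say what you want to calculate, example: 3 plus 3")
    r.adjust_for_ambient_noise(source)  # calibrate to the background noise
    audio = r.listen(source)            # record until the speaker pauses
# Send the audio to Google's speech recognition and get the text back
my_string=r.recognize_google(audio)
print(my_string)

def get_operator_fn(op):
    # Map the recognized operator text to the matching function
    return {
        '+' : operator.add,
        '-' : operator.sub,
        'x' : operator.mul,
        'divided' :operator.__truediv__,
        'Mod' : operator.mod,
        'mod' : operator.mod,
        '^' : operator.xor,  # note: bitwise XOR, not exponentiation
        }[op]

def eval_binary_expr(op1, oper, op2):
    # Convert both operands to integers, then apply the operator
    op1,op2 = int(op1), int(op2)
    return get_operator_fn(oper)(op1, op2)

# Split the recognized text into three parts and calculate
print(eval_binary_expr(*(my_string.split())))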
Output:
Your speech_recognition version is: 3.8.1
Say what you want to calculate, example: 3 plus 3
11 + 12
23
To multiply, simply say “number1 multiplied by number2”.
[Screenshot: voice command calculator in Python]
For example, say “16 multiplied by 10”.
“Multiplied by” will be automatically converted to “x” by Google’s speech recognition.
To get the remainder, just say “17 mod 9” and it will give you the result.
For division, just say “18 divided 7”.
Here you can see I have not used “divided by”, because Google’s speech recognition will not convert that to “/” and we are going to split our string into three parts. If we say “number1 divided by number2”, the string splits into four parts (“number1”, “divided”, “by”, “number2”), and four parts will give us an error because the function accepts only three parameters:
def eval_binary_expr(op1, oper, op2):
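One possible way around the three-part limit is to tidy up the recognized text before splitting it. This is only a sketch: it assumes the recognizer returns the words “divided by” or “multiplied by” literally, and normalize and PHRASES are hypothetical names that do not appear in the code above.

```python
# Spoken phrases mapped to the single-word operators the calculator already knows.
PHRASES = {
    "divided by": "divided",
    "multiplied by": "x",
}


def normalize(text):
    """Collapse two-word operator phrases so the text splits into three parts."""
    text = text.lower()
    for phrase, token in PHRASES.items():
        text = text.replace(phrase, token)
    return text


print(normalize("18 divided by 7").split())
```

The normalized string can then be passed to my_string.split() as before.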
If you get an error, check your converted string. I have used print(my_string) to check whether I got my desired string or not.
Please note that:
My audio input ( microphone ) device index is 1. You have to put your device index in your program.
To learn how to find the device index, see: Find all the microphone names and device index in Python using PyAudio
Great!
Very cool!
Thank you for posting
Good job guys
Does the same code work for raspberry pi as well??
Yes, it will also work on Raspberry Pi. The only thing you need to do is set up your microphone. Open a terminal window and run lsusb. It will show you all the USB devices connected to your machine. That’s it. You can also raise the volume of your mic device; to do this, run alsamixer. Hope these tips will help you perform this voice operation on your Raspberry Pi.
Thank u so much…
What if I want to create my calculator to evaluate an expression with more than two operands and operators… Eg
4*5+3
I was thinking the same as you before posting the content and I have got a good solution.
You can take an optional parameter or argument.
But the best way is to evaluate the string as a mathematical expression.
And the easiest way to do this in Python is:
eval('4*5+3')
I hope it will help you out.
If you have further query please ask.
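A word of caution on eval: it runs any Python code it is given, and it does not know that the recognizer writes multiplication as “x”. The sketch below is one possible alternative, not the tutorial's method. It parses the text with the standard ast module and only allows numbers and basic arithmetic. The names evaluate and _walk are hypothetical, and it assumes Python 3.8 or later.

```python
import ast
import operator

# Only these arithmetic operators are allowed.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}


def _walk(node):
    """Evaluate a parsed expression made only of numbers and allowed operators."""
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
        return _BINARY[type(node.op)](_walk(node.left), _walk(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_walk(node.operand)
    raise ValueError("unsupported expression")


def evaluate(expression):
    """Evaluate text such as '4 x 5 + 3' with normal operator precedence."""
    # The recognizer writes multiplication as 'x'; Python expects '*'.
    expression = expression.replace("x", "*")
    tree = ast.parse(expression, mode="eval")
    return _walk(tree.body)


print(evaluate("4 x 5 + 3"))  # 23
print(evaluate("5 + 5 + 5"))  # 15
```

Anything other than plain arithmetic, such as a function call, raises ValueError instead of being executed.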
Bro I need GUI for this app.
Click on the contact button and type your requirements and send it. We will get back to you.
My voice calculator is not calculating any operation can u just me out!!
Yes sure. Check your email. Just send the screenshot of your problem in reply to that email.
The idea is very Good….. Could you Please share the demo…
What kind of demo do you want? GUI or Just a video demo of this voice command calculator
Video Demo for voice calculator …….
The code is provided with full tutorial thus you can test it on your machine. I hope it will help you out.
My voice calculator is not calculating any operation , plz help me
Are you getting any error? If you are not getting any error that means the server is not responding.
Thanks, your tutorial are awesome, even I. Looking for speech to text recognition, like how to setup all in python. Even, I tried with many link but these are less than less. So could you please help with your own tutorial and also email address, so can send my work like what I have done till time.
can I run it in pycharm?
TypeError: eval_binary_expr() missing 1 required positional argument: ‘op2’
this error ocur
Kindly check if you have defined ‘op2’ properly or not.
And yes you can obviously run it on any IDE. So Pycharm is also fine to run this program.
Please explain above code briefly. You have use two different user defined function. That thing I’m not getting properly.
Thanks for the tutorial!
My recent project is totally based on it
Can you please send the demo where voice can be heard?
Please do help me with it!
I think, in line no 6, you should write with s_r.Microphone() as source: instead of with my_mic_device as source: .
BY this, it is working fine in my system.
Ha-ha it works only if you say ‘5 + 5’ or ‘5 x 5’, but if you say ‘5 + 5 + 5’ it prints error – “TypeError: eval_binary_expr() takes 3 positional arguments but 5 were given” 🙂
yes it works for just 3 parameters I even tried eval() but that didn’t work
my code is throwing error as “my_string not defined” . Please help
write this on code:
my_string = r.recognize_google(audio)
or see the code above carefully
It can be impliment on rasbberry pi please reply
yes it can… all you need do is make sure you set your Microphone input
What to do if I want to perform this all without Internet connection ?