Voice Command Calculator in Python using speech recognition and PyAudio


Here we are going to build our own voice command calculator in Python. As the name suggests, it is a calculator that applies an operator to two operands, but instead of taking input from the keyboard it takes input from the user’s voice. For example,

9 + 8 = 17

We can make a calculator using a Python program easily. Just take inputs from the user and print the result.

But here we need to work with speech recognition.

If you run into any errors, check the updated code at the bottom of the article.

Python Voice Command Calculator

Our goal is like this:

If a user says “nine plus eight” the output will be like this:

9 + 8
17

If a user says “nine divided three” the output will be:

9 divided 3
3.0

Again, if the user says “eight multiplied by seven” the output will be:

8 x 7
56

And so on.

Steps to follow to build a voice command calculator in Python:

Here is the logic:

At first, we will set our microphone device.
Accept voice from the user with the mic.
Remove noise and distortion from the speech.
Convert the speech or voice to text.
Now store the text as a string in a variable.
Print the string if you wish. (Not necessary, but it helps you check whether the text is right.)
Split the string into three parts: the first operand, the operator and the second operand.
Now convert the operands to integers.
Finally, do the calculation in your program as you got all the things you need.
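
The last few steps of this logic can be tried without a microphone. The sketch below is only an illustration: the variable spoken_text is a stand-in for the text the speech recognizer would return, and the small operations table is an assumption that covers just two operators.

```python
import operator

# Stand-in for the recognized text (assumed to look like "9 + 8").
spoken_text = "9 + 8"

# Split the string into first operand, operator and second operand.
first, symbol, second = spoken_text.split()

# Convert the operands from strings to integers.
a, b = int(first), int(second)

# Look up the function for the operator and apply it to the operands.
operations = {"+": operator.add, "-": operator.sub}
result = operations[symbol](a, b)

print(spoken_text)  # 9 + 8
print(result)       # 17
```

If the text does not split into exactly three parts, the unpacking line raises a ValueError, which is why the wording of the spoken command matters later on.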

Let’s implement it in Python:

Requirements to build speech/voice calculator:

We need the following:

SpeechRecognition
PyAudio

Set up things to start our program

You can install those with pip:

pip install SpeechRecognition
pip install pyaudio

If you are using a Mac, you will need to install both portaudio and pyaudio.

brew install portaudio
pip install pyaudio
brew install flac

On Debian-based Linux distributions such as Ubuntu, you can install it with:

$ sudo apt-get install python-pyaudio python3-pyaudio

One more thing you need to know: your mic device index.

To learn how to find the mic device index, see: Find all the microphone names and device index in Python using PyAudio
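
As a quick sketch, the SpeechRecognition library can also list the device names itself through Microphone.list_microphone_names(); the position of a name in that list is its device index. The helper names list_microphones and format_devices below are made up for this example.

```python
def list_microphones():
    """Return the audio device names; a name's position is its device index."""
    # Imported here so format_devices can be used without audio hardware.
    import speech_recognition as sr
    return sr.Microphone.list_microphone_names()


def format_devices(names):
    """Pair each device name with its index, one entry per device."""
    return [f"{index}: {name}" for index, name in enumerate(names)]
```

Calling print("\n".join(format_devices(list_microphones()))) on a machine with PyAudio installed prints one line per device. Pick the index of your microphone and pass it as device_index.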

Now you are ready to jump into the coding part.

To check that your packages are installed successfully, try the code below:

import speech_recognition as sr
print("Your speech_recognition version is: "+sr.__version__)

Output:

Your speech_recognition version is: 3.8.1

If this runs with no errors then go to the next part.

In my previous tutorial, Get voice input with microphone in Python using PyAudio and SpeechRecognition, I explained how to capture voice input. So I will not explain those things again here and will focus only on our voice calculator. If you need the full explanation, follow that tutorial; here I will just provide the code.

Python code to get the voice command from the user:

import speech_recognition as s_r
print("Your speech_recognition version is: "+s_r.__version__)
r = s_r.Recognizer()
my_mic_device = s_r.Microphone(device_index=1)
with my_mic_device as source:
    print("Say what you want to calculate, example: 3 plus 3")
    r.adjust_for_ambient_noise(source)
    audio = r.listen(source)
my_string=r.recognize_google(audio)
print(my_string)

Run the program and it will print whatever you say.

The fun part is that if you say “nine plus ten” it will return the string “9 + 10”.

Note that:

r.adjust_for_ambient_noise(source)

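The above line listens to the background noise for a moment and calibrates the recognizer’s energy threshold, which reduces the effect of ambient noise on recognition.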

r.recognize_google(audio) – This will return the converted text from voice as a string.

You will need an active internet connection to run this program.

(I am using Google speech recognition through recognize_google, which at the time of writing is free to use with the library’s default API key.)

But if you are going to create a project or do something bigger with it, you should use Google Cloud Speech. Google speech recognition is currently free of cost, but Google gives no assurance that the service will keep running.

If everything is fine till now you can go for the next step.

Split the string and perform the operation:

Here we face the main difficulty. We have a string, for example “103 - 15”. Because it is a string, we can’t simply perform the operation on it. We need to split the string into three parts, which gives us three separate strings:

“103”, “-”, “15”

We need to convert “103” and “15” to int. Those are our operands. And “-” is our operator.

Use the operator module. This will make our task easy.

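import operator

def get_operator_fn(op):
    # Look up the function for the recognized operator token.
    return {
        '+' : operator.add,
        '-' : operator.sub,
        'x' : operator.mul,
        'divided' :operator.__truediv__,
        'Mod' : operator.mod,
        'mod' : operator.mod,
        '^' : operator.xor,  # note: bitwise XOR, not exponentiation
        }[op]

def eval_binary_expr(op1, oper, op2):
    # The operands arrive as strings, so convert them before calculating.
    op1,op2 = int(op1), int(op2)
    return get_operator_fn(oper)(op1, op2)

# my_string is the recognized text from the previous step, e.g. "9 + 10".
print(eval_binary_expr(*(my_string.split())))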

The symbols and words used as dictionary keys in the program (+, -, x, divided and so on) are the operators.

Each operator is mapped to a function from the operator module. For example, “divided” maps to operator.__truediv__, and both “Mod” and “mod” map to operator.mod, because the speech-to-text conversion sometimes capitalizes the first letter.
You can set your own commands too if you wish.

 return get_operator_fn(oper)(op1, op2)

This line looks up the function for the operator and calls it with the two operands to calculate your result.
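
To see what some of these operator functions return, here is a small stand-alone check. It uses only the standard operator module; the variable names are chosen just for this example. Note that '^' is mapped to operator.xor, which is a bitwise XOR and not a power operation.

```python
import operator

# operator.xor works on the bits of the integers: 0b10 ^ 0b11 == 0b01.
xor_result = operator.xor(2, 3)

# Exponentiation is a different function, operator.pow.
pow_result = operator.pow(2, 3)

# True division always returns a float, which is why "9 divided 3" gives 3.0.
div_result = operator.truediv(9, 3)

print(xor_result)  # 1
print(pow_result)  # 8
print(div_result)  # 3.0
```

If you want a spoken power command, mapping it to operator.pow is probably closer to what you expect than operator.xor.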

So here is the full code of this voice command calculator in Python:

import operator
import speech_recognition as s_r
print("Your speech_recognition version is: "+s_r.__version__)
r = s_r.Recognizer()
my_mic_device = s_r.Microphone(device_index=1)
with my_mic_device as source:
    print("Say what you want to calculate, example: 3 plus 3")
    r.adjust_for_ambient_noise(source)
    audio = r.listen(source)
my_string=r.recognize_google(audio)
print(my_string)
def get_operator_fn(op):
    return {
        '+' : operator.add,
        '-' : operator.sub,
        'x' : operator.mul,
        'divided' :operator.__truediv__,
        'Mod' : operator.mod,
        'mod' : operator.mod,
        '^' : operator.xor,
        }[op]

def eval_binary_expr(op1, oper, op2):
    op1,op2 = int(op1), int(op2)
    return get_operator_fn(oper)(op1, op2)

print(eval_binary_expr(*(my_string.split())))

Output:

Your speech_recognition version is: 3.8.1
Say what you want to calculate, example: 3 plus 3
11 + 12
23

To multiply, simply say “number1 multiplied by number2”.

[Screenshot: voice command calculator in Python]

For example, say “16 multiplied by 10”.

“Multiplied by” is automatically converted to “x” by Google’s speech recognition.

To get the modulus, just say “17 mod 9” and it will give you the result.

For division, just say “18 divided 7”.

Note that I have not used “divided by”. Google’s speech recognition does not convert it to “/”, and we are going to split the string into three parts. “number1 divided by number2” splits into four parts (“number1”, “divided”, “by”, “number2”), and four parts cause an error because the function accepts only three parameters.

def eval_binary_expr(op1, oper, op2):

If you get an error, check your converted string. I have used print(my_string) to check whether I got the string I wanted.
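
One possible way to accept “divided by” and similar wording is to normalize the recognized text before splitting it. The sketch below is an assumption and not part of the original program: the PHRASES table and the normalize function are hypothetical names, and the phrases listed are only guesses at what the recognizer might return.

```python
import re

# Hypothetical phrase table: spoken wording -> token used as a dictionary key.
PHRASES = {
    "divided by": "divided",
    "multiplied by": "x",
    "times": "x",
    "plus": "+",
    "minus": "-",
}


def normalize(text):
    """Replace known phrases so the text splits into three parts."""
    # Lower-casing also turns "Mod" into "mod" and "X" into "x".
    text = text.lower()
    for phrase, token in PHRASES.items():
        # Match whole words only, so unrelated words are left alone.
        text = re.sub(rf"\b{re.escape(phrase)}\b", token, text)
    return text.split()


print(normalize("18 divided by 7"))     # ['18', 'divided', '7']
print(normalize("16 multiplied by 10")) # ['16', 'x', '10']
```

The returned list can then be unpacked into eval_binary_expr in place of my_string.split().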

Voice command calculator updated code:

import operator
import speech_recognition as sr

def get_operator_fn(op):
    # Return the function for a recognized operator, or None if unsupported.
    return {
        '+' : operator.add,
        '-' : operator.sub,
        'x' : operator.mul,
        'divided' : operator.truediv,
        'mod' : operator.mod,
        '**' : operator.pow,
        '/' : operator.truediv,
    }.get(op.lower())

def eval_binary_expr(op1, oper, op2):
    fn = get_operator_fn(oper)
    if fn is None:
        return "Unsupported operator: " + oper
    try:
        return fn(float(op1), float(op2))
    except ValueError:
        # Raised when an operand is not a number.
        return "Invalid input"
    except ZeroDivisionError:
        return "Cannot divide by zero"

def calculate_from_speech():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Say what you want to calculate, for example: 3 plus 3")
        r.adjust_for_ambient_noise(source)
        audio = r.listen(source)
    try:
        user_input = r.recognize_google(audio)
    except sr.UnknownValueError:
        print("Sorry, I could not understand what you said.")
        return
    except sr.RequestError as e:
        print("Could not request results from Google Speech Recognition service; {0}".format(e))
        return
    print("You said:", user_input)
    parts = user_input.split()
    # eval_binary_expr needs exactly: number, operator, number.
    if len(parts) != 3:
        print("Couldn't calculate the result.")
        return
    print("Result:", eval_binary_expr(*parts))

calculate_from_speech()

If you are using a Mac, install FLAC as well:

brew install flac

Run the program from the terminal.

Please note that:

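In the first version of the code, my audio input (microphone) device index is 1. You have to put your own device index there. The updated code calls sr.Microphone() without an index, which uses the system’s default microphone.

To learn how to find the device index, see: Find all the microphone names and device index in Python using PyAudio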

30 responses to “Voice Command Calculator in Python using speech recognition and PyAudio”

Brian says:

April 24, 2019 at 2:49 am

Great!

Eric St-Laurent says:

April 24, 2019 at 3:42 am

Very cool!
Thank you for posting

Alhaji Turay says:

April 24, 2019 at 9:03 am

Good job guys

N. Hakak says:

May 14, 2019 at 8:57 am

Does the same code work for raspberry pi as well??

- Saruque Ahamed Mollick says:
  
  May 14, 2019 at 12:06 pm
  
  Yeah, that will also work on a Raspberry Pi. The only thing you need to do is set up your microphone. Open a terminal window and run lsusb; it will show you all the USB devices connected to your machine. You can also raise the volume of your mic device by running alsamixer. Hope these tips will help you to perform this voice operation on your Raspberry Pi.
  
N. Hakak says:

June 3, 2019 at 2:07 pm

Thank u so much…
What if I want to create my calculator to evaluate an expression with more than two operands and operators… Eg
4*5+3

- Saruque Ahamed Mollick says:
  
  June 3, 2019 at 2:16 pm
  
  I was thinking the same as you before posting the content, and I have got a good solution.
  You can take an optional parameter or argument.
  But the best way will be to find a way to evaluate a string as a mathematical expression.
  And the easiest way to do this in Python is:
  eval('4*5+3')
  
  I hope it will help you out.
  If you have further query please ask.
  
Shubham Joshi says:

July 10, 2019 at 8:34 pm

Bro I need GUI for this app.

- Saruque Ahamed Mollick says:
  
  July 12, 2019 at 8:15 pm
  
  Click on the contact button and type your requirements and send it. We will get back to you.
  
Sonam mehra says:

July 13, 2019 at 11:49 pm

My voice calculator is not calculating any operation can u just me out!!

- Saruque Ahamed Mollick says:
  
  July 13, 2019 at 11:52 pm
  
  Yes sure. Check your email. Just send the screenshot of your problem in reply to that email.
  
Shaik sajida says:

July 18, 2019 at 11:48 am

The idea is very Good….. Could you Please share the demo…

- Saruque Ahamed Mollick says:
  
  July 18, 2019 at 12:35 pm
  
  What kind of demo do you want? GUI or Just a video demo of this voice command calculator
  
Shaik sajida says:

July 18, 2019 at 7:02 pm

Video Demo for voice calculator …….

- Saruque Ahamed Mollick says:
  
  July 18, 2019 at 8:56 pm
  
  The code is provided with full tutorial thus you can test it on your machine. I hope it will help you out.
  
Disha says:

July 21, 2019 at 1:44 pm

My voice calculator is not calculating any operation , plz help me

- Saruque Ahamed Mollick says:
  
  July 21, 2019 at 8:57 pm
  
  Are you getting any error? If you are not getting any error that means the server is not responding.
  
Mirin says:

September 17, 2019 at 10:16 am

Thanks, your tutorial are awesome, even I. Looking for speech to text recognition, like how to setup all in python. Even, I tried with many link but these are less than less. So could you please help with your own tutorial and also email address, so can send my work like what I have done till time.

Wasif Ahmed says:

November 1, 2019 at 4:23 pm

can I run it in pycharm?

TypeError: eval_binary_expr() missing 1 required positional argument: ‘op2’

this error ocur

- Saruque Ahamed Mollick says:
  
  November 1, 2019 at 7:02 pm
  
  Kindly check if you have defined ‘op2’ properly or not.
  And yes you can obviously run it on any IDE. So Pycharm is also fine to run this program.
  
Ishank Chopra says:

February 10, 2020 at 10:33 pm

Please explain above code briefly. You have use two different user defined function. That thing I’m not getting properly.

Shruti Mulmule says:

April 16, 2020 at 2:26 pm

Thanks for the tutorial!
My recent project is totally based on it
Can you please send the demo where voice can be heard?
Please do help me with it!

AKASH MAURYA says:

July 24, 2020 at 10:48 am

I think, in line no 6, you should write with s_r.Microphone() as source: instead of with my_mic_device as source: .
BY this, it is working fine in my system.

Alex says:

July 27, 2020 at 6:39 pm

Ha-ha it works only if you say ‘5 + 5’ or ‘5 x 5’, but if you say ‘5 + 5 + 5’ it prints error – “TypeError: eval_binary_expr() takes 3 positional arguments but 5 were given” 🙂

- David Okunola says:
  
  December 16, 2021 at 2:41 am
  
  yes it works for just 3 parameters I even tried eval() but that didn’t work
  
Shruti Mulmule says:

October 15, 2020 at 6:26 pm

my code is throwing error as “my_string not defined” . Please help

- TheyWill says:
  
  July 28, 2021 at 11:07 pm
  
  Write this in your code:
  my_string = r.recognize_google(audio)
  or read the code above carefully
  
Danish says:

January 21, 2021 at 3:30 pm

It can be impliment on rasbberry pi please reply

- David Okunola says:
  
  December 16, 2021 at 2:43 am
  
  yes it can… all you need do is make sure you set your Microphone input
  
S.R.S says:

August 18, 2021 at 5:37 pm

What to do if I want to perform this all without Internet connection ?
