Voice assistant with Python

Here, I’ll show how you can create and customize your own voice assistant using Python and a few of its modules.

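First, make sure the modules this code depends on are available.

The required modules for the voice assistant in Python

speech_recognition (third-party; install it with "pip install SpeechRecognition", and note that microphone input also needs PyAudio)
webbrowser (part of the Python standard library)
urllib (part of the Python standard library)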
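
Before going further, it can help to confirm that all three modules are importable on your machine. The following is a minimal sketch, not part of the assistant itself; the helper name check_modules is made up for this illustration.

```python
from importlib.util import find_spec

# Modules used in this tutorial; only speech_recognition is third-party.
REQUIRED_MODULES = ("speech_recognition", "webbrowser", "urllib")


def check_modules(names=REQUIRED_MODULES):
    """Return a dict mapping each module name to True if it can be imported."""
    # find_spec returns None when a top-level module cannot be found.
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, available in check_modules().items():
        print(f"{name}: {'found' if available else 'MISSING'}")
```

If speech_recognition is reported as missing, install it before running the code that follows.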

Description of the code:

  1. First, create a Python file named “audio_to_text.py”. This file only converts your speech to text; all the other operations are handled in a separate Python file. It must contain the following lines of code:
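    import speech_recognition as sr
    from operation_produced import main
    
    r = sr.Recognizer()
    try:
        with sr.Microphone() as source:
            print('Say something:')
            # timeout=2: give up if no speech starts within 2 seconds
            audio = r.listen(source, timeout=2)
            print('Audio listened!')
        text = r.recognize_google(audio)  # needs an internet connection
        print("Text: " + text)
        main(op_text=text)
    except sr.WaitTimeoutError:
        print('No speech detected!')
    except sr.UnknownValueError:
        print('Audio not recognized!')
    except sr.RequestError as e:
        print('Could not reach the speech service:', e)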
    

    The code above is simple. It first imports the speech_recognition module, and then imports the other file, which holds all the main operations.
    Next, it opens the audio source (the microphone), listens with speech_recognition, and converts the recording to text using “recognize_google”, which sends the audio to Google's speech recognition service and therefore needs an internet connection.
    Finally, it passes the generated text to the main method of the other file (operation_produced) for further processing. If the speech cannot be recognized, it prints a message instead.

  2. Now create another Python file whose name matches the module imported in audio_to_text.py, that is, “operation_produced.py”. This file has several methods; let’s go through them one by one:
    import webbrowser
    from urllib.parse import urlencode
    
    def main(op_text):
        print('\noperation text:', op_text)
        op_text = op_text.lower()
        if op_text.startswith('visit') or op_text.startswith('show me') or op_text.startswith('open'):  # commands that open a web page
            new_command = refine(op_text)  # strip the command word(s)
            if new_command:  # something is left to visit
                predict_website(new_command)
            else:
                print('\nA Computer cannot visit anything by itself!\n')
    
    
    if __name__ == '__main__':
        text = 'visit '
        main(text)

    Here op_text contains the text you said. If it starts with one of the supported commands (visit, open, or show me), main uses refine to strip the command word and passes the rest on, so that you get what you wanted to see.
    Now look at the refine method to see how it refines your command.
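
As a side note, the three startswith checks in main can also be written as a single call, because str.startswith accepts a tuple of prefixes. This is only a sketch of that alternative; the names BROWSE_PREFIXES and is_browse_command are hypothetical and do not appear in the files above.

```python
# Command prefixes that main() treats as "open a web page".
BROWSE_PREFIXES = ("visit", "show me", "open")


def is_browse_command(op_text):
    """Return True if the lowercased text starts with a supported prefix."""
    # startswith with a tuple is True if any one of the prefixes matches.
    return op_text.lower().startswith(BROWSE_PREFIXES)
```

Like the original checks, this is a plain prefix match, so a phrase such as “visitor centre” also counts as a visit command.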

  3. The refine method takes the command as a string and returns another string containing only the target of the command. For example, “visit youtube” returns “youtube” and “show me time” returns “time”:
    def refine(string):
        words = string.split(' ')
        if len(words) == 0:
            return ''
        else:
            if string.startswith('show me'):
                return ' '.join(words[2:]).strip(' ')
            else:
                return ' '.join(words[1:]).strip(' ')  # for open and visit command

    The returned string is then sent to the predict_website method, which turns it into a URL.
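
To see what refine returns for a few spoken phrases, you can try it on its own. The snippet below uses a condensed copy of refine so that it runs by itself; the sample phrases are only examples.

```python
def refine(string):
    """Condensed copy of refine: drop the leading command word(s)."""
    words = string.split(' ')
    if string.startswith('show me'):
        return ' '.join(words[2:]).strip(' ')  # skip "show" and "me"
    return ' '.join(words[1:]).strip(' ')  # skip "visit" or "open"


for spoken in ('visit youtube', 'show me time', 'open alan walker faded', 'visit'):
    print(repr(spoken), '->', repr(refine(spoken)))
# 'visit youtube' -> 'youtube'
# 'show me time' -> 'time'
# 'open alan walker faded' -> 'alan walker faded'
# 'visit' -> ''
```

The last case shows that a bare command with nothing after it comes back as an empty string.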

  4. The predict_website method takes the refined command and converts it to a URL, using urllib's urlencode when it has to fall back to a search:
    def predict_website(command):
        print('visiting website>...', end=' ')
        command_orig = command
        command = command.lower().replace(' ', '')
        web_dict = {'youtube': 'youtube.com', 'facebook': 'facebook.com', 'codespeedy': 'codespeedy.com',
                    'quora': 'quora.com', 'amazon': 'amazon.in'}
        if command in web_dict.keys():
            website = f'https://www.{web_dict[command]}/'
            print(website)
            webbrowser.open_new(website)
        else:
            q = {'q': command_orig}
            query = urlencode(q)
            complete_url = "https://www.google.com/search?" + query
            print(complete_url)
            webbrowser.open_new(complete_url)

    If you say “visit codespeedy”, it looks up the value for the key ‘codespeedy’ and opens that site.
    You can also say “visit Alan Walker faded”. Since no link is listed for it, the text is encoded into a Google search URL with urlencode. Finally, webbrowser.open_new() opens your default web browser to show you the results.
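
If you want to check which URL a command leads to without opening a browser, the URL-building part can be separated out. This is a sketch under that assumption; build_url and WEB_DICT are hypothetical names, and the table is copied from predict_website.

```python
from urllib.parse import urlencode

# Same shortcut table as in predict_website.
WEB_DICT = {'youtube': 'youtube.com', 'facebook': 'facebook.com',
            'codespeedy': 'codespeedy.com', 'quora': 'quora.com',
            'amazon': 'amazon.in'}


def build_url(command):
    """Return the URL predict_website would open, without opening it."""
    key = command.lower().replace(' ', '')  # "you tube" -> "youtube"
    if key in WEB_DICT:
        return f'https://www.{WEB_DICT[key]}/'
    # No shortcut found: fall back to a Google search for the original text.
    return 'https://www.google.com/search?' + urlencode({'q': command})


print(build_url('codespeedy'))         # https://www.codespeedy.com/
print(build_url('Alan Walker faded'))  # https://www.google.com/search?q=Alan+Walker+faded
```

Note how urlencode replaces the spaces in the search text with plus signs so that the text is safe to put in a URL.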

  5. Now if you integrate all methods, the operation_produced.py file should look like this:
    import webbrowser
    from urllib.parse import urlencode
    
    
    def refine(string):
        words = string.split(' ')
        if len(words) == 0:
            return ''
        else:
            if string.startswith('show me'):
                return ' '.join(words[2:]).strip(' ')
            else:
                return ' '.join(words[1:]).strip(' ')  # for open and visit command
    
    
    def predict_website(command):
        print('visiting website>...', end=' ')
        command_orig = command
        command = command.lower().replace(' ', '')
        web_dict = {'youtube': 'youtube.com', 'facebook': 'facebook.com', 'codespeedy': 'codespeedy.com',
                    'quora': 'quora.com', 'amazon': 'amazon.in'}
        if command in web_dict.keys():
            website = f'https://www.{web_dict[command]}/'
            print(website)
            webbrowser.open_new(website)
        else:
            q = {'q': command_orig}
            query = urlencode(q)
            complete_url = "https://www.google.com/search?" + query
            print(complete_url)
            webbrowser.open_new(complete_url)
    
    
    def main(op_text):
        print('\noperation text:', op_text)
        op_text = op_text.lower()
        if op_text.startswith('visit') or op_text.startswith('show me') or op_text.startswith('open'):  # commands that open a web page
            new_command = refine(op_text)  # strip the command word(s)
            if new_command:  # something is left to visit
                predict_website(new_command)
            else:
                print('\nA Computer cannot visit anything by itself!\n')
    
    
    if __name__ == '__main__':
        text = 'visit '  # default value for visiting
        main(text)
    

    Now save “audio_to_text.py” and “operation_produced.py” in the same folder.

Finally, run the “audio_to_text.py” file and say something like “visit youtube”, “open codespeedy”, “visit amazon”, “open weather”, or “show me time”.

** Note that the speech_recognition module can be troublesome to run on Linux, though many people do run it there.
On Windows, microphone input requires PyAudio, which provides the PortAudio bindings and is readily available online.
