How to Achieve Parallel Processing in Python

In This Article/Tutorial we are going to learn how to perform parallel processing in Python using functional programming. Lets first know what is Parallel Processing.

Parallel Processing:- Parallel Processing is the style of program operation in which the Processes are divided into different parts and are executed simultaneously in different processors attached inside of the same computer.
parallel programming takes the existing code and makes it parallel so that it runs on all of the CPU cores at the same time

We will use immutable data structure using functional programming principles & multiprocessing module in a simple example and also use a time function for applying a delay so that we would be able to apply multiprocessing and achieve the parallel processing for the code

Achieving Parallel Processing using multiprocessing in Python

Let’s start with the example b creating a data structure named actor

  1. Importing the required packages and modules
import collections
import time
import os
import multiprocessing
from pprint import pprint

2. Let’s create an immutable data structure using a collection module

Actor = collections.namedtuple('Actor',[
    'name',
    'born', 
    'oscar',
])
actors = (
     Actor(name ='leonardo dicaprio', born =1974 , oscar = True),
     Actor(name ='robert downey jr', born =1965 , oscar = False),
     Actor(name ='Gala gadot', born =1985 , oscar = False),
     Actor(name ='matthew mcconaughey', born =1969 , oscar = True),
     Actor(name ='norma shearer',born =1902 , oscar =True),
     Actor(name ='jackie chan',born =1954 , oscar = False),
     Actor(name ='Shahrukh Khan',born =1965 , oscar = False),
)
pprint(actors)
print()

Output :

(Actor(name='leonardo dicaprio', born=1974, oscar=True),
 Actor(name='robert downey jr', born=1965, oscar=False),
 Actor(name='Gala gadot', born=1985, oscar=False),
 Actor(name='matthew mcconaughey', born=1969, oscar=True),
 Actor(name='norma shearer', born=1902, oscar=True),
 Actor(name='jackie chan', born=1954, oscar=False),
 Actor(name='Shahrukh Khan', born=1965, oscar=False))

4. Creating The Function named transform in which we will modify some data and use that function for parallelizing the code with the multiprocessing module

def transform(x):
    print(f'Process id {os.getpid()} Processing data {x.name}')
    time.sleep(1)#delaying the function
    result = {'name': x.name, 'age': 2019 - x.born}
    print(f'Process id {os.getpid()} Done Processing data {x.name}')
    return result

In the above code, we are using the time. sleep function to delay the function speed so that we can be able to see the impact of the multiprocessing module and see how actually the parallel processing works. We are using os.getid function to see the different processing id’s that parallel processing will create to get the output of that function

5. Now let’s use the main function(recommended to use for Windows users) & inside the main function we will be using the multiprocessing module

if __name__ == '__main__':
    start = time.time()
    pool = multiprocessing.Pool() #it is an pool which like an interface that can run our function in parallel multipleacross the cpu cores
    result = pool.map(transform, actors)
    pool.close()
    end = time.time()
    print(f'\nTime to complete process:{end-start:.2f}s \n')#this will show the exact time required for process to complete 
    pprint(result)

Now, In the above block of code, we are defining an if main function(recommended for Windows users), instead of it we can also directly apply the multiprocessing tool, but we can encounter a runtime error while running the whole file on command prompt od bash. The multiprocessing.Pool() is kind of an interface that can run our function parallel across the different CPU cores and gets also  covers less computational file

So to See the Result save the code file with the .py extension and run it with the windows command prompt or the bash shell. After running the file the output generated will be as follows  :

Note:-I have saved the code as parallel.py on C Drive

Output :

Microsoft Windows [Version 6.3.9600]
(c) 2013 Microsoft Corporation. All rights reserved.

C:\Users\PT>python parallel.py
(Actor(name='leonardo dicaprio', born=1974, oscar=True),
 Actor(name='robert downey jr', born=1965, oscar=False),
 Actor(name='Gala gadot', born=1985, oscar=False),
 Actor(name='matthew mcconaughey', born=1969, oscar=True),
 Actor(name='norma shearer', born=1902, oscar=True),
 Actor(name='jackie chan', born=1954, oscar=False),
 Actor(name='Shahrukh Khan', born=1965, oscar=False))

(Actor(name='leonardo dicaprio', born=1974, oscar=True),
 Actor(name='robert downey jr', born=1965, oscar=False),
 Actor(name='Gala gadot', born=1985, oscar=False),
 Actor(name='matthew mcconaughey', born=1969, oscar=True),
 Actor(name='norma shearer', born=1902, oscar=True),
 Actor(name='jackie chan', born=1954, oscar=False),
 Actor(name='Shahrukh Khan', born=1965, oscar=False))

Process id 2652 Processing data leonardo dicaprio
(Actor(name='leonardo dicaprio', born=1974, oscar=True),
 Actor(name='robert downey jr', born=1965, oscar=False),
 Actor(name='Gala gadot', born=1985, oscar=False),
 Actor(name='matthew mcconaughey', born=1969, oscar=True),
 Actor(name='norma shearer', born=1902, oscar=True),
 Actor(name='jackie chan', born=1954, oscar=False),
 Actor(name='Shahrukh Khan', born=1965, oscar=False))

(Actor(name='leonardo dicaprio', born=1974, oscar=True),
 Actor(name='robert downey jr', born=1965, oscar=False),
 Actor(name='Gala gadot', born=1985, oscar=False),
 Actor(name='matthew mcconaughey', born=1969, oscar=True),
 Actor(name='norma shearer', born=1902, oscar=True),
 Actor(name='jackie chan', born=1954, oscar=False),
 Actor(name='Shahrukh Khan', born=1965, oscar=False))

(Actor(name='leonardo dicaprio', born=1974, oscar=True),
 Actor(name='robert downey jr', born=1965, oscar=False),
 Actor(name='Gala gadot', born=1985, oscar=False),
 Actor(name='matthew mcconaughey', born=1969, oscar=True),
 Actor(name='norma shearer', born=1902, oscar=True),
 Actor(name='jackie chan', born=1954, oscar=False),
 Actor(name='Shahrukh Khan', born=1965, oscar=False))

Process id 7680 Processing data robert downey jr
Process id 8336 Processing data Gala gadot
Process id 8356 Processing data matthew mcconaughey
Process id 2652 Done Processing data leonardo dicaprio
Process id 2652 Processing data norma shearer
Process id 7680 Done Processing data robert downey jr
Process id 7680 Processing data jackie chan
Process id 8336 Done Processing data Gala gadot
Process id 8336 Processing data Shahrukh Khan
Process id 8356 Done Processing data matthew mcconaughey
Process id 2652 Done Processing data norma shearer
Process id 7680 Done Processing data jackie chan
Process id 8336 Done Processing data Shahrukh Khan

Time to complete process:2.44s

[{'age': 45, 'name': 'leonardo dicaprio'},
 {'age': 54, 'name': 'robert downey jr'},
 {'age': 34, 'name': 'Gala gadot'},
 {'age': 50, 'name': 'matthew mcconaughey'},
 {'age': 117, 'name': 'norma shearer'},
 {'age': 65, 'name': 'jackie chan'},
 {'age': 54, 'name': 'Shahrukh Khan'}]

So from the above output, we can understand that the parallel processing is working perfectly and is divided into 4 different processes and the required time calculated to execute the or to complete the process using parallel processing was 2.44s(on my Device, It might differ for different devices using different CPU’s)

This method can also be done by using a simple map function that will not use multiprocessing lets give it a shot to it and notice the difference between timings. Just replace the above code with “if __main” function with the following one :

start = time.time()
result = tuple(map(
      transform,
    actors
))
end = time.time()
print(f'\nTime to complete process:{end-start:.2f}s \n')#this will show the exact time required for process to complete 
pprint(result)

 

Now, lets the save the file and run the whole file using command prompt or bash and observe the output

Output:

C:\Users\PT>python parallel2.py
(Actor(name='leonardo dicaprio', born=1974, oscar=True),
Actor(name='robert downey jr', born=1965, oscar=False),
Actor(name='Gala gadot', born=1985, oscar=False),
Actor(name='matthew mcconaughey', born=1969, oscar=True),
Actor(name='norma shearer', born=1902, oscar=True),
Actor(name='jackie chan', born=1954, oscar=False),
Actor(name='Shahrukh Khan', born=1965, oscar=False))

Process id 8740 Processing data leonardo dicaprio
Process id 8740 Done Processing data leonardo dicaprio
Process id 8740 Processing data robert downey jr
Process id 8740 Done Processing data robert downey jr
Process id 8740 Processing data Gala gadot
Process id 8740 Done Processing data Gala gadot
Process id 8740 Processing data matthew mcconaughey
Process id 8740 Done Processing data matthew mcconaughey
Process id 8740 Processing data norma shearer
Process id 8740 Done Processing data norma shearer
Process id 8740 Processing data jackie chan
Process id 8740 Done Processing data jackie chan
Process id 8740 Processing data Shahrukh Khan
Process id 8740 Done Processing data Shahrukh Khan

Time to complete process:7.01s

({'age': 45, 'name': 'leonardo dicaprio'},
{'age': 54, 'name': 'robert downey jr'},
{'age': 34, 'name': 'Gala gadot'},
{'age': 50, 'name': 'matthew mcconaughey'},
{'age': 117, 'name': 'norma shearer'},
{'age': 65, 'name': 'jackie chan'},
{'age': 54, 'name': 'Shahrukh Khan'})

C:\Users\PT>

Now, The above output after replacing the code from multiprocessing map to simple map function will sure make all clarification what exactly the parallel processing means, the above example clearly shows that the map function is showing the same output as multiprocessing but is only using single unique id to run all the process, and the total time required to complete full process is 7.01 seconds which is far greater than time required for the parallelized code that was only 2.44 seconds.

So Surely the parallel processing is one of the finest techniques that can be achieved by using python to run the complex code that requires too much time to execute by parallelizing the code and make it run across different cores available in CPUs.

hope My example has made up clear of achieving the parallel processing with python

(note: the reader must be aware of functional programming techniques for this article)

Leave a Reply

Your email address will not be published. Required fields are marked *