How to select with condition in Pandas Dataframe using Python

In this tutorial, we will learn how to select rows or columns of a Dataframe that satisfy a specified condition, using the Pandas library in Python. Picking out by hand the part of a Dataframe that you need for further computation quickly becomes difficult as the data grows. The code in this tutorial helps you find the part of a dataset that satisfies a given constraint. For example, if a teacher wishes to find the names of the students in a class who have scored more than a certain number of marks, she may either hand-pick the records of the students who fulfill the criterion or use the code below to return all the details of those students.

Select with conditions in pandas Dataframe in Python

Let us make a simple Dataframe with three columns, namely Name, Marks, and Section, holding the records of three students from different sections. We first collect the data in a dictionary named “Student_data” and then build the Dataframe from it. To work with Dataframes we need to import the Pandas library; the code below also imports Numpy, although this example does not call any Numpy function. We bring in a library with the import command and usually give it a short alias. Pandas is conventionally imported as pd, so that its functions are accessed as pd.function_name, and Numpy is imported as np, so that its functions are accessed as np.function_name. Now let us see how we can define our Dataframe.

How to define a Pandas Dataframe

  • We define our data “Student_data” as a dictionary whose keys are the column names ‘Name’, ‘Marks’ and ‘Section’, and whose values are lists holding the data of each column.
  • The ‘Name’ values of the three students are Anuj, Karan and Manas. Their ‘Marks’ are 10, 20 and 30 respectively, and their ‘Section’ values are “A”, “B” and “C”.
  • One thing to keep in mind is that text values (strings) must be written inside quotation marks, either single or double, whereas numbers do not require them.
  • Another thing to keep in mind is that all columns in a Dataframe must be of equal length: if we have taken 3 names of students, then we must have exactly 3 corresponding marks and 3 corresponding sections.
  • After the columns have been defined, we call the function that creates a Pandas Dataframe. This is done through pd.DataFrame (note the capital F; the name is case-sensitive).
  • We apply this function to “Student_data” and list the column names we defined earlier in the columns argument, which also fixes their order. We store the result as “Table”, and in the next step we write Table on the last line to display it.
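
The equal-length rule from the list above can be checked directly. The following is a minimal sketch that uses a made-up dictionary, bad_data, which is not part of the tutorial's dataset: it has three names but only two marks, so building a Dataframe from it is expected to fail with a ValueError.

```python
import pandas as pd

# Hypothetical data with a deliberate mistake: three names but only two marks.
bad_data = {'Name': ["Anuj", "Karan", "Manas"], 'Marks': [10, 20]}

try:
    pd.DataFrame(bad_data)
    error_message = None  # reached only if pandas accepted the data
except ValueError as err:
    # pandas refuses to build a Dataframe from columns of unequal length
    error_message = str(err)

print(error_message)
```

If you see this error with your own data, compare the lengths of the lists you passed for each column.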

Select with condition in Pandas Dataframe using Python

Code:

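import pandas as pd
import numpy as np  # imported as described above; not used in this example

# Column data: every list must have the same length
Student_data = {'Name': ["Anuj", "Karan", "Manas"], 'Marks': [10, 20, 30], 'Section': ["A", "B", "C"]}
# columns= sets the order of the columns in the Dataframe
Table = pd.DataFrame(Student_data, columns=['Name', 'Marks', 'Section'])
Table  # displays the Dataframe in a notebook; in a script use print(Table)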

Output:

    Name  Marks Section
0   Anuj     10       A
1  Karan     20       B
2  Manas     30       C

 

As the previous code and its output show, our Dataframe has three rows and three columns. Now we apply a condition to this Dataframe to filter out the required data.

  • Let us take the case where we need the records of the students who are not from section C and have scored more than 15.
  • In our Dataframe Table, we take the column ‘Marks’ and apply the condition “> 15”. Column names are case-sensitive, so the name must be written exactly as it was defined.
  • We have one more condition that we want to satisfy. We use the “&” operator to combine the first condition with a second one on the column ‘Section’, the condition being != (which means not equal to) “C”. Each condition is wrapped in its own parentheses.
  • Let us name the resulting Dataframe Filter and display it to get our result.
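
To see what the steps above do, it can help to look at each condition on its own. The following sketch rebuilds the same Table and stores the intermediate results in the illustrative names high_marks, not_section_c and both, which are not part of the tutorial's code. Each condition produces one True or False value per row, and “&” keeps only the rows where both are True.

```python
import pandas as pd

Student_data = {'Name': ["Anuj", "Karan", "Manas"],
                'Marks': [10, 20, 30],
                'Section': ["A", "B", "C"]}
Table = pd.DataFrame(Student_data, columns=['Name', 'Marks', 'Section'])

# First condition: one True/False value per row
high_marks = Table['Marks'] > 15          # False, True, True
# Second condition
not_section_c = Table['Section'] != "C"   # True, True, False
# "&" combines the two conditions row by row
both = high_marks & not_section_c         # False, True, False

print(both.tolist())
```

When both conditions are written on one line, each needs its own parentheses because Python evaluates “&” before comparisons such as “>” and “!=”.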

Code:

Filter=Table[(Table['Marks']>15) & (Table['Section']!="C")]
Filter

Output:

    Name  Marks Section
1  Karan     20       B

By applying both conditions, we find that out of the three students only Karan has scored more than 15 and is also not from section C. Manas has scored more than 15 as well, but he is from section C, and Anuj has scored only 10.
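
The same selection can be written in other ways. The sketch below shows two alternatives that Pandas provides, assuming the same Table as above: .loc, which can pick columns at the same time as filtering rows (for example, only the names the teacher asked for), and query, which takes the condition as a string. The variable names names_only and same_rows are only illustrative.

```python
import pandas as pd

Student_data = {'Name': ["Anuj", "Karan", "Manas"],
                'Marks': [10, 20, 30],
                'Section': ["A", "B", "C"]}
Table = pd.DataFrame(Student_data, columns=['Name', 'Marks', 'Section'])

# .loc[row condition, list of columns]: filter rows and keep only 'Name'
names_only = Table.loc[(Table['Marks'] > 15) & (Table['Section'] != "C"), ['Name']]

# query: the same row condition written as a string
same_rows = Table.query('Marks > 15 and Section != "C"')

print(names_only)
print(same_rows)
```

Which form to use is largely a matter of taste: the bracket form used in this tutorial is the most common, while .loc is convenient when you also want to limit the columns that are returned.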
