Use of iloc, loc, and ix for data selection in Python Pandas
In this tutorial, we will learn how to use the indexers iloc, loc, and ix for data selection in a Python Pandas DataFrame. All three are used for indexing but differ in how they identify rows and columns.
Indexing is a way of locating and accessing data in a dataset. It makes data handling easier and more efficient. Let us see where exactly iloc, loc and ix can be used:
- iloc is used when indexing is done based on the position of row/column.
- loc is used when indexing is done based on the name of row/column.
- ix accepts either the position or the name of a row/column (available only in older versions of pandas).
Using iloc for data selection
Let us first see the use of iloc with the help of an example.
Here we use the Pandas library to create a DataFrame consisting of student names and their marks in Maths and Physics. The names are used as the row index. One point to keep in mind is that in Python indexing starts from 0, so the first row or the first column is addressed with the number 0. The student data below is used in all of the examples.
import pandas as pd
import numpy as np
Student_data = {'maths': [10, 20, 10], 'physics': [30, 10, 10]}
Df = pd.DataFrame(Student_data, index=["Ankit", "Arpit", "Arun"])
Df
Output:
| | maths | physics |
|---|---|---|
| Ankit | 10 | 30 |
| Arpit | 20 | 10 |
| Arun | 10 | 10 |
Now let us use iloc on this DataFrame to access the third row, which is at position 2:
Code:
Df.iloc[2]
Output:
maths      10
physics    10
Name: Arun, dtype: int64
As the output shows, only the third row, that is Arun’s Maths and Physics marks, is displayed.
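Beyond a single row, iloc also accepts slices and a row/column pair of positions. The following is a minimal sketch on the same student data; the variable names second_row, first_two_rows and arun_physics are only illustrative and are not part of the original example.

```python
import pandas as pd

# Same student data as in the tutorial.
Student_data = {'maths': [10, 20, 10], 'physics': [30, 10, 10]}
Df = pd.DataFrame(Student_data, index=["Ankit", "Arpit", "Arun"])

# A single integer selects one row by position (0-based).
second_row = Df.iloc[1]        # the row at position 1, i.e. Arpit

# A slice selects a range of positions; the end position is excluded.
first_two_rows = Df.iloc[0:2]  # positions 0 and 1

# A (row, column) pair of positions selects a single value.
arun_physics = Df.iloc[2, 1]   # row position 2, column position 1

print(second_row)
print(first_two_rows)
print(arun_physics)
```

Because iloc counts from 0, position 1 is the second row and position 2 is the third.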
Using loc for data selection
Using loc is very similar to how we used iloc to access data. The only difference is that here the name (label) of the row/column is used instead of its position. Let us see this by using loc on the same DataFrame. Here we will use the row name Ankit to access his marks in Maths and Physics.
Code:
Df.loc["Ankit"]
Output:
maths      10
physics    30
Name: Ankit, dtype: int64
As we can see from the output, the record containing Ankit’s marks is displayed. This is an easy way to access data from the DataFrame using only the student’s name.
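loc can also take a column name, a slice of labels, or a boolean condition. The sketch below reuses the student data; the names ankit_maths, label_slice and high_maths are illustrative assumptions, not part of the original example.

```python
import pandas as pd

# Same student data as in the tutorial.
Student_data = {'maths': [10, 20, 10], 'physics': [30, 10, 10]}
Df = pd.DataFrame(Student_data, index=["Ankit", "Arpit", "Arun"])

# A (row label, column label) pair selects a single value.
ankit_maths = Df.loc["Ankit", "maths"]

# A label slice includes BOTH endpoints, unlike a positional slice.
label_slice = Df.loc["Ankit":"Arpit"]

# A boolean mask selects the rows where the condition is true;
# here, the physics marks of students with more than 10 in maths.
high_maths = Df.loc[Df["maths"] > 10, "physics"]

print(ankit_maths)
print(label_slice)
print(high_maths)
```

Note the difference from iloc: a label slice such as "Ankit":"Arpit" keeps the last label, whereas a positional slice such as 0:2 stops before position 2.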
Using ix for data selection
Now let us see how to use ix for data selection using our student data. Here ix is a multi-purpose way to access data: it accepts either a row/column position or a row/column name. Note that ix was deprecated in pandas 0.20 and removed in pandas 1.0, so the code below runs only on older versions.
Code:
Df.ix[1]
Df.ix["Arpit"]
Output:
maths      20
physics    10
Name: Arpit, dtype: int64
maths      20
physics    10
Name: Arpit, dtype: int64
Here we can see that both calls display the record of the student named ‘Arpit’, first by using row position 1 and then by using the row name ‘Arpit’. This shows the dual functionality of ix, which accepts either a position or a name.
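On pandas 1.0 or later, where ix is no longer available, the same two lookups can be written with iloc and loc. The sketch below is one way to do this on the student data; the variable names are illustrative, and the mixed position/label lookups at the end are an addition that is not part of the original example.

```python
import pandas as pd

# Same student data as in the tutorial.
Student_data = {'maths': [10, 20, 10], 'physics': [30, 10, 10]}
Df = pd.DataFrame(Student_data, index=["Ankit", "Arpit", "Arun"])

# Equivalent of Df.ix[1]: select by position with iloc.
by_position = Df.iloc[1]

# Equivalent of Df.ix["Arpit"]: select by label with loc.
by_label = Df.loc["Arpit"]

# Mixing a row position with a column name:
# either turn the position into a label and use loc ...
mixed_via_loc = Df.loc[Df.index[1], "physics"]
# ... or turn the column name into a position and use iloc.
mixed_via_iloc = Df.iloc[1, Df.columns.get_loc("physics")]

print(by_position)
print(by_label)
print(mixed_via_loc, mixed_via_iloc)
```

Keeping position-based and label-based access in separate indexers avoids the ambiguity that ix had when the index itself contained integers.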