Sorting Data Frame objects in Python

In this tutorial, we will discuss how to sort data frames with the pandas library in Python. First, what is a data frame?

A data frame is a two-dimensional data structure organized into rows and columns. You can create one with the pandas.DataFrame() constructor of the pandas package. For example:

import pandas 
my_data = {'Name':['Sachin', 'Sourabh', 'Subhojeet', 'Anirudh', 
            'Vedant', 'Abhishek', 'Shivam']}
df = pandas.DataFrame(my_data)
print(df)
print(type(df))

Output:

        Name
0     Sachin
1    Sourabh
2  Subhojeet
3    Anirudh
4     Vedant
5   Abhishek
6     Shivam

<class 'pandas.core.frame.DataFrame'>

Here we have created a data frame object holding the names of a group of people. The last line of the output shows the type of the object that was created.

Sorting DataFrame object in Python

Now let’s take a look at how to sort a data frame object. To sort a data frame by the values in its columns, we use the pandas.DataFrame.sort_values() method. It sorts the rows in the required order (either ascending or descending).

Syntax: DataFrame.sort_values(by, axis=0, ascending=True, inplace=False, kind='quicksort', na_position='last')

by -> name of the column, or list of column names, to sort by.
axis -> the axis to sort along. Default: 0 (the rows are reordered).
ascending -> boolean value, or a list of booleans with one entry per column in by. If True, sorts in ascending order, otherwise in descending order. Default: True
inplace -> boolean value. If True, sorts the data frame itself and returns None; otherwise returns a new sorted data frame and leaves the original unchanged. Default: False
kind -> the sorting algorithm to use. It accepts 'quicksort', 'mergesort', 'heapsort' or 'stable', and is only applied when sorting on a single column. Default: 'quicksort'
na_position -> 'first' puts all NaN values at the beginning; 'last' puts them at the end. Default: 'last'
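To see how ascending, na_position and inplace interact, here is a minimal sketch. The scores frame and its values are made up for illustration and are not part of the tutorial's dataset.

```python
import pandas

# Hypothetical sample data (not the tutorial's dataset); None marks a missing score.
scores = pandas.DataFrame({
    'Name': ['Sachin', 'Sourabh', 'Vedant', 'Shivam'],
    'Score': [72.0, None, 95.0, 81.0],
})

# Sort by Score in descending order and place the missing value first.
desc_sorted = scores.sort_values(by='Score', ascending=False, na_position='first')
print(desc_sorted)

# inplace defaults to False, so the original frame keeps its row order.
print(scores.index.tolist())
```

Because inplace is left at its default, sort_values() returns a new data frame and scores itself is untouched.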

Let’s first import our dataset into the program.

import pandas 
my_data = pandas.read_excel("Cricket World Cup Winners.xlsx")  
my_data

	Year	Host	Venue for Final	Team-1	Team-2	Winner	Margin
0	1975	England	Lord’s	WI	Aus	WI	17 runs
1	1979	England	Lord’s	WI	Eng	WI	92 runs
2	1983	England	Lord’s	Ind	WI	Ind	43 runs
3	1987	India	Kolkata	Aus	Eng	Aus	7 runs
4	1992	Australia, New Zealand	Melbourne	Pak	Eng	Pak	22 runs
5	1996	India, Pakistan, Sri Lanka	Lahore (Gdffi)	Aus	SL	SL	7 wickets
6	1999	England	Lord’s	Pak	Aus	Aus	8 wickets
7	2003	South Africa	Wanderers	Aus	Ind	Aus	125 runs
8	2007	West Indies	Bridgetown	Aus	SL	Aus	53 runs
9	2011	India, Pakistan, Sri Lanka, Bangladesh	Wankhede	SL	Ind	Ind	6 wickets
10	2015	Australia, New Zealand	Melbourne	NZ	Aus	Aus	7 wickets

Here is our data set, which lists the winner of every Cricket World Cup from 1975 to 2015.
Now we can use the DataFrame.sort_values() method to sort the rows by a particular column. For instance, here I have sorted by the Host column in ascending order.

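import pandas
# Load the dataset from the Excel file.
my_data = pandas.read_excel("Cricket World Cup Winners.xlsx")
# Sort the rows by the Host column, in place, in ascending order.
my_data.sort_values("Host", axis=0, ascending=True, inplace=True, na_position='last')
print(my_data)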

    Year                                    Host   Venue for Final Team-1  Team-2  Winner    Margin
4   1992                  Australia, New Zealand       Melbourne    Pak    Eng     Pak       22 runs   
10  2015                  Australia, New Zealand       Melbourne    NZ     Aus     Aus       7 wickets
0   1975                                 England          Lord's    WI     Aus     WI        17 runs
1   1979                                 England          Lord's    WI     Eng     WI        92 runs
2   1983                                 England          Lord's    Ind    WI      Ind       43 runs
6   1999                                 England          Lord's    Pak    Aus     Aus       8 wickets
3   1987                                   India         Kolkata    Aus    Eng     Aus       7 runs
5   1996              India, Pakistan, Sri Lanka  Lahore (Gdffi)    Aus    SL      SL        7 wickets
9   2011  India, Pakistan, Sri Lanka, Bangladesh        Wankhede    SL     Ind     Ind       6 wickets
7   2003                            South Africa       Wanderers    Aus    Ind     Aus       125 runs
8   2007                             West Indies      Bridgetown    Aus    SL      Aus       53 runs

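Here you can see that the rows are now ordered by the Host column in ascending (alphabetical) order. Each row keeps its original index label, which is why the index is no longer in sequence.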
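If you would rather have the index run from 0 again after sorting, one option is reset_index(). The sketch below uses a few Year and Host values copied from the table above, so it runs without the Excel file; the variable names are only illustrative.

```python
import pandas

# A few rows copied from the World Cup table, so no Excel file is needed.
hosts = pandas.DataFrame({
    'Year': [1975, 1987, 1992, 2003],
    'Host': ['England', 'India', 'Australia, New Zealand', 'South Africa'],
})

sorted_hosts = hosts.sort_values('Host')
# The original index labels travel with their rows.
print(sorted_hosts.index.tolist())

# drop=True discards the old labels instead of keeping them as a new column.
renumbered = sorted_hosts.reset_index(drop=True)
print(renumbered)
```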

You can also sort by multiple columns at once by passing a list of column names to the by parameter.
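As a rough sketch of a multi-column sort, the example below orders rows by Host first and then, within each host, by Year from newest to oldest. The rows are a small subset copied from the table above; the choice of columns and sort directions is just an illustration.

```python
import pandas

# A small subset of the World Cup table, so the example runs without the Excel file.
cups = pandas.DataFrame({
    'Year': [1975, 1979, 1987, 1992, 2015],
    'Host': ['England', 'England', 'India',
             'Australia, New Zealand', 'Australia, New Zealand'],
    'Winner': ['WI', 'WI', 'Aus', 'Pak', 'Aus'],
})

# Host ascending, then Year descending within each host.
# The ascending list has one entry per column named in by.
by_host_then_year = cups.sort_values(by=['Host', 'Year'], ascending=[True, False])
print(by_host_then_year)
```

The first column in the list is the primary sort key; later columns only break ties among rows that share the same earlier values.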
