DataFrame.stack() in pandas

In this tutorial, we will learn how to use the stack() method on a data frame in pandas. The method reshapes a data frame by moving one or more column levels into the row index, where the column labels become the innermost level of a MultiIndex. If the columns have a single level, the result is a Series; if they have several levels, the result is a data frame that keeps the remaining column levels. The new index levels are sorted in the final result. So, let’s begin the tutorial.

Arguments of DataFrame.stack()

This method has the following arguments:

level: The level (or levels) of the column index to move into the row index. It accepts an integer position, a level name, or a list of these. The default value is -1, which selects the innermost column level.

dropna: It takes a boolean value and is True by default. When it is True, rows of the stacked result that contain only missing values (NaN) are dropped.
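When the column levels are named, the level argument can refer to a level by name instead of by position. The sketch below is illustrative only: the level names "group" and "field" and the sample values are assumptions made for this example and do not appear elsewhere in the tutorial.

```python
import pandas as pd

# Hypothetical level names "group" and "field" (illustrative only).
cols = pd.MultiIndex.from_tuples(
    [("x", "s"), ("x", "t")], names=["group", "field"]
)
df = pd.DataFrame([[1, 2], [3, 4]], columns=cols)

# Select the level to stack by name ...
by_name = df.stack(level="field")
# ... or by position; -1 is the innermost column level, which is "field" here.
by_position = df.stack(level=-1)

print(by_name)
print(by_name.equals(by_position))
```

Both calls move the same column level into the index, so the two results should compare equal.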

Example 1

Create a data frame and use the stack method without any arguments.

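import pandas as pd

# Build a data frame with two single-level columns, x and y
data = {'x': [100, 99, 98, 97], 'y': [50, 49, 48, 47]}
d = pd.DataFrame(data)
print(d)
# Move the column labels into the inner level of the index
print(d.stack())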

OUTPUT:

     x   y
0  100  50
1   99  49
2   98  48
3   97  47

This is the original data frame.

0  x    100
   y     50
1  x     99
   y     49
2  x     98
   y     48
3  x     97
   y     47
dtype: int64

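Here, we see that the data is stacked: the column labels x and y have become the inner level of the index, and the result is a Series with one value for each (row, column) pair.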
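To make the change of shape concrete, the following sketch reuses the data from Example 1, compares the shapes before and after stacking, and reverses the operation with unstack(). The variable names are chosen for this illustration only.

```python
import pandas as pd

df = pd.DataFrame({"x": [100, 99, 98, 97], "y": [50, 49, 48, 47]})

stacked = df.stack()

print(df.shape)       # (4, 2): four rows, two columns
print(stacked.shape)  # (8,): one value per (row, column) pair

# A single value is addressed by its (row label, column label) pair.
print(stacked.loc[(1, "y")])  # 49

# unstack() moves the inner index level back into the columns.
restored = stacked.unstack()
print(restored.equals(df))
```

Since this data has no missing values, unstacking the stacked Series gives back the original data frame.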

Example 2

Create a data frame with multi-level columns and use the level argument.

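import pandas as pd

# Two-level columns: outer level 'x', inner level 's' and 't'
m = pd.MultiIndex.from_tuples([('x', 's'), ('x', 't')])
n = pd.DataFrame([[1, 2], [3, 4]], columns=m, index=['0', '1'])
print(n)
print(n.stack())   # default level=-1 stacks the inner level (s, t)
print(n.stack(0))  # level 0 stacks the outer level (x)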

OUTPUT:

   x
   s  t
0  1  2
1  3  4

This is the data frame.

     x
0 s  1
  t  2
1 s  3
  t  4

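This is the data frame after using the stack() method without any arguments. The default level=-1 stacks the innermost column level, so s and t move into the index and x remains as the only column.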

     s  t
0 x  1  2
1 x  3  4

Here, level 0 of the columns is stacked: the outer label x moves into the index, and s and t remain as columns.
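The level argument also accepts a list of levels. As a minimal sketch, assuming the same data as Example 2, stacking both column levels at once leaves no columns, so each value is indexed by row label, outer label, and inner label.

```python
import pandas as pd

cols = pd.MultiIndex.from_tuples([("x", "s"), ("x", "t")])
df = pd.DataFrame([[1, 2], [3, 4]], columns=cols, index=["0", "1"])

# Stack both column levels in a single call.
stacked_all = df.stack(level=[0, 1])

print(stacked_all)
print(stacked_all.index.nlevels)  # 3: row label, outer label, inner label
```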

Example 3

Create a data frame and use the dropna argument.

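import pandas as pd

# None is stored as NaN, so two cells of the data frame are missing
r = pd.MultiIndex.from_tuples([('x', 's'), ('x', 't')])
t = pd.DataFrame([[None, 2], [3, None]], columns=r, index=['0', '1'])
print(t)
print(t.stack())              # dropna=True by default
print(t.stack(dropna=False))  # keep the rows that hold NaN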

OUTPUT:

     x
     s    t
0  NaN  2.0
1  3.0  NaN

This is the data frame.

       x
0 t  2.0
1 s  3.0

This is the data frame after using the stack method. By default, the NaN values are dropped from the final result, because dropna is True when no value is provided for it.

       x
0 s  NaN
  t  2.0
1 s  3.0
  t  NaN

If we want to stack the data frame and keep the NaN values in the final result, we have to use dropna=False.
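The handling of missing values in stack() has been revised in recent pandas releases, so the dropna argument may behave differently, or may not be accepted, depending on the installed version. As a version-independent sketch, assuming the same data as Example 3, the rows with missing values can be removed explicitly by calling dropna() on the stacked result.

```python
import pandas as pd

cols = pd.MultiIndex.from_tuples([("x", "s"), ("x", "t")])
df = pd.DataFrame([[None, 2], [3, None]], columns=cols, index=["0", "1"])

# Stack first, then drop the rows that contain missing values.
# This does not rely on the dropna argument of stack().
cleaned = df.stack().dropna()

print(cleaned)
```

Only the two entries that hold a value, ('0', 't') and ('1', 's'), should remain.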