Openpyxl – Iterate through all Rows in a Specific Column – Python

Hey, fellow Python coder! In this tutorial, we’ll learn step by step how to iterate through all the rows in a specific column of an Excel sheet using the openpyxl library in Python.

Let’s get started!

If you haven’t installed openpyxl yet, you can check: How to install openpyxl in Python on any OS

Step 1 – Data Creation using Openpyxl 

To keep the tutorial fun, we will create a small dataset, a Comic Books Database, and work with the same data in the later sections. To add rows of data, we define a simple helper function, addNewRow, that appends one row to the sheet. It should be easy to follow if you are familiar with the openpyxl module. Have a look at the code snippet below.

import openpyxl

def addNewRow(sheet, title, price, publisher, genre, releaseYear):
    # Append one comic as a new row at the bottom of the sheet
    newRow = [title, price, publisher, genre, releaseYear]
    sheet.append(newRow)

workbook = openpyxl.Workbook()
sheet = workbook.active

# The first row acts as the header row
addNewRow(sheet, 'Comic Title', 'Comic Price', 'Comic Publisher', 'Comic Genre', 'Comic Release Year')
addNewRow(sheet, 'Black Panther', '$3.99', 'Marvel', 'Superhero', 2019)
addNewRow(sheet, 'The Walking Dead', '$2.50', 'Image Comics', 'Horror', 2003)
addNewRow(sheet, 'Wonder Woman', '$2.99', 'DC Comics', 'Superhero', 2021)
addNewRow(sheet, 'Watchmen', '$2.50', 'DC Comics', 'Mystery', 1986)

# Save once, after all rows have been added
workbook.save('ComicBookDatabase.xlsx')

The resulting database consists of a header row followed by four comic book entries.
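To check the sheet contents without opening Excel, the rows can be printed straight from Python. The following is a minimal sketch that rebuilds the same data in an in-memory workbook (nothing is written to disk); the variable name rows is introduced here only for illustration.

```python
import openpyxl

# Rebuild the tutorial's data in memory; no file is saved in this sketch.
workbook = openpyxl.Workbook()
sheet = workbook.active
sheet.append(['Comic Title', 'Comic Price', 'Comic Publisher', 'Comic Genre', 'Comic Release Year'])
sheet.append(['Black Panther', '$3.99', 'Marvel', 'Superhero', 2019])
sheet.append(['The Walking Dead', '$2.50', 'Image Comics', 'Horror', 2003])
sheet.append(['Wonder Woman', '$2.99', 'DC Comics', 'Superhero', 2021])
sheet.append(['Watchmen', '$2.50', 'DC Comics', 'Mystery', 1986])

# values_only=True yields plain tuples of cell values instead of Cell objects.
rows = list(sheet.iter_rows(values_only=True))
for row in rows:
    print(row)
```

Each printed tuple corresponds to one row of the sheet, with the header row first.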

Step 2 – Iterate All Rows in a Specific Column Based on Column Indexing

In this section, we iterate over the rows of a column by referring to the column by its letter. Excel labels columns with letters (A is the first column, B the second, and so on), and sheet['A'] returns all the cells of column A as a tuple. We then loop over those cells, slicing with [1:] to skip the header in the first row. Have a look at the code snippet below.

workbook = openpyxl.load_workbook('ComicBookDatabase.xlsx')
sheet = workbook.active

targetColumn = 'A'
for cell in sheet[targetColumn][1:]:
    print(cell.value)

The output of the code is as follows. As you can see, we get every value under the Comic Title column, one per line.

Black Panther
The Walking Dead
Wonder Woman
Watchmen
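Because the loop works with Cell objects rather than bare values, each cell also knows where it sits in the sheet. As a small sketch, assuming the same column-letter approach, the coordinate attribute can be collected alongside the value; the list name located is hypothetical, and the data is rebuilt in memory so the example runs on its own.

```python
import openpyxl

# Recreate only the title column in memory so the sketch is self-contained.
workbook = openpyxl.Workbook()
sheet = workbook.active
for title in ['Comic Title', 'Black Panther', 'The Walking Dead', 'Wonder Woman', 'Watchmen']:
    sheet.append([title])

targetColumn = 'A'
# sheet['A'] returns a tuple of Cell objects; [1:] skips the header in row 1.
located = [(cell.coordinate, cell.value) for cell in sheet[targetColumn][1:]]
for coordinate, value in located:
    print(coordinate, value)
```

This can help when a value needs to be traced back to its exact cell, for example A2 for the first comic.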

Step 3 – Iterate All Rows in a Specific Column Based on Column Headers

What if we wish to access a column based on the header row (if there is one in the table)? First of all, we will create a dictionary that maps each column's 1-based index to its header name. Have a look at the code below:

header = sheet[1]
columnDict = {indx + 1: cell.value for indx, cell in enumerate(header)}
print(columnDict)

The code produces the dictionary shown below.

{1: 'Comic Title', 2: 'Comic Price', 3: 'Comic Publisher', 4: 'Comic Genre', 5: 'Comic Release Year'}
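When the goal is to look a column up by its name, the reverse mapping (header name to index) can be more convenient. The following is a minimal sketch of that idea, assuming every header is unique (a repeated header would keep only the last column with that name); headerToIndex is a name introduced here for illustration.

```python
import openpyxl

# Recreate only the header row in memory so the sketch is self-contained.
workbook = openpyxl.Workbook()
sheet = workbook.active
sheet.append(['Comic Title', 'Comic Price', 'Comic Publisher', 'Comic Genre', 'Comic Release Year'])

# Map each header name to its 1-based column number.
headerToIndex = {cell.value: indx for indx, cell in enumerate(sheet[1], start=1)}
print(headerToIndex['Comic Genre'])
```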

With the headers known, we can target a column directly by its name. The snippet below uses enumerate to find the index of the column whose header matches targetCol, then passes that index to iter_cols as both min_col and max_col so that only this one column is returned. Setting min_row=2 skips the header row.

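targetCol = 'Comic Title'
# Find the 1-based index of the column whose header matches targetCol
targetColIndx = [indx for indx, cell in enumerate(sheet[1], start=1) if cell.value == targetCol][0]
# iter_cols yields one tuple of values per column; only one column is selected here
for columnValues in sheet.iter_cols(min_col=targetColIndx, max_col=targetColIndx, min_row=2, values_only=True):
    print(columnValues)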

The output of the code is as follows. Because iter_cols with values_only=True yields one tuple per column, all the values under the Comic Title column come back as a single tuple.

('Black Panther', 'The Walking Dead', 'Wonder Woman', 'Watchmen')
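To visit the values one row at a time instead of receiving a single tuple, iter_rows can be restricted to the same column. The following is a sketch under that assumption; iterColumnByHeader is a hypothetical helper written for this example, not part of openpyxl, and it runs on a reduced in-memory copy of the data.

```python
import openpyxl

def iterColumnByHeader(sheet, headerName):
    """Yield each value below the given header (hypothetical helper)."""
    headers = [cell.value for cell in sheet[1]]
    if headerName not in headers:
        raise KeyError(f'No column with header {headerName!r}')
    colIndx = headers.index(headerName) + 1  # openpyxl columns are 1-based
    # With min_col == max_col, iter_rows yields a 1-tuple for every row.
    for (value,) in sheet.iter_rows(min_row=2, min_col=colIndx,
                                    max_col=colIndx, values_only=True):
        yield value

# Small in-memory sample so the sketch is self-contained.
workbook = openpyxl.Workbook()
sheet = workbook.active
sheet.append(['Comic Title', 'Comic Genre'])
sheet.append(['Black Panther', 'Superhero'])
sheet.append(['Watchmen', 'Mystery'])

genres = list(iterColumnByHeader(sheet, 'Comic Genre'))
print(genres)
```

Raising KeyError for an unknown header is a design choice made for this sketch; the original list lookup with [0] would fail with an IndexError instead.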

Thank you for reading!

Also Read:

  1. Write to an Excel file using the openpyxl module in Python
  2. How to read cell value in openpyxl in Python
  3. How to Lock Cells in OpenPyXL Python
  4. How to Change the Number Format in OpenPyXL – Python

Happy Learning!
