How to iterate over files in a given directory in Python

In this tutorial, I’ll share some techniques to iterate over files in a given directory in Python and perform some actions on them. There are several ways to do this; let me discuss some of them:

Using os.scandir() function

Since Python 3.5, the os module has included a function called scandir(). It returns an iterator of os.DirEntry objects, one for each file or directory immediately under the given directory, so we can easily scan that directory's contents. It doesn’t descend into subdirectories, so nothing is listed recursively.

Let us take an example to understand the concept:

Suppose I want to list the .exe and .pdf files from a specific directory in Python.

#importing os module
import os

#providing the path of the directory
#r = raw string literal
dirloc = r"C:\Users\sourav\Downloads" 

#calling scandir() function
for file in os.scandir(dirloc):
    #endswith() accepts a tuple of suffixes; test the file name rather than the whole path
    if file.is_file() and file.name.endswith((".exe", ".pdf")):
        print(file.path)

It’ll print the path of the .exe and .pdf files present immediately in the given directory.
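The same scan can be wrapped in a small reusable function. The following is only a minimal sketch: the function name list_files_with_suffixes and its parameters are hypothetical, and the case-insensitive match is an assumption added here, not part of the example above. It uses os.scandir() as a context manager (supported since Python 3.6) so the underlying iterator is closed promptly.

```python
import os

def list_files_with_suffixes(directory, suffixes):
    """Return sorted paths of regular files directly inside `directory`
    whose names end with one of `suffixes` (a tuple of lowercase strings).

    Hypothetical helper for illustration; the comparison ignores case.
    """
    matches = []
    # The with-statement closes the scandir iterator when the block ends.
    with os.scandir(directory) as entries:
        for entry in entries:
            # is_file() skips directories; endswith() accepts a tuple.
            if entry.is_file() and entry.name.lower().endswith(suffixes):
                matches.append(entry.path)
    return sorted(matches)
```

For instance, list_files_with_suffixes(dirloc, (".exe", ".pdf")) would return the matching paths as a list instead of printing them.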

Using os.listdir() function

os.listdir() returns a list of the names of all entries (files as well as directories) immediately present in a given directory. Like os.scandir(), it doesn’t work recursively.

Let us take one example to understand the concept:

Suppose I want to list the .iso and .png files from a specific directory.

#importing os module
import os

#providing the path of the directory
#r = raw string literal
dirloc = r"C:\Users\sourav\Downloads"

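#calling listdir() function
for file in os.listdir(dirloc):
    #listdir() returns bare names, so join each one with the directory path
    full_path = os.path.join(dirloc, file)
    #os.path.isfile() skips directories whose names happen to match
    if file.endswith((".iso", ".png")) and os.path.isfile(full_path):
        print(full_path)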

It’ll also print the path of the .iso and .png files present immediately in the given directory.
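Because os.listdir() returns bare names in arbitrary order, it can help to sort them and build the full paths in one place. Here is a rough sketch of that idea; the function name list_matching_files and its parameters are made up for illustration.

```python
import os

def list_matching_files(directory, suffixes):
    """Return full paths of files directly inside `directory` whose names
    end with one of `suffixes` (a tuple of strings), in sorted name order.

    Hypothetical helper for illustration.
    """
    paths = []
    # sorted() gives a stable order; listdir() itself promises none.
    for name in sorted(os.listdir(directory)):
        full_path = os.path.join(directory, name)
        # isfile() filters out directories, which listdir() also returns.
        if name.endswith(suffixes) and os.path.isfile(full_path):
            paths.append(full_path)
    return paths
```

Calling list_matching_files(dirloc, (".iso", ".png")) would give the same files as the loop above, as a sorted list.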

To list the files and folders recursively in a given directory, use one of the methods below.

Using os.walk() function

This function is also included in the os module. It iterates over the files immediately under a given directory as well as all the descendant files in its subdirectories.

Let us take an example to understand the concept:

Suppose I want to list the .mp3 and .jpg files from a specific directory.

#importing os module
import os

#calling os.walk() function
#r = raw string literal
#os.walk() yields (current directory, its subdirectory names, its file names)
for dirpath, dirnames, filenames in os.walk(r'C:\Users\sourav\Downloads'):
    for file_name in filenames:
        #os.path.join() inserts the correct path separator
        file_loc = os.path.join(dirpath, file_name)

        #printing .mp3 and .jpg files recursively
        if file_loc.endswith((".mp3", ".jpg")):
            print(file_loc)

It’ll print the paths of the .mp3 and .jpg files present in the given directory and all of its subdirectories.
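When walking a large tree, it is sometimes useful to skip certain subdirectories entirely. As a sketch only (the generator name find_files and the skip_dirs parameter are hypothetical), os.walk() in its default top-down mode lets us edit the list of subdirectory names in place to stop it from descending into them:

```python
import os

def find_files(root, suffixes, skip_dirs=()):
    """Yield paths under `root` (recursively) whose names end with one of
    `suffixes`, without descending into directories named in `skip_dirs`.

    Hypothetical helper for illustration.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        # Modifying dirnames in place prunes the walk (top-down mode only).
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for file_name in filenames:
            if file_name.endswith(suffixes):
                yield os.path.join(dirpath, file_name)
```

For example, find_files(r'C:\Users\sourav\Downloads', (".mp3", ".jpg"), skip_dirs=("backup",)) would yield the matching files while ignoring any folder named backup (a made-up folder name).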

Using glob.iglob() function

The glob module provides the iglob() function. We can use glob.iglob() to print the files immediately under a given directory as well as, optionally, the files in its subdirectories.

Let us take one example to understand the concept:

Suppose I want to list the .zip and .exe files immediately under a specific directory.

#importing glob module
import glob

#printing zip files present in the directory
#r = raw string literal
for fileloc in glob.iglob(r'C:\Users\sourav\Downloads\*.zip'):
    print(fileloc)

#printing exe files present in the directory
#r = raw string literal
for fileloc in glob.iglob(r'C:\Users\sourav\Downloads\*.exe'):
    print(fileloc)

#Note: this prints only the files directly in the directory, not those in its subdirectories

As noted in the code, it’ll print only the files immediately under the directory, not those in subdirectories. The glob module supports the “**” pattern for recursive matching, but to use it we have to pass the recursive=True parameter.

Let us take one more example to understand this concept:

Suppose I want to list all the .zip and .exe files recursively from a specific directory.

#importing glob module
import glob

#printing zip files present in the directory
#r = raw string literal
#we have to use the recursive=True parameter for recursive iteration
#with recursive=True, "**" in the pattern matches the directory itself and all of its subdirectories
for fileloc in glob.iglob(r'C:\Users\sourav\Downloads\**\*.zip', recursive=True):
    print(fileloc)

#printing exe files present in the directory
#r = raw string literal
for fileloc in glob.iglob(r'C:\Users\sourav\Downloads\**\*.exe', recursive=True):
    print(fileloc)
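The two loops in each example differ only in the extension, so they can be folded into one helper. This is a minimal sketch under a few assumptions: the name iter_by_extensions and its parameters are hypothetical, and glob.escape() is applied to the directory so that characters such as "[" in its name are treated literally.

```python
import glob
import os

def iter_by_extensions(root, extensions, recursive=False):
    """Yield paths under `root` matching any of `extensions` (e.g. ".zip").

    With recursive=True the "**" pattern is used, which also matches files
    directly inside `root`. Hypothetical helper for illustration.
    """
    safe_root = glob.escape(root)  # treat the directory name literally
    for ext in extensions:
        if recursive:
            pattern = os.path.join(safe_root, "**", "*" + ext)
        else:
            pattern = os.path.join(safe_root, "*" + ext)
        # iglob() returns an iterator, so results are produced lazily.
        yield from glob.iglob(pattern, recursive=recursive)
```

For instance, iterating over iter_by_extensions(r'C:\Users\sourav\Downloads', (".zip", ".exe"), recursive=True) would cover both recursive loops above in a single pass per extension.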

Using Path class from pathlib module

By using the Path class from the pathlib module, we can also iterate over files recursively under a specified directory and list them.

Let us take an example to understand the concept:

Suppose I want to list all the .exe files recursively from a specific directory.

#importing Path class from pathlib module
from pathlib import Path

#providing the path of the directory
#r = raw string literal
locations = Path(r'C:\Users\sourav\Downloads').glob('**/*.exe')

for loc in locations:
    #loc is a Path object, not a string
    location_in_string = str(loc)
    print(location_in_string)

It’ll print the path of the .exe files present in the given directory recursively.
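Path objects also expose rglob(), which behaves like glob() with "**/" prepended to the pattern, and a suffix attribute holding the file extension. The sketch below combines the two to match several extensions at once; the function name find_by_suffix and the case-insensitive comparison are assumptions made for illustration.

```python
from pathlib import Path

def find_by_suffix(root, suffixes):
    """Yield Path objects for files under `root` (recursively) whose
    extension is in `suffixes`, compared case-insensitively.

    Hypothetical helper for illustration.
    """
    wanted = {s.lower() for s in suffixes}
    # rglob("*") visits every entry below root, including subdirectories.
    for path in Path(root).rglob("*"):
        # path.suffix is the last extension, e.g. ".exe".
        if path.is_file() and path.suffix.lower() in wanted:
            yield path
```

For example, looping over find_by_suffix(r'C:\Users\sourav\Downloads', {".exe", ".zip"}) would yield both kinds of files in one traversal.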

I hope now you’re familiar with the concept of how to iterate over files in a given directory in Python.
