Create DataFrame from nested JSON in Python

Hello friends, today we are going to create a DataFrame from nested JSON in Python. We will use the pandas library to build the DataFrame and Python’s built-in json module to read the JSON file.

Create a DataFrame from nested JSON

You need the pandas library installed on your local system to perform this task. Open Command Prompt on Windows or Terminal on macOS/Linux and run the following command to install it.

pip install pandas

Import the library in your code.

import pandas as pd

Now I’ll explain how to create a DataFrame from nested JSON with a basic example.

Sample JSON file :

{
   "Courses": [
      {
         "courseID": "BCV201",
         "course": {
            "name": "Introduction to Computer Vision",
            "faculty": "Salinger"
         }
      },
      {
         "courseID": "BCE191",
         "course": {
            "name": "Basic Electronics",
            "faculty": "Peach N."
         }
      },
      {
         "courseID": "BLM311",
         "course": {
            "name": "Law Management",
            "faculty": "Suisse Layford"
         }
      }
   ]
}
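
To make the nesting concrete, here is a minimal sketch that parses a shortened, one-course copy of the sample with the standard json module and reads a nested field. The sample_text variable is only an illustration and is not part of the file-based example in this post.

```python
import json

# A shortened copy of the sample file, kept inline for illustration only.
sample_text = """
{
   "Courses": [
      {
         "courseID": "BCV201",
         "course": {
            "name": "Introduction to Computer Vision",
            "faculty": "Salinger"
         }
      }
   ]
}
"""

data = json.loads(sample_text)

# The top level is a dict with one key, "Courses", holding a list of records.
first = data["Courses"][0]

# Each record keeps the course details one level deeper, under "course".
print(first["courseID"])
print(first["course"]["name"])
```

The course name and faculty sit inside the inner "course" object, which is why they need to be flattened before they can become ordinary DataFrame columns.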

I open the JSON file with Python’s built-in open() function, passing the file’s name, courses.json, as an argument. The json.load() function in the code below reads the JSON from the file object rd and returns the equivalent Python object, which here is a dictionary. I then pass that object to json_normalize() together with record_path, which names the key that holds the list of records (Courses). Each record becomes one row, and the nested course fields are flattened into columns named course.name and course.faculty. Finally, I use the rename() function to give the columns readable names and print the DataFrame.

Code :

import pandas as pd
import json

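# Read the JSON file; the with block closes it automatically.
with open("courses.json") as rd:
    data = json.load(rd)

# Each item in the "Courses" list becomes one row of the DataFrame.
df = pd.json_normalize(data, record_path=['Courses'])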


# Rename the flattened columns; the keys must match the JSON field names.
df = df.rename(columns={
    'courseID': 'Course ID',
    'course.name': 'Course Name',
    'course.faculty': "Faculty's Name"
})

print(df)

Output :

  Course ID                      Course Name  Faculty's Name
0    BCV201  Introduction to Computer Vision        Salinger
1    BCE191                Basic Electronics        Peach N.
2    BLM311                   Law Management  Suisse Layford
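
If you want to try this without creating courses.json, the following sketch builds the same data inline as a Python dictionary and prints the column names that json_normalize() produces before any renaming. The inline dictionary is an assumption made for illustration, since the example above reads the data from a file.

```python
import pandas as pd

# Inline stand-in for the parsed contents of courses.json (illustrative only).
data = {
    "Courses": [
        {"courseID": "BCV201",
         "course": {"name": "Introduction to Computer Vision",
                    "faculty": "Salinger"}},
        {"courseID": "BCE191",
         "course": {"name": "Basic Electronics",
                    "faculty": "Peach N."}},
        {"courseID": "BLM311",
         "course": {"name": "Law Management",
                    "faculty": "Suisse Layford"}},
    ]
}

# One row per item in "Courses"; nested keys are joined with a dot.
df = pd.json_normalize(data, record_path=["Courses"])

print(list(df.columns))
# ['courseID', 'course.name', 'course.faculty']
```

These dot-separated names are the keys that rename() has to match. A key that does not exist in the DataFrame is silently ignored by default, so a misspelled name leaves that column unchanged instead of raising an error.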

Thus you can now create a DataFrame from nested JSON in Python.
