Create DataFrame from nested JSON in Python
Hello friends, today we are going to create a DataFrame from nested JSON in Python. We will use the pandas library to build the DataFrame and Python's built-in json module to read the JSON file.
Create a DataFrame from nested JSON
We must install the pandas library on the local system to perform the task. Open Command Prompt on Windows or Terminal on macOS/Linux and run the following command to install the dependency.
pip install pandas
Import the library into your code.
import pandas as pd
Now I’ll explain how to create a DataFrame from nested JSON with a basic example.
Sample JSON file:
{
  "Courses": [
    {
      "courseID": "BCV201",
      "course": {
        "name": "Introduction to Computer Vision",
        "faculty": "Salinger"
      }
    },
    {
      "courseID": "BCE191",
      "course": {
        "name": "Basic Electronics",
        "faculty": "Peach N."
      }
    },
    {
      "courseID": "BLM311",
      "course": {
        "name": "Law Management",
        "faculty": "Suisse Layford"
      }
    }
  ]
}
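If you want to follow along, the sample has to exist on disk first. Here is a minimal sketch that writes it out, assuming the file should be called courses.json and sit in the current working directory; the variable names sample and f are only illustrative.

```python
import json

# The sample data shown above, written as a Python dictionary
sample = {
    "Courses": [
        {"courseID": "BCV201",
         "course": {"name": "Introduction to Computer Vision", "faculty": "Salinger"}},
        {"courseID": "BCE191",
         "course": {"name": "Basic Electronics", "faculty": "Peach N."}},
        {"courseID": "BLM311",
         "course": {"name": "Law Management", "faculty": "Suisse Layford"}},
    ]
}

# Save it as courses.json in the current working directory (assumed location)
with open("courses.json", "w") as f:
    json.dump(sample, f, indent=2)
```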
I opened my JSON file using Python’s built-in open() function, passing the file’s name, courses.json, as an argument. The json.load() function in the code given below reads the JSON from the file object rd and returns the equivalent Python object, which here is a dictionary. Then I passed that object to the json_normalize() function together with record_path, which names the key that holds the list of records, in this case Courses. Each item in that list becomes one row, and nested fields are flattened into dot-separated column names such as course.name and course.faculty. Finally, I used the rename() function to give the columns readable names and printed the data frame.
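To see what record_path actually changes, here is a small sketch that uses inline data instead of the file. The data is the sample trimmed to a single course, and the names packed and flat are just illustrative.

```python
import pandas as pd

# Same shape as the sample file, trimmed to one course for brevity
data = {
    "Courses": [
        {"courseID": "BCV201",
         "course": {"name": "Introduction to Computer Vision", "faculty": "Salinger"}},
    ]
}

# Without record_path, the whole list stays packed in a single "Courses" cell
packed = pd.json_normalize(data)
print(packed.columns.tolist())

# With record_path, each list item becomes a row and nested keys get dotted names
flat = pd.json_normalize(data, record_path=["Courses"])
print(flat.columns.tolist())
```

The second list should contain courseID, course.name and course.faculty. These are the exact names that rename() needs as keys, so printing them first is a quick way to catch a misspelled column.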
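Code:
import pandas as pd
import json

# Read the JSON file into a Python dictionary
with open("courses.json") as rd:
    data = json.load(rd)

# Each item in the "Courses" list becomes one row
df = pd.json_normalize(data, record_path=['Courses'])

# Rename the flattened, dot-separated columns
df = df.rename(columns={
    'courseID': 'Course ID',
    'course.name': 'Course Name',
    'course.faculty': "Faculty's Name"
})
print(df)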
Output:
  Course ID                      Course Name  Faculty's Name
0    BCV201  Introduction to Computer Vision        Salinger
1    BCE191                Basic Electronics        Peach N.
2    BLM311                   Law Management  Suisse Layford
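The dots in the flattened column names come from the sep parameter of json_normalize(), which defaults to ".". As an optional variation that is not part of the example above, you can pass a different separator. This is only a sketch with inline data, and the underscore is an arbitrary choice.

```python
import pandas as pd

data = {
    "Courses": [
        {"courseID": "BLM311",
         "course": {"name": "Law Management", "faculty": "Suisse Layford"}},
    ]
}

# sep controls how nested key names are joined; "_" is an arbitrary choice here
df = pd.json_normalize(data, record_path=["Courses"], sep="_")
print(df.columns.tolist())
```

With an underscore the nested columns come out as course_name and course_faculty, which can be handy if you want to access them as attributes (for example df.course_name) without renaming first.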
Thus you can now create a DataFrame from nested JSON in Python.