How to serialize Python Dictionary to XML

Serialization is the process of translating a data structure or an object's state into a format that can be stored in a database or transmitted over a network and later reconstructed, possibly in a different environment.

While JSON is now widely used as a serialization format, XML has advantages of its own and was the popular serialization format before JSON. This article explains how to serialize a Python dictionary to XML, with example code.

The dicttoxml module in Python

A module called dicttoxml can be used to convert a Python dictionary into a valid XML string. This module can be installed from PyPI. Installation is simple if pip is already available.

pip install dicttoxml

This module has a function called dicttoxml that can convert a dictionary into a valid XML string.

The function dicttoxml

The function dicttoxml has the signature

dicttoxml(obj, root=True, custom_root='root', ids=False, attr_type=True, item_func=default_item_func, cdata=False)

where the required argument obj is the object to be converted to an XML string. The remaining arguments are optional:

  1. custom_root is the name of the root tag in the XML.
  2. item_func is a function that receives the name of the parent element and returns the name used to wrap each item of a list.
  3. attr_type defines whether or not each element gets a type attribute that describes the type of its value.
  4. cdata defines whether or not to wrap the values in CDATA sections.
  5. ids defines whether or not to give every element a unique id attribute.
  6. root defines whether or not the elements are wrapped in a root element. Setting it to False can be helpful if the XML string is to be embedded in another XML document.
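To make the role of these arguments concrete, here is a small stand-in written with the standard library's xml.etree.ElementTree. It is not dicttoxml's implementation, and the names to_xml and _fill are hypothetical. It only mimics how custom_root, item_func, attr_type and root shape the output, and it leaves out ids and cdata.

```python
import xml.etree.ElementTree as ET


def _fill(elem, value, item_func, attr_type):
    """Write value into elem, recursing into dicts and lists (hypothetical helper)."""
    if isinstance(value, dict):
        # Dictionary keys become tag names.
        children = [(str(key), val) for key, val in value.items()]
    elif isinstance(value, list):
        # item_func receives the parent tag name and returns the tag for each item.
        children = [(item_func(elem.tag), val) for val in value]
    else:
        elem.text = str(value)
        return
    for tag, child_value in children:
        child = ET.SubElement(elem, tag)
        if attr_type:
            # Record the Python type name, e.g. type="str" or type="int".
            child.set('type', type(child_value).__name__)
        _fill(child, child_value, item_func, attr_type)


def to_xml(obj, root=True, custom_root='root', attr_type=True,
           item_func=lambda parent: 'item'):
    """Simplified stand-in for the conversion; an assumption, not the library code."""
    top = ET.Element(custom_root)
    _fill(top, obj, item_func, attr_type)
    if root:
        return ET.tostring(top, encoding='unicode')
    # Without a root element, return only the serialized children.
    return ''.join(ET.tostring(child, encoding='unicode') for child in top)


with_root = to_xml({'name': 'Nina'})
without_root = to_xml({'name': 'Nina'}, root=False, attr_type=False)
# item_func can derive the item tag from the parent tag, here by dropping the final "s".
nested = to_xml({'students': [{'name': 'Nina'}]}, attr_type=False,
                item_func=lambda parent: parent[:-1])

print(with_root)
print(without_root)
print(nested)
```

In this sketch, the fragment returned with root=False has no enclosing element, so it can be placed inside another document's element.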

An Example

Let the object to be converted to XML be

>>> student = {
...     'name': 'Nina',
...     'grade': '8',
...     'regno': '201750ID01',
... }

Executing the following statements converts the student dictionary into a valid XML string. The function returns a bytes object, so it is decoded before printing.

>>> import dicttoxml
>>> xml = dicttoxml.dicttoxml(student)
>>> print(xml.decode())

This prints the following XML string for the student dictionary

<?xml version="1.0" encoding="UTF-8" ?><root><name type="str">Nina</name><grade type="str">8</grade><regno type="str">201750ID01</regno></root>

But this string is not easy to read. It can be made readable with the standard library module xml.dom.minidom. Its parseString() function parses an XML string into a document object, and that object's toprettyxml() method returns the XML with line breaks and indentation. The two calls can also be chained to get the pretty output in a single expression. Here’s how it can be done

>>> from xml.dom.minidom import parseString
>>> parsedxml = parseString(xml)
>>> print(parsedxml.toprettyxml())

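This prints the following pretty formatted XML string. By default, toprettyxml() starts with an XML declaration and indents each level with a tab

<?xml version="1.0" ?>
<root>
        <name type="str">Nina</name>
        <grade type="str">8</grade>
        <regno type="str">201750ID01</regno>
</root>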
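The parse-and-format step can also be wrapped in a small helper. The sketch below uses only the standard library. The function name prettify is illustrative, and the sample bytes are a shortened copy of the dicttoxml output for the student dictionary.

```python
from xml.dom.minidom import parseString


def prettify(xml_bytes, indent="\t"):
    """Parse XML given as str or bytes and return an indented str (hypothetical helper)."""
    return parseString(xml_bytes).toprettyxml(indent=indent)


# Sample input that mirrors the unformatted output for the student dictionary.
raw = (b'<?xml version="1.0" encoding="UTF-8" ?>'
       b'<root><name type="str">Nina</name><grade type="str">8</grade></root>')

# Use two spaces per nesting level instead of the default tab.
pretty = prettify(raw, indent="  ")
print(pretty)
```

Passing a different indent string is the only change needed to switch between tabs and spaces.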

Now, let the object be a list of dictionaries

students = [
    {
        'name': 'Nina',
        'grade': '8',
        'regno': '2020ID01'
    },
    {
        'name': 'Radha',
        'grade': 8,
        'regno': '2020ID02'
    },
    {
        'name': 'Suraj',
        'grade': 8,
        'regno': '2020ID03'
    }
]

The following Python script will print the pretty XML string.

from dicttoxml import dicttoxml
from xml.dom.minidom import parseString
students = [
    {
        'name': 'Nina',
        'grade': '8',
        'regno': '2020ID01'
    },
    {
        'name': 'Radha',
        'grade': 8,
        'regno': '2020ID02'
    },
    {
        'name': 'Suraj',
        'grade': 8,
        'regno': '2020ID03'
    }
]
studentsxml = dicttoxml(students, custom_root='students', attr_type=False, item_func=lambda _: 'student')
print(parseString(studentsxml).toprettyxml("    "))

It can be seen that

  • The value of the argument item_func is a lambda function that returns the string ‘student’. Since students is a list, this string is used as the tag that wraps each item in the list.
  • The argument custom_root is ‘students’. So the root element will have the name ‘students’.
  • attr_type is set to False. So the elements will not have a type attribute.

This produces the expected output

<?xml version="1.0" ?>
<students>
    <student>
        <name>Nina</name>
        <grade>8</grade>
        <regno>2020ID01</regno>
    </student>
    <student>
        <name>Radha</name>
        <grade>8</grade>
        <regno>2020ID02</regno>
    </student>
    <student>
        <name>Suraj</name>
        <grade>8</grade>
        <regno>2020ID03</regno>
    </student>
</students>

Since the argument of toprettyxml() is "    " (4 spaces), the elements are indented with 4 spaces per nesting level.
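Since serialization is meant to allow reconstruction, it can be worth checking that the XML reads back into the original data. The sketch below shows one way to do that with the standard library's xml.etree.ElementTree. The helper name students_from_xml is illustrative, and the XML literal is a shortened copy of the output above. Because attr_type=False drops the type information, every value comes back as a string, including grades that were integers in the original list.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the XML produced above (two students instead of three).
students_xml = """<?xml version="1.0" ?>
<students>
    <student>
        <name>Nina</name>
        <grade>8</grade>
        <regno>2020ID01</regno>
    </student>
    <student>
        <name>Radha</name>
        <grade>8</grade>
        <regno>2020ID02</regno>
    </student>
</students>"""


def students_from_xml(xml_text):
    """Rebuild a list of dictionaries from <students> XML (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # Each <student> element becomes one dict that maps child tag to child text.
    return [{field.tag: field.text for field in student}
            for student in root.findall('student')]


restored = students_from_xml(students_xml)
print(restored)
```

If the original types matter, keeping attr_type=True preserves a type attribute on each element that a reader could use to convert the values back.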

So the Python dictionaries have been successfully serialized into XML.
