Graphs, Automatic Differentiation and Autograd in PyTorch Python

Fellow coders, in this tutorial we are going to learn about automatic differentiation, computation graphs, and autograd in PyTorch. PyTorch is a very popular deep learning library for Python and the first choice of many programmers.

What is Automatic Differentiation:

Automatic differentiation is a core building block of deep learning libraries. PyTorch’s automatic differentiation engine is called autograd. It supports reverse-mode automatic differentiation of scalar functions: starting from a single scalar output such as a loss, it computes the gradient with respect to every tracked input in one backward sweep. Automatic differentiation in PyTorch has three distinguishing features:

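  1. In-place operations: tensors can be modified in place, and autograd detects modifications that would invalidate a gradient
  2. No tape: instead of recording a global tape of operations, each result keeps only the part of the graph that produced it
  3. Core logic in C++: the engine is implemented in C++ to keep overhead low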

Let’s look at an example use of the automatic differentiation module (torch.autograd):

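import torch

# Leaf tensors; requires_grad=True tells autograd to track them
# (the old Variable wrapper is deprecated and no longer needed).
x, prev_h = torch.randn(1, 10, requires_grad=True), torch.randn(1, 20, requires_grad=True)
W_h, W_x = torch.randn(20, 20, requires_grad=True), torch.randn(20, 10, requires_grad=True)

i2h = torch.matmul(W_x, x.t())        # input-to-hidden, shape (20, 1)
h2h = torch.matmul(W_h, prev_h.t())   # hidden-to-hidden, shape (20, 1)

# Reduce to a scalar, then backpropagate to fill the .grad attributes
(i2h + h2h).tanh().sum().backward()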

Training a neural network consists of two phases:

  • a forward pass
  • a backward pass
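
The two phases can be sketched as follows. This is only a minimal illustration, assuming a single linear layer trained with mean squared error and plain gradient descent; the toy data and the names `w`, `b`, `lr`, and `training_step` are made up for the example.

```python
import torch

torch.manual_seed(0)

# Hypothetical toy data: 8 samples, 3 features, 1 target each.
inputs = torch.randn(8, 3)
targets = torch.randn(8, 1)

# Parameters are leaf tensors that autograd should track.
w = torch.randn(3, 1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
lr = 0.05  # illustrative learning rate


def training_step():
    # Forward pass: compute predictions and a scalar loss.
    loss = ((inputs @ w + b - targets) ** 2).mean()
    # Backward pass: fill w.grad and b.grad with d(loss)/d(parameter).
    loss.backward()
    # Gradient-descent update, done outside the graph.
    with torch.no_grad():
        w.sub_(lr * w.grad)
        b.sub_(lr * b.grad)
        w.grad.zero_()
        b.grad.zero_()
    return loss.item()


losses = [training_step() for _ in range(20)]
print(losses[0], losses[-1])
```

The forward pass records the graph, the backward pass walks it in reverse, and the gradients are reset after each update because PyTorch accumulates them across calls to backward().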

What are Computation Graphs:

A computation graph is very similar to an ordinary graph, but its nodes represent tensors and the operations applied to them. Some nodes are created as the result of mathematical operations, whereas others are created directly by the user; the user-created ones are the leaf nodes of the graph. Every node that results from an operation can be considered a function that takes some input and produces an output, and by following these functions backwards from the output we can compute gradients.
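
The difference between leaf nodes and nodes produced by operations can be inspected directly. The sketch below uses the `is_leaf` and `grad_fn` attributes of a tensor; the values in `x` and `w` are arbitrary and chosen only for illustration.

```python
import torch

# Leaf nodes: tensors created directly by the user.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
w = torch.tensor([0.5, -1.0, 2.0], requires_grad=True)

# Non-leaf nodes: results of operations on other nodes.
y = x * w
z = y.sum()

print(x.is_leaf, x.grad_fn)  # a leaf has no grad_fn
print(y.is_leaf, y.grad_fn)  # produced by a multiplication
print(z.is_leaf, z.grad_fn)  # produced by a sum

# Walk the graph backwards from z to the leaves.
z.backward()
print(x.grad)  # dz/dx equals w
print(w.grad)  # dz/dw equals x
```

Each non-leaf tensor carries a `grad_fn` that points to the operation which created it, and this chain of references is the computation graph that backward() traverses.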

Most of us have heard of NumPy arrays. A tensor is similar to a NumPy array, with one key difference: it can take advantage of the parallel computation capability of a GPU.

Below is an example which shows how to create a tensor in PyTorch:

import torch

# torch.tensor builds a tensor from data; torch.Tensor(6.7) raises a TypeError
my_tnsr = torch.tensor(6.7)
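
To make the NumPy comparison concrete, here is a small sketch that converts an array to a tensor and moves it to a GPU when one is present. It assumes NumPy is installed, and it falls back to the CPU if no CUDA device is found; the names `arr`, `t_dev`, and `result` are illustrative.

```python
import numpy as np
import torch

arr = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32)

# from_numpy shares memory with the array while it stays on the CPU.
t = torch.from_numpy(arr)

# Use a GPU only if one is available; otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
t_dev = t.to(device)

# The matrix product runs on whichever device holds the tensor.
result = (t_dev @ t_dev).cpu().numpy()
print(result)
```

The same code runs on both kinds of hardware, which is the practical advantage of tensors over plain arrays.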


PyTorch’s graphs differ from the static graphs used by TensorFlow 1.x because they are built on the fly as the operations execute, which is why they are called dynamic computation graphs.
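
One consequence of building the graph on the fly is that ordinary Python control flow can change its shape from one call to the next. The following is a small sketch of that idea; the function `forward` and its branching rule are invented for illustration and do not come from any particular model.

```python
import torch


def forward(x, n_steps):
    # Plain Python loops and conditions decide which operations run,
    # so each call may record a different graph.
    h = x
    for _ in range(n_steps):
        h = torch.tanh(h) if h.sum() > 0 else h * 2
    return h.sum()


x = torch.tensor([0.5, -0.25, 1.0], requires_grad=True)

out_short = forward(x, 1)  # graph with one step
out_long = forward(x, 3)   # graph with three steps

out_long.backward()        # differentiates the three-step graph only
print(out_short.item(), out_long.item())
print(x.grad)
```

No graph has to be declared ahead of time; whatever operations actually ran are the ones that get differentiated.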

Let’s look at a final piece of code, which builds a small computation graph:

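import torch

# Leaf tensors tracked by autograd
a = torch.randn((5, 5), requires_grad=True)

x1 = torch.randn((5, 5), requires_grad=True)
x2 = torch.randn((5, 5), requires_grad=True)
x3 = torch.randn((5, 5), requires_grad=True)
x4 = torch.randn((5, 5), requires_grad=True)

# Intermediate (non-leaf) nodes, created by operations
b = x1 * a
c = x2 * a

d = x3 * b + x4 * c

# Scalar output of the graph
L = (10 - d).sum()

# Backward pass: fills a.grad, x1.grad, ..., x4.grad
L.backward()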
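
To see what autograd computes for a graph like this one, the gradients can be compared against a derivation by hand. Since d = a * (x1 * x3 + x2 * x4) and L = sum(10 - d), the chain rule gives dL/da = -(x1 * x3 + x2 * x4) and dL/dx3 = -(x1 * a). The sketch below rebuilds the same graph with a fixed seed (an assumption made only for repeatability) and checks both results.

```python
import torch

torch.manual_seed(0)

a = torch.randn((5, 5), requires_grad=True)
x1, x2, x3, x4 = (torch.randn((5, 5), requires_grad=True) for _ in range(4))

b = x1 * a
c = x2 * a
d = x3 * b + x4 * c
L = (10 - d).sum()

# Backward pass from the scalar L to every leaf tensor.
L.backward()

# Hand-derived gradients from the chain rule.
expected_a_grad = -(x1 * x3 + x2 * x4).detach()
expected_x3_grad = -(x1 * a).detach()

print(torch.allclose(a.grad, expected_a_grad))
print(torch.allclose(x3.grad, expected_x3_grad))
```

Note that `a` feeds into both `b` and `c`, so its gradient is the sum of the contributions from the two paths.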

