Introduction to Hierarchical clustering algorithm

In this tutorial, I am going to discuss another clustering algorithm, Hierarchical Clustering algorithm. As I have said before, clustering algorithms are used to group similar items in the same group called cluster. This post will be a basic introduction to the hierarchical clustering algorithm.

Hierarchical Clustering algorithm

The name itself describes what this algorithm will do. In this type of algorithm, clusters are formed by either top-down or bottom-up approach. This further divides this algorithm in two categories based on the approach followed:

  1. Agglomerative hierarchical clustering
  2. Divisive hierarchical clustering

Let us become more familiar with these two types.

Agglomerative hierarchical clustering

First of all, we are going to learn about this clustering method. This method follows bottom-up approach. This algorithm works by forming one cluster, then again forming sub clusters and so on to form one big cluster. In other words, suppose we have 5 points a,b,c,d and e. In the first step, a and b forms a cluster C1. Next, c and d will form a cluster C2. Now, C1 and C2 in final step will form a cluster with E. Generally, clusters that are formed first are small and they clusters in later stage are bigger than these. To visualize the clusters, dendrogram diagrams are used. Let’s now get straight into algorithm of this method:


  • Start with taking each data point as singleton cluster
  • Compute the distance between all pair of points. The resultant matrix will be a square matrix.
  • Now on the basis of the distance between pair of points, select the ones with minimum distance. Merge them into one cluster. Represent the cluster formed in dendrogram. I will discuss how to draw dendrogram in later sections.

  Note: When you are merging the points in one cluster, there are several distance metrics used to combine them like complete linkage, average linkage. I will be talking about these later.

  • Update the distance matrix by deleting the rows corresponding to points that have merged into cluster.
  • Now moving further, again calculate distance between all pair of points after merging of points is done.
  • Again merge the points between which the distance is minimum.
  • Keep merging the points and forming the clusters until there is one big cluster and side by side represent your clusters that are forming in dendrogram.

Divisive hierarchical clustering

This is another clustering method beside agglomerative one. This method uses top-down approach to form clusters. This algorithm initially assumes all points to be in one cluster. Then we break the cluster into small clusters  using some method. This splitting down of cluster continues until every data point is split into single cluster. For example, we have 4 points. Now initially there is one cluster which has all 4 points. In next step, we break this cluster into two clusters C1 and C2. Then these are further split into C1, C2, C3 and C4. In divisive , there are also  two methods Polythetic and Monothetic. I will be discussing polythetic one. Let’s now get straight into the algorithm:


  • Start with assuming all the data is in one cluster.
  • Compute the distance of each point from every other point. For ex- If we have 5 points, compute distance of 1 from 2,3,4,5 and sum it and so on.
  • Now the point which has maximum distance will be segregated from all other points i.e. now the cluster is divided into two.
  • Now again compute the distance of every point from other points(including distance computation within cluster) .
  • Take the difference of distances computed from each cluster. Segregate the point which has maximum difference.
  • Continue this algorithm until difference in distances is not negative.

   Distance measures

  1. Average Linkage – In this, average of distance between two points is taken while merging the points in one cluster.
  2. Complete Linkage – In this, maximum of distance between two points is taken (farthest) while merging the points in one cluster.
  3. Single Linkage – In this, minimum of distance between two points is taken (closest) while merging the points in one cluster.

 How to draw Dendrograms:

  1. Take the two points are getting merge at the first stage. On y-axis take the minimum distance between the two points that are getting merge and on x-axis comes the name of the point.
  2. Draw a square box of height equal to minimum distance with one side at one point and one side at other point.
  3. Now if this cluster is getting merge into other, draw one side from the top side of the square formed previously and other side and other side at other point with height equal to the distance between them.
  4. Keep forming these square structures until there is one big box.


Hope, you are clear with this topic. Feel free to post your doubts in comments.

Also, give a read to Introduction to Natural Language Processing- NLP

Leave a Reply

Your email address will not be published. Required fields are marked *