Finding and using Euclidean distance with scikit-learn

To find the distance between two points, or between two sets of points, in Python, we can use scikit-learn. The library contains a module named ‘metrics’, which in turn contains a submodule named ‘pairwise’. A function inside this submodule, euclidean_distances(), is the focus of this article.

The Euclidean distance between any two points, whether the points are in a plane or 3-dimensional space, measures the length of a segment connecting the two locations. It is the most prominent and straightforward way of representing the distance between any two points.
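As a quick illustration of this definition, here is a minimal sketch that computes the distance by hand with NumPy: subtract the coordinates, square the differences, add them up, and take the square root. The names a, b, and manual_distance are only illustrative, and the two points are example values.

```python
import numpy as np

# Two example points in 3D space (illustrative values)
a = np.array([1.0, 2.0, 3.5])
b = np.array([4.0, 1.0, 2.0])

# Square root of the sum of squared coordinate differences
manual_distance = np.sqrt(np.sum((a - b) ** 2))

print(manual_distance)  # 3.5
```

The same arithmetic works for points with any number of coordinates, as long as both points have the same length.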

How to get Scikit-Learn

Given below are a couple of ways to install scikit-learn into your Python environment:

  1.  Go to pypi.org, search for scikit-learn, download the package, and install it into your Python 3 environment.
  2.  The simpler and more straightforward way (in my opinion) is to open a terminal/command prompt and type:
pip install scikit-learn
# OR #
conda install scikit-learn

These methods should be enough to get you going!
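To confirm that the installation worked, one simple check (a sketch, not the only way) is to import the package and print its version. The variable name installed_version is only illustrative.

```python
import sklearn

# If this import succeeds, scikit-learn is available in the current environment
installed_version = sklearn.__version__
print(installed_version)
```

If the import raises ModuleNotFoundError, the package was most likely installed into a different Python environment than the one running the script.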

Usage And Understanding: Euclidean distance using scikit-learn in Python

In essence, the function returns a two-dimensional array in which the entry at row i, column j is the distance between the i-th point of the first argument and the j-th point of the second. Here is a working example to explain this better:

import numpy as np
from sklearn.metrics.pairwise import euclidean_distances

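# Four points in 3D space, one point per row
points1 = np.asarray([[1, 2, 3.5], [4, 1, 2], [0, 0, 2], [3.4, 1, 5.6]])
# Distance from every point in points1 to every point in points1
test = euclidean_distances(points1, points1)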

print(test)

Here is what’s happening. After the necessary libraries are imported, a list of lists of numbers is defined. Each inner list holds three numbers, the coordinates of one point in 3D space. The numpy.asarray() function converts this nested list into a two-dimensional array of floats with one point per row, and finally the euclidean_distances() function computes the distance between every pair of points.

Here is the output:

[[ 0.          3.5         2.6925824   3.34215499]
 [ 3.5         0.          4.12310563  3.64965752]
 [ 2.6925824   4.12310563  0.          5.05173238]
 [ 3.34215499  3.64965752  5.05173238  0.        ]]

This output shows that the function returns a two-dimensional array of floating-point numbers. Row i holds the distances from the i-th point of the first array to every point of the second array passed into the function. For example, the first row of the output shows the distances from the first point to each of the four points. Because the same array was passed as both arguments here, the diagonal is zero (each point’s distance to itself) and the matrix is symmetric.
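The two arguments do not have to be the same array. As a minimal sketch, assuming two made-up sets of 2D points (the names queries, references, and nearest are illustrative, not part of scikit-learn), the result has one row per point in the first array and one column per point in the second:

```python
import numpy as np
from sklearn.metrics.pairwise import euclidean_distances

# Hypothetical example data: 2 query points and 3 reference points in 2D
queries = np.asarray([[0.0, 0.0], [3.0, 4.0]])
references = np.asarray([[0.0, 0.0], [6.0, 8.0], [3.0, 0.0]])

# distances[i, j] is the distance from queries[i] to references[j]
distances = euclidean_distances(queries, references)
print(distances.shape)  # (2, 3)

# Index of the closest reference point for each query point
nearest = distances.argmin(axis=1)
print(nearest)
```

Taking the argmin along each row, as in the last step, is one common use of such a distance matrix, for example when looking up the nearest neighbour of each point.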

Hopefully, this article has helped you understand the workings and usage of Euclidean distances in Python 3 using the library ‘scikit-learn’.

 
