Contrastive loss for supervised classification in Machine Learning using Python

Assuming you are already familiar with elementary loss functions such as binary cross-entropy, let's talk about the contrastive loss function for supervised classification in machine learning.

What is contrastive loss, and when/how do we use it?

Widely used loss functions are usually based on prediction error, like hinge loss. In contrast, contrastive loss takes into account the similarity between the feature vectors themselves.

To measure the similarity between feature vectors, we can use measures such as the Euclidean distance (which works in any number of dimensions) or the cosine similarity (often preferred for high-dimensional vectors).
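As a quick, illustrative sketch (the arrays a and b below are just made-up example vectors, not part of the loss itself), both measures are one-liners with NumPy:

import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 0.0, 4.0])

# Euclidean (L2) distance: smaller means more similar
euclidean_distance = np.linalg.norm(a - b)

# cosine similarity: values near 1 mean the vectors point in similar directions
cosine_similarity = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(euclidean_distance, cosine_similarity)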

For a pair of feature vectors x_1 and x_2 drawn from a batch of N samples, the contrastive loss is small when x_1 and x_2 belong to the same category and lie close together. The loss is designed so that minimizing it reduces the distance between positive (same-class) pairs and increases the distance between negative (different-class) pairs. As a result, related training examples are embedded close to each other and can be correctly classified into their respective categories.

Let’s take a look at the equation of contrastive loss:

Suppose X = {X_1, X_2} is a pair of feature vectors, Y is the label (1 for a similar pair, 0 for a dissimilar pair), W denotes the trainable parameters, and m is the margin (the distance beyond which a dissimilar pair counts as "far enough apart").

L(W, Y, X_1, X_2) = ½ · Y · D² + ½ · (1 − Y) · {max(0, m − D)}²

Here, D is the Euclidean distance ‖X_1 − X_2‖ between the feature vectors X_1 and X_2, and Y = 1 marks a similar pair while Y = 0 marks a dissimilar one.
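For intuition, take m = 1. A similar pair (Y = 1) at distance D = 0.8 contributes ½ · 0.8² = 0.32, and this contribution shrinks as the pair moves closer together. A dissimilar pair (Y = 0) at the same distance contributes ½ · max(0, 1 − 0.8)² = 0.02, and contributes nothing at all once D exceeds the margin.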

The contrastive loss function can be used either as an alternative to binary cross-entropy or in combination with it. It has a broad scope of usage in supervised as well as unsupervised machine learning tasks, and its major use is in binary and multi-class classifiers.

This function is simple to implement using the NumPy library. Let's start by initializing the feature vectors and the label vector; here each element of x1 and x2 forms one pair of one-dimensional features, and y holds the corresponding pair labels.

import numpy as np

# ten pairs of one-dimensional features and their labels (1 = similar, 0 = dissimilar)
x1 = np.random.randn(10)
x2 = np.random.randn(10)
y = np.array([0, 0, 1, 1, 1, 0, 0, 1, 0, 1])

Now, let’s define the function contrastive_loss:

def contrastive_loss(input_1, input_2, label, margin):
  # squared Euclidean distance between each pair of features
  squared_distance = np.square(input_1 - input_2)
  # similar pairs (label = 1) are pulled together; dissimilar pairs (label = 0)
  # are pushed apart until they are at least `margin` away from each other
  loss = label * 0.5 * squared_distance + (1 - label) * 0.5 * np.square(np.maximum(0.0, margin - np.sqrt(squared_distance)))
  return np.mean(loss)

We can now compute the contrastive loss over the batch, using a margin of 0.5:

loss = contrastive_loss(x1, x2, y, 0.5)
print(loss)

Output: the exact value printed will differ on every run, because x1 and x2 are randomly initialized.
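As mentioned earlier, contrastive loss can also be combined with binary cross-entropy. The snippet below is only a rough sketch of that idea: it assumes a classifier that outputs a probability p for each of the ten samples, and the names p and lambda_weight are illustrative placeholders rather than part of any particular API.

# hypothetical predicted probabilities for the same ten samples (placeholder values)
p = np.clip(np.random.rand(10), 1e-7, 1 - 1e-7)

# standard binary cross-entropy, averaged over the batch
bce = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# weighted sum of the two objectives; lambda_weight is a tunable hyperparameter
lambda_weight = 0.5
total_loss = bce + lambda_weight * contrastive_loss(x1, x2, y, 0.5)
print(total_loss)

Tuning lambda_weight controls how strongly the embedding distances influence the overall objective relative to the classification error.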

