Implementing an LSTM cell in Python

In this article, we will learn how to implement an LSTM cell in Python. We will also see how an LSTM-based RNN differs from other learning algorithms. Before moving to the implementation, let us discuss LSTMs and related terminology.

Recurrent Neural Network

In an RNN, we give the network an input, get an output, and then feed that output back into the model. At each time step the model therefore considers not only the current input but also its previous outputs when producing the current output.
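
This feedback loop can be sketched in plain NumPy. The dimensions, weight names, and random values below are hypothetical, chosen only to illustrate that the current output depends on both the current input and the previous output:

```python
import numpy as np

# Hypothetical sizes for illustration
input_dim, hidden_dim, timesteps = 4, 8, 5

rng = np.random.default_rng(0)
W_x = rng.normal(size=(hidden_dim, input_dim))   # input-to-hidden weights
W_h = rng.normal(size=(hidden_dim, hidden_dim))  # hidden-to-hidden (feedback) weights
b = np.zeros(hidden_dim)

x_seq = rng.normal(size=(timesteps, input_dim))  # a toy input sequence
h = np.zeros(hidden_dim)                         # initial hidden state

for x_t in x_seq:
    # New state mixes the current input with the previous state (the feedback)
    h = np.tanh(W_x @ x_t + W_h @ h + b)

print(h.shape)
```

Because the same weights are reused at every time step, information from earlier inputs can influence later outputs through `h`.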

Now suppose we have this input:

“RAHUL IS A NICE PERSON BUT SOMETIMES HE ACTS FOOLISHLY.”

If we predict the sentiment of this sentence with a conventional machine learning algorithm, we might conclude that it is a positive sentence, but an RNN will mostly classify it as negative, because the RNN also takes preceding words such as “BUT” and “FOOLISHLY” into account. This is the advantage of RNNs over other learning algorithms.

Long Short-Term Memory – LSTM

RNNs can use various types of memory cells that store previous data while training and predicting, and the most popular among them is Long Short-Term Memory. It stores the previous sequence and also maintains a carry (the cell state), which ensures that the sequence information is not lost.

There were various memory cells for RNNs, but the problem with them is that they struggle with long inputs. For example, if we give a big paragraph as our input, we might get an output that neglects the beginning words. That is why we use LSTMs and GRUs: they have gates that let the model ignore irrelevant information. An LSTM has three gates: the forget gate, the input gate, and the output gate. The forget gate decides what to discard from the carry, the input gate decides what new information to store in it, and the output gate decides what to emit as the current output.
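
To make the three gates concrete, here is a minimal from-scratch sketch of a single LSTM cell step in NumPy. The weight layout, names, and sizes are assumptions made for this example, not a reference implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step. W, U, b stack the parameters for the
    forget gate, input gate, output gate, and candidate state."""
    n = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b            # shape (4*n,)
    f = sigmoid(z[0*n:1*n])                 # forget gate: what to drop from the carry
    i = sigmoid(z[1*n:2*n])                 # input gate: what new info to write
    o = sigmoid(z[2*n:3*n])                 # output gate: what to expose as output
    g = np.tanh(z[3*n:4*n])                 # candidate cell state
    c = f * c_prev + i * g                  # updated carry (cell state)
    h = o * np.tanh(c)                      # new hidden state / output
    return h, c

# Toy usage with hypothetical sizes
rng = np.random.default_rng(1)
input_dim, hidden = 3, 5
W = rng.normal(size=(4 * hidden, input_dim))
U = rng.normal(size=(4 * hidden, hidden))
b = np.zeros(4 * hidden)
h, c = np.zeros(hidden), np.zeros(hidden)
for x_t in rng.normal(size=(4, input_dim)):
    h, c = lstm_cell(x_t, h, c, W, U, b)
print(h.shape, c.shape)
```

Note how the carry `c` is updated additively (`f * c_prev + i * g`), which is what lets the cell preserve information across many time steps.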

Python Code Implementation

We are going to use the Keras library for our implementation of the LSTM.

from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense

model = Sequential()
# X is the input array, shaped (samples, timesteps, features)
model.add(LSTM(256, input_shape=(X.shape[1], X.shape[2])))
model.add(Dropout(0.2))  # dropout to reduce overfitting
# y is the one-hot encoded output array
model.add(Dense(y.shape[1], activation='softmax'))

This is the basic Python code for implementing an LSTM. First, we imported the different layers for our model from Keras. Then we built a model with an LSTM layer and other layers according to our purpose, and finally we used the 'softmax' activation function to produce probabilities over the output classes. You can apply this model wherever an RNN is required, such as NLP or audio processing.
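
To see the model above run end to end, here is a hedged sketch that fills in `X` and `y` with randomly generated toy data (the shapes, the small LSTM size, and the training settings are assumptions chosen just to make the example self-contained):

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense

# Hypothetical toy data: 20 sequences of 10 timesteps with 1 feature,
# and 3 output classes (one-hot encoded)
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 10, 1))
y = np.eye(3)[rng.integers(0, 3, size=20)]

model = Sequential()
model.add(LSTM(8, input_shape=(X.shape[1], X.shape[2])))  # small LSTM for the demo
model.add(Dropout(0.2))
model.add(Dense(y.shape[1], activation='softmax'))

# Compile and briefly train; softmax + categorical crossentropy
# is the standard pairing for one-hot targets
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.fit(X, y, epochs=1, batch_size=4, verbose=0)

preds = model.predict(X, verbose=0)
print(preds.shape)
```

On real data you would train for many more epochs and use meaningful sequences instead of random noise.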

I hope you enjoyed this article. Thank you!
