Implementing an LSTM cell in Python
In this article, we will learn how to implement an LSTM cell in Python. We will also see how an RNN with LSTM cells differs from other learning algorithms. Before moving to the implementation, let us discuss LSTM and some related terminology.
Recurrent Neural Network
In an RNN, we give the model an input, get an output, and then feed that output back into the model. As a result, at each time step the model considers not only the current input but also its previous outputs when computing the current output.
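The feedback loop described above can be sketched in a few lines of NumPy. This is a minimal illustration of a simple tanh recurrence, not the exact cell any particular library uses; the function name rnn_forward, the weight names W_x, W_h and b, and all sizes are assumptions made for this example.

```python
import numpy as np

def rnn_forward(inputs, W_x, W_h, b):
    """Run a simple RNN over a sequence and return the state at every step."""
    h = np.zeros(W_h.shape[0])            # initial state: nothing seen yet
    states = []
    for x_t in inputs:                    # process one time step at a time
        # The new state depends on the current input AND the previous state.
        h = np.tanh(W_x @ x_t + W_h @ h + b)
        states.append(h)
    return np.stack(states)

# Hypothetical sizes and random weights, purely for illustration.
rng = np.random.default_rng(0)
input_size, hidden_size, steps = 4, 3, 5
W_x = rng.normal(scale=0.5, size=(hidden_size, input_size))
W_h = rng.normal(scale=0.5, size=(hidden_size, hidden_size))
b = np.zeros(hidden_size)
sequence = rng.normal(size=(steps, input_size))

states = rnn_forward(sequence, W_x, W_h, b)
print(states.shape)  # one state vector per time step
```

Because each state is computed from the previous one, changing the first element of the sequence changes every later state, which is how earlier words can influence the final prediction.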
Now suppose we have the following input:
“RAHUL IS A NICE PERSON BUT SOMETIMES HE ACTS FOOLISHLY.”
If we classify the sentiment of this sentence with a model that ignores word order, we might conclude that it is positive because of the word “NICE”. With an RNN we are more likely to find that it is negative, because the RNN reads the words in sequence and carries forward the context from words such as “BUT” and “FOOLISHLY”. This is the advantage of an RNN over learning algorithms that treat each input independently.
Long Short-Term Memory – LSTM
An RNN can use various types of memory cell to store information from earlier steps while training and predicting, and the most popular among them is Long Short-Term Memory (LSTM). It stores a summary of the previous sequence and also has a carry (the cell state), which helps ensure that information from earlier in the sequence is not lost.
There are various memory cells for RNNs, but the problem with the simpler ones is that they do not handle long sequences well. For example, if we give a long paragraph as input, we might get an output that neglects the words at the beginning. That is why we use LSTMs and GRUs: they have gates that let the model learn to ignore irrelevant information. An LSTM has three gates: the forget gate, the input gate, and the output gate. The forget gate decides how much of the previous carry to keep, the input gate decides how much new information from the current input to write into the carry, and the output gate decides how much of the carry to expose as the output.
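To make the three gates concrete, here is a minimal from-scratch sketch of one LSTM time step in NumPy. It follows the standard LSTM equations, but the function name lstm_step, the stacked parameter layout (forget, input, candidate, output), and the sizes are assumptions chosen for this illustration; Keras stores its weights in its own layout.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    W, U and b stack the parameters of four transforms in this
    (assumed) order: forget gate, input gate, candidate, output gate.
    """
    n = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b          # shape (4 * n,)
    f = sigmoid(z[0:n])                   # forget gate: how much old carry to keep
    i = sigmoid(z[n:2 * n])               # input gate: how much new info to write
    g = np.tanh(z[2 * n:3 * n])           # candidate values to write
    o = sigmoid(z[3 * n:4 * n])           # output gate: how much carry to expose
    c = f * c_prev + i * g                # updated carry (cell state)
    h = o * np.tanh(c)                    # output / hidden state
    return h, c

# Hypothetical sizes and random weights, purely for illustration.
rng = np.random.default_rng(0)
input_size, hidden_size, steps = 4, 3, 6
W = rng.normal(scale=0.1, size=(4 * hidden_size, input_size))
U = rng.normal(scale=0.1, size=(4 * hidden_size, hidden_size))
b = np.zeros(4 * hidden_size)

h = np.zeros(hidden_size)
c = np.zeros(hidden_size)
for x_t in rng.normal(size=(steps, input_size)):
    h, c = lstm_step(x_t, h, c, W, U, b)

print(h.shape, c.shape)
```

When the forget gate is close to 1 and the input gate is close to 0, the carry passes through a step almost unchanged, which is what lets information from the start of a long sequence survive to the end.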
Python Code Implementation
We are going to use the Keras library to implement the LSTM.
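import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense

# Placeholder data so the snippet runs; replace X and y with your own arrays.
# X is any input of shape (samples, timesteps, features).
# y is any output given as one-hot rows of shape (samples, classes).
X = np.random.rand(100, 10, 8)
y = np.eye(3)[np.random.randint(0, 3, size=100)]

model = Sequential()
model.add(LSTM(256, input_shape=(X.shape[1], X.shape[2])))  # LSTM layer with 256 units
model.add(Dropout(0.2))  # drop 20% of the activations during training
model.add(Dense(y.shape[1], activation='softmax'))  # one probability per output class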
This is the basic Python code for implementing an LSTM. First, we imported the layers our model needs from Keras. Then we built the model from an LSTM layer followed by the other layers our task requires. At the end, we used the ‘softmax’ activation function, which turns the final layer's values into a probability for each output class. You can apply this model wherever an RNN is required, such as NLP, audio processing, etc.
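The model above expects X with shape (samples, timesteps, features) and y with one row per sample and one column per class. As a rough sketch of how such arrays could be prepared, the snippet below builds them from a toy string for next-character prediction. The corpus, the window length seq_len, and the scaling choice are all hypothetical and only meant to show the shapes involved.

```python
import numpy as np

text = "hello world, hello lstm"   # hypothetical toy corpus
seq_len = 5                         # hypothetical window length

# Map every distinct character to an integer index.
chars = sorted(set(text))
char_to_idx = {ch: i for i, ch in enumerate(chars)}
encoded = np.array([char_to_idx[ch] for ch in text])

# Each sample is a window of seq_len characters; its target is the next character.
windows = np.array([encoded[i:i + seq_len]
                    for i in range(len(encoded) - seq_len)])
targets = encoded[seq_len:]

# X: (samples, timesteps, features), one feature per step, scaled into [0, 1).
X = windows.reshape(len(windows), seq_len, 1) / len(chars)
# y: one-hot rows, so y.shape[1] is the number of classes for the softmax layer.
y = np.eye(len(chars))[targets]

print(X.shape, y.shape)
```

With arrays shaped like this, X.shape[1] and X.shape[2] give the timesteps and features passed to input_shape, and y.shape[1] gives the size of the final Dense layer.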
I hope you enjoyed this article. Thank you!
This is an LSTM implementation in Keras, not Python per se 🙂 A “Python” implementation would consist of LSTM operators written in native Python from scratch.
Keras, TensorFlow, and sklearn are all Python libraries. They are built to reduce the time and effort an individual has to spend. So this is still an implementation in Python itself, but you can also implement it from scratch if you want to dive deeper and learn more about it.