Introduction to Natural Language Processing (NLP)
In this era of Artificial Intelligence, most of us have heard the term Natural Language Processing, whether at university or from a friend. This tutorial introduces NLP, a well-known field within Artificial Intelligence.
Natural Language Processing and its applications
Natural Language Processing is the field of study that focuses on the interaction between human language and computers. NLP allows machines to understand human language, i.e. how humans speak and write, by analyzing text. Most people have heard of NLP only in connection with identifying patterns in sets of text documents. In general terms, NLP tasks break language into shorter, elemental pieces, try to understand the relationships between those pieces, and explore how the pieces work together to create meaning.
Where can we use NLP:
- We can create a chat bot using Parsey McParseFace, a language parsing deep learning model made by Google.
- Break up large text into small tokens using a tokenizer, or reduce words to their root forms using a stemmer.
- Group content by topic so that you can spot trends and act on them.
- Use a text summarizer to extract the most important and central ideas while ignoring irrelevant information.
- Use sentiment analysis to detect and classify the sentiment of a piece of text, from negative through neutral to positive.
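To make the tokenizer and stemmer idea concrete, here is a minimal sketch that uses only the Python standard library. The helper names tokenize and toy_stem are hypothetical, and the suffix stripping is a deliberately crude stand-in for a real stemmer such as NLTK's PorterStemmer, not an implementation of the Porter algorithm.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def toy_stem(word):
    """Strip a few common suffixes (toy stand-in, NOT the Porter algorithm)."""
    for suffix in ("ing", "ed", "s"):
        # Only strip when at least three characters of the word remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


tokens = tokenize("The cats were jumping and played outside")
stems = [toy_stem(token) for token in tokens]

print(tokens)  # ['the', 'cats', 'were', 'jumping', 'and', 'played', 'outside']
print(stems)   # ['the', 'cat', 'were', 'jump', 'and', 'play', 'outside']
```

A real stemmer handles many more cases (for example "studies" or "happily"), so treat this only as an illustration of what "breaking words into their root words" means.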
Steps to perform basic text processing:
- First of all, import the dataset to which NLP will be applied.
- Next, clean the text in the imported dataset. For that, import the necessary libraries:
import re
import nltk
nltk.download('stopwords')
from nltk.corpus import stopwords
from nltk.stem.porter import PorterStemmer
- The libraries above are open source options that you can use for stemming, tokenizing, etc.
- Using a library called sklearn (scikit-learn), we will create a bag of words model. For example:
from sklearn.feature_extraction.text import CountVectorizer
- The next step is splitting the dataset into training and test sets. For example, we can use the function imported below.
from sklearn.model_selection import train_test_split
- At this stage, we can apply a suitable algorithm such as Naive Bayes.
- After fitting the model, we can predict the test results using the predict() function.
- Finally, to check the accuracy of the model, we can create a confusion matrix using the function imported below:
from sklearn.metrics import confusion_matrix
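The steps above can be sketched end to end as follows. This is only an illustration under stated assumptions: the eight reviews and their labels are made up, the clean helper and the small STOP_WORDS set are hypothetical stand-ins for NLTK's stop word list (used so that nothing needs to be downloaded), stemming is left out for brevity, and MultinomialNB is just one of several Naive Bayes variants that scikit-learn provides.

```python
import re

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy dataset: (review text, label), 1 = positive, 0 = negative.
reviews = [
    ("Loved the food, great service!", 1),
    ("Wonderful place and friendly staff", 1),
    ("The pasta was delicious", 1),
    ("Great taste, will come back", 1),
    ("Terrible food and slow service", 0),
    ("The staff was rude", 0),
    ("Awful taste, never again", 0),
    ("Bad place, cold pasta", 0),
]

# Small hand-written stop word set (assumption; replaces nltk's list here).
STOP_WORDS = {"the", "a", "an", "was", "is", "and", "it", "this", "to", "of"}


def clean(text):
    """Keep letters only, lowercase the text, and drop stop words."""
    words = re.sub(r"[^a-zA-Z]", " ", text).lower().split()
    return " ".join(word for word in words if word not in STOP_WORDS)


corpus = [clean(text) for text, _ in reviews]
labels = [label for _, label in reviews]

# Bag of words model: one column per word, one row per review.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(corpus)

# Hold out 25% of the reviews for testing, keeping both classes represented.
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, random_state=0, stratify=labels
)

# Fit Naive Bayes and predict the held-out reviews.
model = MultinomialNB()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Rows are true labels, columns are predicted labels, in the order [0, 1].
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
print(cm)
```

With a dataset this small the numbers in the confusion matrix mean very little; the point is only to show how the pieces connect.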
So, this was a basic introduction to NLP. Hope to see you in the next tutorial!