In this Python NLP lesson we are going to learn about the TextBlob library. We have already covered NLTK, and TextBlob offers much of the same functionality through a simpler interface. TextBlob is a Python (2 and 3) library for processing textual data. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more.
TextBlob's main features are:
- Noun phrase extraction
- Part-of-speech tagging
- Sentiment analysis
- Classification (Naive Bayes, Decision Tree)
- Tokenization (splitting text into words and sentences)
- Word and phrase frequencies
- Parsing
- n-grams
- Word inflection (pluralization and singularization) and lemmatization
- Spelling correction
- Add new models or languages through extensions
- WordNet integration
Installation
First you need to install TextBlob. You can use pip for the installation.

    pip install textblob
Tokenization
Tokenization breaks a TextBlob into words or sentences. Let's start with word tokenization.

    from textblob import TextBlob

    blob = TextBlob("hello friends. welcome to python natural language processing")
    blob.words
The example above uses word tokenization. If you run the code in an interactive session, this will be the result.

    WordList(['hello', 'friends', 'welcome', 'to', 'python', 'natural', 'language', 'processing'])
For sentence tokenization you can use this code.

    blob.sentences
This will be the result.
    [Sentence("hello friends."), Sentence("welcome to python natural language processing")]
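To make the idea of tokenization concrete, here is a rough approximation in plain Python using regular expressions. It is only a sketch: the two patterns are assumptions chosen for this simple sentence, and TextBlob's own tokenizers handle many more cases (abbreviations, contractions, and so on).

```python
import re

text = "hello friends. welcome to python natural language processing"

# Word tokens: runs of letters, digits, or apostrophes (punctuation is dropped).
words = re.findall(r"[A-Za-z0-9']+", text)

# Sentence tokens: split on whitespace that follows ., ! or ?
sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

print(words)
print(sentences)
```

For this input the sketch yields the same words and the same two sentences as the TextBlob example, but it should not be relied on for real text.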
You can also make a word singular or plural in TextBlob.

    from textblob import TextBlob

    text = TextBlob('Hello friends , this is nlp topic')
    text.words[1].singularize()
If you run the code, this is the result.

    'friend'
You can also pluralize a word. Here the sixth word, 'topic', is turned into its plural form.

    text.words[5].pluralize()
TextBlob Lemmatization
Lemmatization reduces a word to its dictionary form (its lemma). The Word class provides a lemmatize() method for this.

    from textblob import Word

    w = Word('octopi')
    w.lemmatize()
This will be the result.
    'octopus'
We can also pass a part of speech to lemmatize(). Here 'v' tells it to treat the word as a verb.

    from textblob import Word

    w = Word('went')
    w.lemmatize('v')
TextBlob Spell Correction
TextBlob can correct spelling with the correct() method.

    from textblob import TextBlob

    sents = TextBlob("I havv goood speling!")
    print(sents.correct())
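TextBlob's documentation describes its spelling correction as based on Peter Norvig's "How to Write a Spelling Corrector". The following is a minimal sketch of that general idea, not TextBlob's actual implementation: generate every string one edit away from the misspelled word and keep the most frequent known candidate. The tiny vocabulary and its counts (`WORD_COUNTS`) are made up for illustration, and the helper is named `correct_word` to avoid confusion with TextBlob's `correct()`.

```python
from collections import Counter

# Hypothetical toy vocabulary with invented frequencies; a real corrector
# would count words from a large text corpus.
WORD_COUNTS = Counter({"i": 100, "have": 50, "good": 40, "spelling": 10, "hive": 2})
LETTERS = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """Return all strings that are one edit away from `word`."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:] for left, right in splits if right for c in LETTERS]
    inserts = [left + c + right for left, right in splits for c in LETTERS]
    return set(deletes + transposes + replaces + inserts)


def correct_word(word):
    """Return the most frequent known word within one edit, else the word itself."""
    if word in WORD_COUNTS:
        return word
    candidates = [w for w in edits1(word) if w in WORD_COUNTS]
    if not candidates:
        return word
    return max(candidates, key=WORD_COUNTS.__getitem__)


print([correct_word(w) for w in ["i", "havv", "goood", "speling"]])
```

Words that are more than one edit away from anything in the vocabulary are returned unchanged in this sketch.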
TextBlob Noun Phrase Extraction
The noun_phrases property returns the noun phrases found in the text.

    from textblob import TextBlob

    mytext = TextBlob('you can get the courses from geekscoders website')
    for np in mytext.noun_phrases:
        print(np)
TextBlob Part-of-Speech (POS) Tagging
Now let's look at POS tagging in TextBlob. Part-of-speech tags can be accessed through the tags property, which yields (word, tag) pairs.

    from textblob import TextBlob

    mytext = TextBlob('you can get the courses from geekscoders website')
    for word, tag in mytext.tags:
        print(word, tag)
If you run the code this will be the result.
    you PRP
    can MD
    get VB
    the DT
    courses NNS
    from IN
    geekscoders NNS
    website VBP
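The tags printed above follow the Penn Treebank convention (NNS, VB, and so on), while the lemmatization example earlier passed a single-letter WordNet code such as 'v'. If you want to connect the two, a small mapping helps. The function below is a hypothetical helper written for this lesson, not part of TextBlob, and it falls back to noun for any tag it does not recognise.

```python
def penn_to_wordnet(tag):
    """Map a Penn Treebank tag to a WordNet POS letter (hypothetical helper)."""
    if tag.startswith("JJ"):
        return "a"  # adjective
    if tag.startswith("VB"):
        return "v"  # verb
    if tag.startswith("RB"):
        return "r"  # adverb
    return "n"      # assumption: treat everything else as a noun


# The (word, tag) pairs from the output above, written out by hand.
tagged = [("you", "PRP"), ("can", "MD"), ("get", "VB"), ("the", "DT"),
          ("courses", "NNS"), ("from", "IN"), ("geekscoders", "NNS"), ("website", "VBP")]

for word, tag in tagged:
    print(word, tag, penn_to_wordnet(tag))
```

The returned letter could then be passed to Word(...).lemmatize(), as in the 'went' example.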
TextBlob N-Grams
As we have already learned, n-grams are sequences of n consecutive words. The ngrams() method takes n as its argument, so let's create an example with n = 2.

    from textblob import TextBlob

    mytext = TextBlob('you can get the courses from geekscoders website')
    for ngram in mytext.ngrams(2):
        print(ngram)
If you run the code, you can see that each result contains two consecutive words.

    ['you', 'can']
    ['can', 'get']
    ['get', 'the']
    ['the', 'courses']
    ['courses', 'from']
    ['from', 'geekscoders']
    ['geekscoders', 'website']
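The sliding-window idea behind these results can be sketched in a few lines of plain Python. The function name `ngrams` and the hand-written token list are illustrative only; this is not TextBlob's implementation.

```python
def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a list."""
    return [tokens[i:i + n] for i in range(len(tokens) - n + 1)]


tokens = ['you', 'can', 'get', 'the', 'courses', 'from', 'geekscoders', 'website']

bigrams = ngrams(tokens, 2)
for gram in bigrams:
    print(gram)
```

With 8 tokens and n = 2 the window fits in 8 - 2 + 1 = 7 positions, which matches the seven pairs listed above.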
TextBlob Sentiment Analysis
Sentiment analysis is the process of determining the sentiment expressed in a piece of text, for example whether the writer is happy or sad. The sentiment property returns a namedtuple of the form Sentiment(polarity, subjectivity). The polarity score is a float within the range [-1.0, 1.0], where negative values indicate negative sentiment and positive values indicate positive sentiment. The subjectivity is a float within the range [0.0, 1.0] where 0.0 is very objective and 1.0 is very subjective.

    from textblob import TextBlob

    mytext = TextBlob('I am very happy today')
    mytext.sentiment
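In practice the polarity score is often turned into a label. The sketch below shows one way to do that. It builds a stand-in `Sentiment` namedtuple with the same two fields so that it runs without TextBlob, and both the `label_sentiment` helper and the 0.1 threshold are assumptions for illustration rather than anything TextBlob defines.

```python
from collections import namedtuple

# Stand-in with the same shape as the namedtuple described above.
Sentiment = namedtuple("Sentiment", ["polarity", "subjectivity"])


def label_sentiment(sentiment, threshold=0.1):
    """Classify a Sentiment by its polarity; the threshold is an arbitrary choice."""
    if sentiment.polarity > threshold:
        return "positive"
    if sentiment.polarity < -threshold:
        return "negative"
    return "neutral"


# Example values chosen by hand, not produced by TextBlob.
examples = [Sentiment(0.8, 0.9), Sentiment(-0.5, 0.6), Sentiment(0.0, 0.0)]
for s in examples:
    print(s, label_sentiment(s))
```

With TextBlob installed, the same helper could be called as label_sentiment(mytext.sentiment).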
TextBlob Language Translation
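Let's create an example of language translation in TextBlob. Note that detect_language() and translate() depend on the Google Translate web API; they were deprecated in TextBlob 0.16.0 and removed in later releases, so this section only works on older versions. In this code we have a Turkish text, and first we want to detect the language.

    from textblob import TextBlob

    mytext = TextBlob('kanalıma abone ol')
    mytext.detect_language()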
If you run the code this will be the result.
    'tr'
Now we can translate this from Turkish to English.

    translated = mytext.translate(from_lang='tr', to='en')
    print(translated)
This will be the result.
    subscribe to my channel