Python NLP TextBlob - Geekscoders

In this Python NLP lesson we are going to learn about the TextBlob library. We have already learned about the NLTK library, and you can do many of the same tasks with TextBlob. TextBlob is a Python library for processing textual data. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more.

TextBlob offers the following features:

Noun phrase extraction
Part-of-speech tagging
Sentiment analysis
Classification (Naive Bayes, Decision Tree)
Tokenization (splitting text into words and sentences)
Word and phrase frequencies
Parsing
n-grams
Word inflection (pluralization and singularization) and lemmatization
Spelling correction
Add new models or languages through extensions
WordNet integration

Installation

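First of all you need to install TextBlob. You can use pip for the installation. Some features, such as part-of-speech tagging and noun phrase extraction, also need the NLTK corpora, which you can download afterwards with python -m textblob.download_corpora.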

pip install textblob

Tokenization

Using tokenization we can break a TextBlob into words or sentences. First, let’s check word tokenization.

from textblob import TextBlob

blob = TextBlob("hello friends. welcome to python natural language processing")

blob.words

The above example uses word tokenization. If you run the code, this will be the result. Notice that punctuation is not included in the word list.

WordList(['hello', 'friends', 'welcome', 'to',
 'python', 'natural', 'language', 'processing'])

For sentence tokenization you can use this code.

blob.sentences

This will be the result.

[Sentence("hello friends."),
 Sentence("welcome to python natural language processing")]
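
To make the idea concrete, here is a rough standard-library sketch of what word and sentence tokenization do. It is only an approximation built on regular expressions, not TextBlob's actual tokenizer, and both patterns are assumptions chosen for this example.

```python
import re

text = "hello friends. welcome to python natural language processing"

# Word tokens: runs of letters, digits, or apostrophes; punctuation is dropped.
words = re.findall(r"[A-Za-z0-9']+", text)

# Sentence tokens: split on whitespace that follows '.', '!' or '?'.
sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s]

print(words)
print(sentences)
```

Real tokenizers handle abbreviations, contractions and other edge cases that these simple patterns ignore, which is why we normally rely on blob.words and blob.sentences.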

You can also make a word singular or plural in TextBlob.

from textblob import TextBlob

text = TextBlob('Hello friends , this is nlp topic')
text.words[1].singularize()

If you run the code, this is the result.

'friend'

You can also pluralize a word. Here text.words[5] is 'topic', because the comma is not counted as a word.

text.words[5].pluralize()

TextBlob Lemmatization

Now let’s work on lemmatization.

from textblob import Word

w = Word('octopi')

w.lemmatize()

This will be the result.

'octopus'

We can also pass a part of speech to lemmatize(). For example, lemmatizing 'went' as a verb ('v') gives 'go'.

from textblob import Word

w = Word('went')
w.lemmatize('v')

TextBlob Spell Correction

TextBlob can also correct spelling with the correct() method.

from textblob import TextBlob

sents = TextBlob("I havv goood speling!")

print(sents.correct())
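
According to the TextBlob documentation, its spelling correction is based on Peter Norvig's "How to Write a Spelling Corrector". The following is a minimal sketch of that idea using only the standard library. It is not TextBlob's implementation: the WORD_COUNTS vocabulary and its counts are made up for this example, and the names edits1, correct_word and correct_text are hypothetical helpers.

```python
import re
import string

# Hypothetical toy vocabulary with made-up frequencies.
# A real corrector would count words in a large text corpus.
WORD_COUNTS = {"i": 50, "have": 40, "good": 30, "spelling": 10, "spewing": 1}


def edits1(word):
    """Return every string that is one edit away from word."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:] for left, right in splits if right for c in letters]
    inserts = [left + c + right for left, right in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def correct_word(word):
    """Return the most frequent known word within one edit, or word itself."""
    lower = word.lower()
    if lower in WORD_COUNTS:
        return word  # already a known word, keep it unchanged
    candidates = [w for w in edits1(lower) if w in WORD_COUNTS]
    if not candidates:
        return word  # nothing known nearby, leave the word alone
    return max(candidates, key=WORD_COUNTS.get)


def correct_text(text):
    """Correct each alphabetic word and keep punctuation and spacing."""
    return re.sub(r"[A-Za-z]+", lambda match: correct_word(match.group()), text)


corrected = correct_text("I havv goood speling!")
print(corrected)  # I have good spelling!
```

Both 'spelling' and 'spewing' are one edit away from 'speling', so the sketch picks the one with the higher count. This is why the quality of a corrector depends heavily on the word frequencies it was built from.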

TextBlob Noun Phrase Extraction

from textblob import TextBlob

mytext = TextBlob('you can get the courses from geekscoders website')

for np in mytext.noun_phrases:
    print(np)

TextBlob Parts of Speech Tagging (POS)

Now we want to learn about POS tagging in TextBlob. Part-of-speech tags can be accessed through the tags property, which gives a list of (word, tag) pairs.

from textblob import TextBlob

mytext = TextBlob('you can get the courses from geekscoders website')

for words, tag in mytext.tags:
    print(words, tag)

If you run the code this will be the result.

you PRP
can MD
get VB
the DT
courses NNS
from IN
geekscoders NNS
website VBP
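
The tags in this output follow the Penn Treebank tag set. As a small reading aid, the sketch below maps each tag from the output above to a short description. The TAG_MEANINGS dictionary is a hand-written helper for this lesson and is not part of TextBlob; the tagged list is simply copied from the output.

```python
# Short descriptions of the Penn Treebank tags that appear in the output above.
TAG_MEANINGS = {
    "PRP": "personal pronoun",
    "MD": "modal verb",
    "VB": "verb, base form",
    "DT": "determiner",
    "NNS": "noun, plural",
    "IN": "preposition or subordinating conjunction",
    "VBP": "verb, non-3rd person singular present",
}

# The (word, tag) pairs copied from the output above.
tagged = [
    ("you", "PRP"), ("can", "MD"), ("get", "VB"), ("the", "DT"),
    ("courses", "NNS"), ("from", "IN"), ("geekscoders", "NNS"), ("website", "VBP"),
]

for word, tag in tagged:
    print(f"{word:12} {tag:4} {TAG_MEANINGS[tag]}")
```

Reading the output this way also shows that a tagger can make mistakes: 'website' is used as a noun in the sentence, but it was tagged as a verb (VBP).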

TextBlob N-Grams

As we have already learned, n-grams are sequences of n consecutive words. Now let’s create an example with n = 2.

from textblob import TextBlob

mytext = TextBlob('you can get the courses from geekscoders website')
for ngram in mytext.ngrams(2):
    print(ngram)

If you run the code, you can see that each n-gram contains two consecutive words.

['you', 'can']
['can', 'get']
['get', 'the']
['the', 'courses']
['courses', 'from']
['from', 'geekscoders']
['geekscoders', 'website']
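
To see how n-grams are formed, here is a plain Python sketch of the same sliding-window idea. It does not use TextBlob; the ngrams function below is a hypothetical helper written for this lesson, and it splits on whitespace only.

```python
def ngrams(words, n):
    """Return every run of n consecutive words as a list."""
    # A window of size n can start at positions 0 .. len(words) - n.
    return [words[i:i + n] for i in range(len(words) - n + 1)]


words = "you can get the courses from geekscoders website".split()

bigrams = ngrams(words, 2)
trigrams = ngrams(words, 3)

for gram in bigrams:
    print(gram)

print(len(bigrams), len(trigrams))  # 7 6
```

A sentence with 8 words gives 7 bigrams and 6 trigrams, because each window of size n needs n words in a row. If n is larger than the number of words, the result is an empty list.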

TextBlob Sentiment Analysis

Sentiment analysis is the process of determining the sentiment behind a piece of text, for example whether the speaker or writer is in a happy or sad mood. The sentiment property returns a namedtuple of the form Sentiment(polarity, subjectivity). The polarity score is a float within the range [-1.0, 1.0], where negative values mean negative sentiment and positive values mean positive sentiment. The subjectivity is a float within the range [0.0, 1.0], where 0.0 is very objective and 1.0 is very subjective.

from textblob import TextBlob

mytext = TextBlob('iam very happy today')

mytext.sentiment
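
Since the result is a namedtuple, you can read the fields by name or unpack them. The sketch below shows one possible way to turn a polarity score into a label. It uses a stand-in namedtuple with the same field names so it runs without TextBlob; the example values are made up and are not real TextBlob output, and the describe function and its 0.1 threshold are assumptions for illustration.

```python
from collections import namedtuple

# Stand-in with the same field names as the tuple described above.
Sentiment = namedtuple("Sentiment", ["polarity", "subjectivity"])


def describe(sentiment, threshold=0.1):
    """Label a sentiment as positive, negative or neutral.

    The threshold is an arbitrary choice for this sketch.
    """
    if sentiment.polarity > threshold:
        return "positive"
    if sentiment.polarity < -threshold:
        return "negative"
    return "neutral"


# Made-up values, only to show how the fields are used.
example = Sentiment(polarity=0.8, subjectivity=0.9)

polarity, subjectivity = example  # a namedtuple unpacks like a normal tuple
print(describe(example), polarity, subjectivity)
```

With a real TextBlob object you could pass mytext.sentiment to a function like this instead of the stand-in value.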

TextBlob Language Translation

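Let’s create an example of language translation in TextBlob. In this code we have a Turkish text, and first we want to detect the language. Note that detect_language() and translate() relied on the Google Translate API; they were deprecated in TextBlob 0.16.0 and removed in later releases, so this example only runs on older versions.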

from textblob import TextBlob

mytext = TextBlob('kanalıma abone ol')
mytext.detect_language()

If you run the code this will be the result.

'tr'

Now we can translate this from Turkish to English.

translated = mytext.translate(from_lang='tr', to='en')

print(translated)

This will be the result.

subscribe to my channel
