In this Natural Language Processing (NLP) lesson we are going to learn about Named Entity Recognition (NER).
What is Named Entity Recognition (NER)?
One of the most common problems in text labeling is finding entities, for example the names of locations or organizations. This task is called Named Entity Recognition, or NER. There are two ways to do Named Entity Recognition with NLTK: you can use a pre-trained named entity recognizer, or you can build your own machine learning based model. For the pre-trained route NLTK provides the ne_chunk() function, and it also includes a wrapper around the Stanford Named Entity Recognition tagger.
Now let's create a simple example. Here we set binary to False, so each entity is labelled with its specific type (PERSON, ORGANIZATION, GPE and so on). If we set binary to True, the chunker only marks whether a span is a named entity, and every entity is labelled simply NE.
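```python
import nltk
from nltk import ne_chunk
from nltk.tag import pos_tag
from nltk.tokenize import word_tokenize

# The tokenizer, POS tagger and NE chunker data must be installed first
# with nltk.download(); the resource names depend on the NLTK version.

text = "his name is John, John is studying at Stanford University in California"

# Tokenize, POS-tag, then chunk named entities with their specific types.
print(ne_chunk(pos_tag(word_tokenize(text)), binary=False))
```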
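If you run the code you will get the following result. You can see that John is tagged PERSON, Stanford University is ORGANIZATION and California is GPE (geo-political entity). The remaining words keep only their part-of-speech tags; for example, his is PRP$, a possessive pronoun.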
```
(S
  his/PRP$
  name/NN
  is/VBZ
  (PERSON John/NNP)
  ,/,
  (PERSON John/NNP)
  is/VBZ
  studying/VBG
  at/IN
  (ORGANIZATION Stanford/NNP University/NNP)
  in/IN
  (GPE California/NNP))
```
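The printed tree is easy to read, but in a program you usually want the entities as plain data. The sketch below is one possible way to do that: it walks the top level of the tree and collects a (label, text) pair for every entity subtree. To keep it runnable without downloading any models, the tree is rebuilt by hand from the output above with Tree.fromstring; in real code you would pass in the value returned by ne_chunk() instead. The helper name extract_entities is made up for this illustration and is not part of NLTK.

```python
from nltk.tag import str2tuple
from nltk.tree import Tree

# Hand-built copy of the tree shown above, so no model download is needed.
# In practice this would be the return value of ne_chunk(...).
tree = Tree.fromstring(
    """(S his/PRP$ name/NN is/VBZ (PERSON John/NNP) ,/, (PERSON John/NNP)
          is/VBZ studying/VBG at/IN (ORGANIZATION Stanford/NNP University/NNP)
          in/IN (GPE California/NNP))""",
    read_leaf=str2tuple,  # turn "John/NNP" into ("John", "NNP")
)


def extract_entities(chunked):
    """Return (label, text) pairs for each entity subtree (hypothetical helper)."""
    entities = []
    for node in chunked:
        # Plain tokens are (word, tag) tuples; named entities are nested Trees.
        if isinstance(node, Tree):
            words = [word for word, _tag in node.leaves()]
            entities.append((node.label(), " ".join(words)))
    return entities


entities = extract_entities(tree)
print(entities)
```

The same helper should also work on a tree produced with binary set to True; in that case every pair would carry the label NE instead of a specific type.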