Python NLP
About Lesson

This lesson is an introduction to NLP in Python. We will first look at what NLP (Natural Language Processing) is, and then go through installing NLTK (Natural Language Toolkit) and its data.



What is Natural Language Processing (NLP)

Natural Language Processing (NLP) is concerned with the interaction between natural language and computers. It is used everywhere, from search engines such as Google to voice interfaces such as Siri. Other uses include spell checking, spam filtering, related-keyword suggestions in search engines, knowledge base support, chatbots, machine translation, speech recognition, and many more. NLP is one of the major components of Artificial Intelligence (AI) and computational linguistics.




Usage of NLP

  • Spell correction (MS Word/ any other editor)
  • Search engines (Google, Bing, Yahoo)
  • Speech engines (Siri, Google Voice)
  • Spam classifiers (All e-mail services)
  • News feeds (Google, Yahoo!, and so on)
  • Machine translation (Google Translate, and so on)



What is NLTK (Natural Language Toolkit)

The Natural Language Toolkit (NLTK) is a suite of libraries that has become one of the best tools for prototyping and building natural language processing systems. It is one of the most popular and widely used libraries in the NLP community. The beauty of NLTK lies in its simplicity: most complex NLP tasks can be implemented in a few lines of code. A good starting point is learning how to tokenize text into its component words, and then exploring the WordNet language dictionary.
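To give a feel for the "few lines of code" claim, here is a minimal tokenization sketch. It assumes NLTK is already installed (installation is covered next) and uses wordpunct_tokenize, a regular-expression based tokenizer that needs no extra data download. The helper name tokenize_words is only illustrative.

```python
def tokenize_words(text):
    """Split text into word and punctuation tokens with NLTK."""
    # Imported inside the function so this file still loads before NLTK is installed.
    from nltk.tokenize import wordpunct_tokenize
    return wordpunct_tokenize(text)
```

Calling tokenize_words("Hello, world!") splits the string into the two words and the two punctuation marks as separate tokens.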




NLTK can be installed with pip. For more information about installation, see the NLTK Installation guide.
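As a small sketch, the following script checks whether NLTK can already be imported and, if not, prints the pip command to run. The function name nltk_install_hint is an illustrative assumption; the package name on PyPI is nltk.

```python
import importlib.util
import sys


def nltk_install_hint():
    """Return a message saying whether NLTK is importable, or how to install it."""
    if importlib.util.find_spec("nltk") is not None:
        return "NLTK is already installed."
    # `python -m pip` installs into the same interpreter that runs this script.
    return f"NLTK not found. Install it with: {sys.executable} -m pip install nltk"


print(nltk_install_hint())
```

Running pip through the interpreter (python -m pip install nltk) rather than a bare pip command helps avoid installing into a different Python than the one you use for the lesson.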



After installing NLTK, you need to install its data. NLTK comes with many corpora, toy grammars, trained models, etc. A complete list is posted at NLTK Data List. To install the data, use NLTK's data downloader: run the Python interpreter, import nltk, and call nltk.download().
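A minimal sketch of launching the downloader from Python is shown here. The wrapper function open_nltk_downloader is only illustrative; the call it makes, nltk.download() with no arguments, is the one that opens the interactive downloader.

```python
def open_nltk_downloader():
    """Open NLTK's interactive data downloader."""
    # Imported inside the function so this file still loads before NLTK is installed.
    import nltk
    return nltk.download()
```

In an interactive interpreter session you would normally skip the wrapper and type the two statements directly: import nltk, then nltk.download().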



A new window should open, showing the NLTK Downloader. Click on the File menu and select Change Download Directory. For central installation, set this to C:\nltk_data (Windows), /usr/local/share/nltk_data (Mac), or /usr/share/nltk_data (Unix). Next, select the packages or collections you want to download.
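If you prefer to skip the GUI, the same choice of directory can be sketched in code. The helper names below are illustrative assumptions; the paths are the ones listed above, and download_dir is a parameter of nltk.download. Writing to these system-wide locations may require administrator rights.

```python
import sys


def central_nltk_data_dir(platform=sys.platform):
    """Return the central nltk_data directory suggested for the given platform."""
    if platform.startswith("win"):
        return r"C:\nltk_data"
    if platform == "darwin":  # macOS
        return "/usr/local/share/nltk_data"
    return "/usr/share/nltk_data"  # other Unix systems


def download_to_central_dir(package="brown"):
    """Download one NLTK data package into the central directory."""
    # Imported inside the function so this file still loads before NLTK is installed.
    import nltk
    return nltk.download(package, download_dir=central_nltk_data_dir())
```

For example, download_to_central_dir("brown") would fetch only the Brown Corpus instead of opening the package selection window.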


For more information about installing the data, see NLTK Data Installation.



Test that the data has been installed as follows. (This assumes you downloaded the Brown Corpus):
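One way to sketch this check is to read the first few words of the Brown Corpus. The helper name brown_sample is an illustrative assumption; it returns None instead of raising an error when NLTK or the corpus is missing.

```python
def brown_sample(n=10):
    """Return the first n words of the Brown Corpus, or None if it is unavailable."""
    try:
        from nltk.corpus import brown
        # brown.words() is loaded lazily; slicing reads only the words we need.
        return list(brown.words()[:n])
    except (ImportError, LookupError):
        # ImportError: NLTK is not installed. LookupError: the corpus data is missing.
        return None


print(brown_sample(5))
```

If this prints None, either NLTK is not installed or the Brown Corpus has not been downloaded yet.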




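If the data is installed correctly, the result is the list of words in the Brown Corpus, which begins with 'The', 'Fulton', 'County', 'Grand', 'Jury'.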