How to Scrape Amazon Products with Python Scrapy

In this article we will learn how to scrape Amazon products with Python Scrapy. Amazon is one of the largest and most popular online marketplaces. It lists millions of products, so finding specific items can be a difficult task. Scrapy is an open source web scraping framework written in Python. It provides an easy way to extract data from websites by defining spiders, which are Python classes that define how to navigate web pages, extract data from them, and store the scraped data.

 

 

 

First of all, we need to install Scrapy. You can use pip for that:

    pip install scrapy

 

 

After that, open a command prompt and navigate to the directory where you want to create your project. Run the following command to create a new Scrapy project named amazon_scraper. This creates a directory structure for your project with several files and folders:

    scrapy startproject amazon_scraper

 

 

Now open the amazon_scraper/amazon_scraper/items.py file and define the structure of the item you want to scrape. For example, if you want to scrape product titles, prices, and ratings, you can define an item with one field for each of them. Note that we are not going to scrape every product on Amazon here, only kitchen products.

 

 

In Scrapy, a spider is responsible for defining how to crawl a website and extract data. Create a new Python file named amazon_spider.py in the amazon_scraper/amazon_scraper/spiders directory and write your spider class in it.

 

 

Now open a terminal or command prompt, navigate to the amazon_scraper directory, and run the following command to start the spider:

    scrapy crawl amazon -o products.csv

This will start the spider named amazon and save the scraped data to a CSV file named products.csv.

 

 

 

The resulting products.csv file contains one row for each scraped product.
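To check the result from Python, a short sketch like the following can load the file back with the standard csv module. The function name load_products is hypothetical, and the column names in the usage comment assume the title, price, and rating fields described earlier.

```python
import csv


def load_products(path):
    """Return the rows of a Scrapy CSV export as a list of dictionaries."""
    # newline="" is the setting the csv module recommends for files it reads.
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


# Example usage (assumes products.csv exists in the current directory):
# for product in load_products("products.csv"):
#     print(product["title"], product["price"], product["rating"])
```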
