Scrapy is a Python framework for crawling and scraping. Rather than importing a library into your own code, you write your code to follow the framework's conventions.
$ pip install scrapy
To create a project, run the following command.
$ scrapy startproject [project name]
The project name can be anything you like. Running the command generates a set of directories and files for the project.
If you download without waiting between requests, you will put load on the site you are crawling, so be careful to pause between downloads.
Add the following line to settings.py inside the project folder.

DOWNLOAD_DELAY = 1
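As a sketch, the line sits alongside the other options in the generated settings.py; ROBOTSTXT_OBEY is another standard Scrapy setting, shown here only for context.

```python
# Fragment of settings.py: wait 1 second between requests
# so the target site is not overloaded.
DOWNLOAD_DELAY = 1

# New Scrapy projects also respect robots.txt by default.
ROBOTSTXT_OBEY = True
```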
Items are where the data you crawl is stored. Define a class in items.py.
class [name of the class](scrapy.Item):
    [name of the field] = scrapy.Field()

item = [name of the class]()
item['name of the field'] = 'Examples'
The details of crawling and scraping are written mainly in the spider. Enter the following command to create a spider.
$ scrapy genspider [spider name] [domain of the site to crawl]
This creates a [spider name].py file in the spiders folder.
After that, edit the spider to match the site you are crawling.
I would appreciate it if you could point out any mistakes.