This is a way to write functions that hook into the start and end of a spider.
Place the following code in a file directly under the project directory.
from scrapy import signals


class SpiderHook:
    @classmethod
    def from_crawler(cls, crawler):
        # Create the extension instance and connect its methods to the spider signals
        ext = cls()
        crawler.signals.connect(ext.spider_opened, signal=signals.spider_opened)
        crawler.signals.connect(ext.spider_closed, signal=signals.spider_closed)
        return ext

    def spider_opened(self, spider):
        # Processing at the start of the spider
        pass

    def spider_closed(self, spider):
        # Processing at the end of the spider
        pass
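As an illustration of what the two hooks might contain, here is a minimal sketch that records how long each spider ran. The class name `TimingHook` and the attributes `started` and `last_elapsed` are hypothetical and not part of Scrapy. The Scrapy import is placed inside `from_crawler` only so that the hook methods can be tried without a running crawler.

```python
import logging
import time

logger = logging.getLogger(__name__)


class TimingHook:
    """Hypothetical extension that logs how long each spider ran."""

    def __init__(self):
        # Maps spider name -> monotonic start time
        self.started = {}
        self.last_elapsed = None

    @classmethod
    def from_crawler(cls, crawler):
        # Imported here so the hook methods can be exercised without Scrapy
        from scrapy import signals

        ext = cls()
        crawler.signals.connect(ext.spider_opened, signal=signals.spider_opened)
        crawler.signals.connect(ext.spider_closed, signal=signals.spider_closed)
        return ext

    def spider_opened(self, spider):
        # Remember when this spider started
        self.started[spider.name] = time.monotonic()

    def spider_closed(self, spider):
        # Compute and log the elapsed time, then forget the spider
        elapsed = time.monotonic() - self.started.pop(spider.name)
        self.last_elapsed = elapsed
        logger.info("Spider %s ran for %.2f s", spider.name, elapsed)
```

Because the state lives on the instance, `from_crawler` has to return an instance (`cls()`) rather than the class itself, otherwise `self` would not be bound when the signals fire.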
Then add a setting to settings.py so that Scrapy loads this class.
EXTENSIONS = {
    '<project name>.<file name>.SpiderHook': 100,
}
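The key in EXTENSIONS is a dotted import path: module path first, class name last. The sketch below mimics that kind of lookup using only the standard library, to show why the path has to match the file and class name exactly, with no stray whitespace. The helper `resolve_dotted_path` is hypothetical (Scrapy uses its own loader), and `collections.OrderedDict` merely stands in for '<project name>.<file name>.SpiderHook'.

```python
from importlib import import_module


def resolve_dotted_path(path):
    """Split 'package.module.ClassName' and return the named attribute."""
    module_path, _, name = path.rpartition(".")
    if not module_path:
        raise ValueError(f"not a full dotted path: {path!r}")
    module = import_module(module_path)
    # Raises AttributeError if the name does not match exactly
    return getattr(module, name)


# A well-formed path resolves to the class object
cls = resolve_dotted_path("collections.OrderedDict")

# A space after the dot makes the attribute lookup fail
try:
    resolve_dotted_path("collections. OrderedDict")
    lookup_failed = False
except AttributeError:
    lookup_failed = True
```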
Reference: https://doc.scrapy.org/en/latest/topics/extensions.html