February 18, 2021
Scrapy update: Better broad crawl performance

When crawling the web, there’s always a speed limit. A spider can't fetch pages faster than the host is willing to send them. Serving pages consumes resources: CPU, disk, network bandwidth, etc. These resources cost money. Unrestricted serving and extensive crawling make the worst combination. Such a combination could bring applications to […]

