The Scrapy tutorial part VI: Scraping Infinite Scroll Pages With Python

Developed by Pablo Hoffman and Shane Evans, Scrapy is an open-source Python framework built specifically for web data extraction. With Scrapy spiders, you can download HTML, parse and process the data, and save it in CSV, JSON, or XML format.

This video shows how to find and use underlying APIs that power AJAX-based infinite scrolling mechanisms in web pages.

After watching this video, you will know:

  • How to inspect the network requests from your browser
  • How to reverse engineer network requests
  • How to extract data from a JSON-based HTTP API
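
As a rough illustration of the last point, here is a minimal sketch of walking a paginated JSON API of the kind that often sits behind an infinite scroll. It is not taken from the video: the field names `items` and `has_next`, the helper `iter_api_items`, and the canned responses in `FAKE_PAGES` are all hypothetical, and a real API will use its own names and pagination scheme. The sketch uses only the standard library and injects the fetch function so it runs without network access.

```python
import json
from typing import Callable, Iterator


def iter_api_items(
    fetch: Callable[[int], str],
    start_page: int = 1,
    max_pages: int = 100,
) -> Iterator[dict]:
    """Yield items page by page until the API reports there is no next page."""
    page = start_page
    for _ in range(max_pages):  # hard cap so a misbehaving API cannot loop forever
        payload = json.loads(fetch(page))
        # "items" and "has_next" are assumed field names, not a real API contract.
        yield from payload.get("items", [])
        if not payload.get("has_next"):
            break
        page += 1


# Canned responses standing in for real HTTP calls (hypothetical data).
FAKE_PAGES = {
    1: '{"items": [{"id": 1}, {"id": 2}], "has_next": true}',
    2: '{"items": [{"id": 3}], "has_next": false}',
}


def fake_fetch(page: int) -> str:
    """Return the raw JSON body for the requested page number."""
    return FAKE_PAGES[page]


items = list(iter_api_items(fake_fetch))
```

In a Scrapy spider, the same logic would typically live in the parse callback: decode the response body as JSON, yield each item, and yield a new request for the next page while the API says more pages exist.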

If you haven't yet, we recommend that you first watch part I, part II, part III, part IV, and part V of our tutorial series.