HTML tables are a very common way to display information. When building scrapers, you often need to extract data from tables on web pages and convert it into a different structured format such as JSON, CSV, or Excel. In this article, we discuss how to extract data from HTML tables using Python and Scrapy.
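To make the idea concrete, here is a minimal sketch of turning a table into a list of records that can be dumped as JSON. It is an illustration under stated assumptions rather than the article's own method: it uses only the standard library's `html.parser` instead of Scrapy selectors so that it runs without installing anything, the names `TableParser`, `table_to_records`, and `SAMPLE_HTML` are hypothetical, and it assumes a simple table whose first row is the header and which contains no nested tables or spanning cells.

```python
import json
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row.

    Assumes simple, non-nested tables (an illustrative simplification).
    """

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being parsed
        self._cell = None  # text fragments of the cell currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside an open cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # drop any cell left open by malformed markup


def table_to_records(html):
    """Return one dict per body row, keyed by the header row's cell text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Hypothetical sample markup, used only for this illustration.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Price</th></tr>
  <tr><td>Widget</td><td>9.99</td></tr>
  <tr><td>Gadget</td><td>24.50</td></tr>
</table>
"""

records = table_to_records(SAMPLE_HTML)
print(json.dumps(records, indent=2))
```

Inside a Scrapy spider, the same walk over rows and cells would typically be written with `response.xpath` or `response.css` rather than a hand-rolled parser, and each record would be yielded as an item so that Scrapy's feed exports can write it out.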
If you’re involved in any kind of web data extraction project, you’ve probably heard about headless browser scraping.
If you’ve been using Scrapy for any length of time, you know what a well-designed Scrapy spider is capable of.