Explore resources by topic or category

Blog

How to Extract Data From HTML Table

Pawel Miech
5 Mins
August 13, 2023
HTML tables are a very common format for displaying information. When building scrapers, you often need to extract data from HTML tables on web pages and convert it into another structured format such as JSON, CSV, or Excel. In this article, we discuss how to extract data from HTML tables using Python and Scrapy.
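As a rough illustration of the table-to-JSON/CSV idea in this teaser, here is a minimal sketch. It is not the article's method: it uses only the Python standard library instead of Scrapy, and it assumes the table is well-formed XML, which real-world HTML often is not. The sample table and the variable names (`TABLE`, `records`, `as_json`, `as_csv`) are hypothetical.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sample table; assumed to be well-formed so ElementTree can parse it.
TABLE = """
<table>
  <tr><th>Name</th><th>Price</th></tr>
  <tr><td>Lamp</td><td>20</td></tr>
  <tr><td>Desk</td><td>150</td></tr>
</table>
""".strip()

table = ET.fromstring(TABLE)
rows = table.findall("./tr")

# The first row holds the column headers, the remaining rows hold the data cells.
headers = [th.text for th in rows[0].findall("./th")]
records = [
    dict(zip(headers, [td.text for td in row.findall("./td")]))
    for row in rows[1:]
]

# Serialize the same records as JSON ...
as_json = json.dumps(records)

# ... and as CSV, written to an in-memory buffer rather than a file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=headers, lineterminator="\n")
writer.writeheader()
writer.writerows(records)
as_csv = buffer.getvalue()

print(as_json)
print(as_csv)
```

All cell values stay strings here; converting prices to numbers, handling `colspan`/`rowspan`, and coping with malformed markup are left out of this sketch.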

Blog

Storing and Curating Your Web Crawling Data

Fernando Tadao Ito
9 Mins
August 4, 2023
Web crawlers are becoming increasingly popular in the era of big data, especially now with the advent of Large Language Models (LLMs) such as ChatGPT and LLaMA. The sheer volume of publicly available web data supports a wide variety of applications, including market research, sentiment analysis, and predictive modeling.

Blog

Python lxml tutorial | Guide to Web Scraping with python lxml library

Felipe Boff Nunes
6 Mins
May 18, 2023
Whether you're trying to analyze market trends or gather data for research, web scraping can be a useful skill to have. This technique allows you to extract specific pieces of data from websites automatically and process them for further analysis or use.

Blog

How to use XPath to extract web data

Valdir Stumm Junior
6 Mins
October 27, 2016
Let's start with the basics: what is XPath? XPath is a powerful language that is often used for scraping the web. It allows you to select nodes or compute values from an XML or HTML document, and it is one of the languages you can use to extract web data with Scrapy.
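To make "select nodes or compute values" concrete, here is a small sketch. It assumes a hypothetical, well-formed snippet of markup and uses the limited XPath subset in Python's standard `xml.etree.ElementTree` rather than Scrapy selectors, so it only hints at what full XPath can do.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup, not taken from any real page.
DOC = """
<html>
  <body>
    <div class="product"><span class="name">Lamp</span><span class="price">20</span></div>
    <div class="product"><span class="name">Desk</span><span class="price">150</span></div>
  </body>
</html>
""".strip()

root = ET.fromstring(DOC)

# Select nodes: every <span class="name"> anywhere below the root element.
names = [el.text for el in root.findall(".//span[@class='name']")]

# Compute a value: ElementTree does not evaluate XPath functions such as sum(),
# so the matching nodes are selected with XPath and the total is computed in Python.
prices = [int(el.text) for el in root.findall(".//span[@class='price']")]
total = sum(prices)

print(names)  # ['Lamp', 'Desk']
print(total)  # 170
```

A full XPath 1.0 engine could evaluate an expression like `sum(//span[@class='price'])` directly; the split between selection and arithmetic above is a workaround for the standard library's subset.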