From LLM-powered extraction to agentic pipelines, here's how AI is reshaping every stage of the web scraping workflow in 2026 and what it means for your stack.
Super-powers, toll booths and the new era of data collection
Is your AI coding assistant stuck in the past?
The Scrapy whisperer: Adrian Chaves on Web Scraping Copilot
Most Recent
Announcement
Shubber Gettogether 2018: Gathering Innovators and Thinkers
Leadership
Data Quality Assurance For Enterprise Web Scraping
Leadership
What I Learned As A Google Summer Of Code Student At Zyte
Leadership
GDPR Compliance For Web Scrapers: The Step-by-step Guide
2018 has been a great year at Zyte, so there was no better way to cap it off than with a company retreat in Lisbon, Portugal.
When it comes to web scraping, one key element is often overlooked until it becomes a big problem.
Google Summer of Code (GSoC) was such a great experience for students like me. I learned so much about open source communities as well as contributing to their complex projects.
Unless you’ve been living under a rock for the past few months, you know that the EU’s General Data Protection Regulation (GDPR) is upon us.
Web scraping can look deceptively easy these days. There are numerous open-source libraries/frameworks, visual scraping tools, and data extraction tools that make it very easy to scrape data from a website.
Unbeknownst to many, there is a data revolution happening in finance. In their never-ending search for alpha, hedge funds and investment banks are increasingly turning to new alternative sources of data to give them an informational edge over the market.
Throughout the history of the financial markets, information has been power.
Over the last couple of weeks, GDPR has brought data protection center stage. What was once a fringe concern for most businesses became, overnight, a burning problem that needed to be solved immediately.
It’s been another standout year for Scrapinghub and the scraping community at large. Together we crawled 79.1 billion pages (nearly double 2016), with over 103 billion scraped records; what a year!
We’re very excited to announce a new look for Zyte!
This is a guest post from the folks over at Intoli, one of the awesome companies providing Scrapy commercial support and longtime Scrapy fans.
It has become very easy to do machine learning: you install an ML library like scikit-learn or xgboost, choose an estimator, feed it some training data, and get a model that can be used for predictions.