Field notes from the world of data extraction.

Articles, interviews and analysis on how data is gathered, used and fought over — written by the people closest to it.


Use case

Introducing Zyte Web Data for Claude Code: Production-ready scraping from a prompt

Developers are embracing agentic coding tools - but data engineers need tools with specialist scraping skills.

Valter Sciarrillo · 10 min read · June 3, 2026

Use case

Copilot like a pro: Eight tips that supercharged my workflow

AI-assisted coding is a revelation. But are you getting the most out of your IDE’s sidebar sidekick?

Shivanshu Srivastava · 10 min read · June 2, 2026

How To

Automate deployment of your web scraper on VPS with Ubuntu 24.04 cloud-init

Your VPS is ready, but now you need to work through the same sequence you have run a dozen times before: apt update, apt install python3-pip, pip install scrapy, playwright install chromium, the Chromium dependency list that never installs cleanly on the first try, Redis, possibly Postgres, whatever else this particular project needs.

Ayan Pahwa · 9 min read · May 31, 2026

Large Language Models (LLMs)

What multi-agent orchestration looks like in a large-scale web scraping project

Multi-agent orchestration is having its moment. The diagrams are everywhere now. Boxes for planners, boxes for hands, boxes for daemons, arrows to a shared brain, a human floating at the top. They keep getting prettier. The part where the web pushes back is still the part nobody draws.

Neha Setia Nagpal · 18 min read · May 31, 2026

Use case

Announcing powerful new spending controls and usage insights for Zyte API

Consign bill-shock to the trashcan. New custom spending limits and usage insights put data-gatherers in control.

Valter Sciarrillo · 10 min read · May 28, 2026

Use case

Meet the new-look Zyte Domain Health Hub: Your command center for data extraction performance

Monitor your data-gathering pipelines like a boss, and act on domain issues in real time.

Valter Sciarrillo · 10 min read · May 27, 2026

Large Language Models (LLMs)

NotAnInterview: “I Have Superpowers Now”

The problem was a project with 12,000 websites to crawl, and there’s no world where you write custom spiders for 12,000 websites, not with a human team and certainly not sustainably. So Javier built a workflow: a set of AI prompts that could analyze a website, figure out its structure, and generate a crawl configuration that a generic spider could then use.

Neha Setia Nagpal · 9 min read · May 27, 2026

Web scraping APIs

Building a self-hosted browser scraping service (is it more hassle than it's worth?)

If you want to understand exactly how a browser scraping service works at the infrastructure level, or you have a steady workload that you want running on hardware you already own, building one yourself teaches you things that matter. Here's how I did it.

John Rooney · 8 min read · May 26, 2026

How To

Web scraping on 22 KB of RAM: Fitting the world on an ESP8266 microcontroller

Data-gathering doesn’t have to be memory-intensive. You can fit the world’s weather on a 9cm-square board, when you move the work to a web scraping API.

Ayan Pahwa · 8 min read · May 25, 2026

Large Language Models (LLMs)

I built scraping agents for 30 days - here’s what I learned

For the last 30 days, I did one thing almost exclusively: I built scraping systems with AI agents, from the ground up, across real targets, with real deadlines. Not prototypes designed to impress in a demo, not isolated experiments running against a toy website, but production-grade pipelines that needed to ship and keep running.

John Rooney · 11 min read · May 25, 2026

How To

I'm not the same developer I was before LLMs

I've been running a series of conversations with developers at Zyte to understand what's actually changed in the way they work since LLMs showed up. Not the headlines. The day-to-day. What they delegate, what they don't, what they notice, what surprises them. This one was different on two counts.

Neha Setia Nagpal · 15 min read · May 25, 2026

How To

Flatcar Linux for web scrapers: deploy immutable containers with just one config file

The next time you spin up a VPS to give it a persistent home, you spend the better part of an afternoon rebuilding from memory. Here's a tool to help, using Flatcar Linux.

Ayan Pahwa · 11 min read · May 25, 2026
