Introducing Zyte Web Data for Claude Code: Production-ready scraping from a prompt

Read time: 10 min
Posted on: June 3, 2026
Use case: Developers are embracing agentic coding tools - but data engineers need tools with specialist scraping skills.
By: Valter Sciarrillo

Claude Code can help you write Python, but it doesn’t come with the hard-won, opinionated web-scraping know-how that makes spiders reliable: the Scrapy patterns, project structure, page objects, fixtures, and smoke tests that get a spider to actually run.

Today, we’re excited to release Zyte Web Data for Claude Code: Zyte’s official Claude Code plugin that takes you from a plain-English prompt to a working Scrapy spider.

What is it?

Zyte Web Data for Claude Code is a Claude Code plugin that turns natural-language instructions into production-ready Scrapy spiders with web-poet page objects, ready to run and extract data. Describe data, get data.

Give it a URL and describe what you want to extract. It handles:

  • Site exploration.
  • Schema discovery and approval (you confirm the fields before code generation).
  • Code generation (project scaffold, page objects and spider wiring).
  • Smoke testing so you get a runnable project, not a snippet.

Optionally, you can deploy directly to Scrapy Cloud for scheduled runs, job history, and monitoring.
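To make the schema-approval step more concrete, here is a minimal sketch of what a confirmed field list might look like when captured as a Python dataclass. The class name `ProductItem`, its field names, and the helper `schema_summary` are all hypothetical illustrations; the post does not show the schema format the plugin actually uses.

```python
from dataclasses import MISSING, dataclass, fields
from typing import List, Optional, Tuple


@dataclass
class ProductItem:
    """Hypothetical product schema; the field names are illustrative only."""

    url: str
    name: str
    price: Optional[str] = None
    currency: Optional[str] = None
    availability: Optional[str] = None


def schema_summary(item_cls) -> List[Tuple[str, str]]:
    """List each field with whether it is required or optional.

    A field counts as required when it has neither a default value
    nor a default factory.
    """
    summary = []
    for f in fields(item_cls):
        has_default = f.default is not MISSING or f.default_factory is not MISSING
        summary.append((f.name, "optional" if has_default else "required"))
    return summary
```

A reviewer could read the output of `schema_summary(ProductItem)` as a checklist: required fields must be present on every page, while optional ones may be missing on some variants.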

The problem we’re solving

The friction in web scraping is rarely the first 30 minutes. It’s the next 30 days:

  • You want repeatability, not a one-off answer.
  • You want inspectable code, not a black box.
  • You want a pipeline you can deploy, monitor, and maintain.

Most agentic coding assistants can generate something Scrapy-shaped. But they often miss the details that matter in real projects, like correct base classes, coherent structure across files, tests, monitoring, and the practicalities of web access (rendering, anti-bot, and configuration).

Our view is simple: Claude alone will try to write Scrapy spiders - but it gets important details wrong. With Zyte Web Data skills, it gets them right.

Combining the two makes it possible to build agentic web data workflows that can be industrialized.

Most valuable use cases

This release is especially useful for teams who:

  • Build and maintain multiple spiders (and want consistency across them).
  • Need fast iteration on requirements (fields, edge cases, pagination, variants).
  • Care about production readiness: reliability, auditability, and maintainability.
  • Want to shorten the distance between “idea” and “job running in the cloud”.

Common examples:

  • Product catalogs and e-commerce monitoring.
  • Competitive intelligence feeds.
  • Marketplace inventory and pricing.
  • News/article monitoring (with structured extraction).

How does it work?

The plugin is packaged as a set of 14 reusable skills. The main one is /scrape, which orchestrates a five-stage pipeline automatically:

  1. Decide which fields to extract (/scrape-define)
  2. Analyze the website (/scrape-spec)
  3. Create the Scrapy project (/scrape-ensure-project)
  4. Generate the extraction code (/scrape-codegen)
  5. Generate the spider (/scrape-create-spider)

When the pipeline completes, you have a runnable spider and a passing test suite:

uv run scrapy crawl <spider_name>
uv run pytest fixtures/
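If you want to run both checks from a script (for example in CI), the two commands above could be wrapped along these lines. This is a sketch under assumptions: the function names `verification_commands` and `run_checks` are hypothetical and not part of the plugin, and only the two command lines themselves come from this post.

```python
import subprocess
from typing import Callable, List, Sequence


def verification_commands(spider_name: str) -> List[List[str]]:
    """Build the crawl and test commands for a given spider name."""
    return [
        ["uv", "run", "scrapy", "crawl", spider_name],
        ["uv", "run", "pytest", "fixtures/"],
    ]


def run_checks(
    commands: Sequence[Sequence[str]],
    runner: Callable = subprocess.run,
) -> bool:
    """Run each command in order and stop at the first non-zero exit code.

    `runner` defaults to subprocess.run; it is injectable so the logic
    can be exercised without actually launching processes.
    """
    for cmd in commands:
        result = runner(list(cmd))
        if result.returncode != 0:
            return False
    return True
```

Calling `run_checks(verification_commands("books"))` from the project directory would run the crawl first and the fixture tests second, returning `False` as soon as either fails. The spider name `books` is only an example.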

The beauty of the environment, however, is that you don’t have to call any of these skills explicitly - Claude Code will figure out for itself when to invoke them based on the context of your task.

Zyte’s 14 skills span the entire scraping workflow…

Orchestration

Skill                    Description
scrape                   End-to-end web scraping workflow - from URL to working spider with web-poet page objects

Pipeline stages (called automatically by /scrape)

Skill                    Description
scrape-define            Quick schema definition: explore one detail page, discover fields, fast approval loop
scrape-spec              Explore diverse pages and validate the extraction spec: downloads pages, compares variants, optional browser review
scrape-explore-site      Explore a website to find and save diverse pages (start, list, detail) with classified links
scrape-analyze-page      Extract all available fields with values from a detail page
scrape-ensure-project    Ensure a Scrapy project exists with scrapy-poet and Zyte API support
scrape-codegen           Generate web-poet page object code from an extraction spec
scrape-codegen-analyze   Analyze an HTML page to produce field extraction instructions for code generation
scrape-codegen-generate  Generate web-poet page object code from per-page extraction analyses
scrape-create-spider     Generate a Scrapy spider that wires page objects together
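For readers new to the page-object idea, the sketch below approximates it using only the Python standard library: one class per page type, one property per field, and a method that assembles the item. It is not web-poet's API and not the code the plugin generates; the class names, CSS class names, and sample markup are assumptions chosen for illustration.

```python
from html.parser import HTMLParser
from typing import Dict, Optional


class _ClassTextCollector(HTMLParser):
    """Collect the text of the first element carrying each wanted CSS class.

    Deliberately simple: it assumes the target elements contain plain
    text and no nested tags.
    """

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.found: Dict[str, str] = {}
        self._current: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for cls in classes:
            if cls in self.wanted and cls not in self.found:
                self._current = cls
                self.found[cls] = ""
                break

    def handle_data(self, data):
        if self._current is not None:
            self.found[self._current] += data

    def handle_endtag(self, tag):
        # Any closing tag ends collection for the current field.
        self._current = None


class ProductPage:
    """Page-object-style wrapper around one product detail page."""

    def __init__(self, url: str, html: str):
        self.url = url
        collector = _ClassTextCollector({"product-title", "price"})
        collector.feed(html)
        self._fields = collector.found

    @property
    def name(self) -> Optional[str]:
        return self._fields.get("product-title", "").strip() or None

    @property
    def price(self) -> Optional[str]:
        return self._fields.get("price", "").strip() or None

    def to_item(self) -> dict:
        """Assemble the extracted fields into a plain dict."""
        return {"url": self.url, "name": self.name, "price": self.price}
```

The point of the pattern is separation of concerns: extraction logic lives in the page object and can be tested against saved HTML, while the spider only decides which URLs to visit and which page object to apply.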

Utilities

Skill                    Description
scrape-add-page-object   Add an empty web-poet page object to a Scrapy project
scrape-review-schema     Generate an HTML review page for schema and extracted data verification

Deployment

Skill                    Description
scrape-scrapy-cloud      Deploy projects, schedule spiders, list/stop jobs, and view items or logs on Scrapy Cloud
scrape-zyte-login        Set up your Zyte account and credentials

Who is this for?

Zyte’s Claude Code plugin is for:

  • Web scraping engineers who want to move faster without lowering standards
  • Data engineers who need a repeatable pipeline, not a brittle script
  • Developers who occasionally need web data but don’t want to become scraping specialists
  • Teams building agentic systems that still require reliable web data as an input layer

If you’ve ever thought “I don’t want a demo - I want a spider that runs,” this is for you.

Try the plugin now

To install the plugin, run:

claude plugin marketplace add zyte-ai/claude-skills
claude plugin install zyte-web-data@zyte-ai

If Claude Code is already running, reload plugins in the active session:

/reload-plugins

After installation, quick-start:

/scrape https://books.toscrape.com/ products

Finding further info

Docs and install details are online now:

  • GitHub repo: https://github.com/zyte-ai/claude-skills
  • Claude Code docs: Discovering and installing plugins

If you hit an issue (unexpected prompts, excessive wall time/cost, broken flows), please open a GitHub issue with enough detail to reproduce - feel free to anonymize target sites/data.

What’s next

This is the first step in a larger end-to-end agentic workflow for web data.

Next, we’ll continue expanding distribution and integrations so the same capabilities can be used in more agent environments beyond Claude Code, while keeping the core principle intact.

Across all these surfaces, the aim will be the same: Describe data. Get data. But also: ship something you can run again tomorrow.
