
How an analytics platform solved a ‘hard-to-scrape’ site using Zyte API

Read time: 10 mins
Posted on: November 12, 2025
Behind the scenes of how Zyte API restored access to a major e-commerce site — automatically.

Every web data engineer has that one target site that’s just out of reach.


For one retail analytics platform helping brands track competitors’ prices, that site was a major e-commerce marketplace.


The company’s in-house data scraper could already:


  • Crawl product listings.

  • Extract structured data for pricing.

  • Refresh data hourly to feed analytics dashboards.


But the mission-critical marketplace proved problematic, thanks to its advanced anti-bot measures.


Bans spiked, CAPTCHA walls appeared, product pages returned 403 “Forbidden” errors. Over time, data extraction success rates dropped below 60%. For the analytics platform’s customers, market insight integrity was on the line.

The challenge

Searching for a solution, the company’s team tried:


  • Larger proxy pools, to widen the IP footprint.

  • Headless Chrome orchestration, for real-browser presentation.

  • Custom request headers, to mimic real browser traffic.


Each fix looked promising, but none held up for more than a few days. The result: unstable feeds and costly maintenance cycles.


The target site was tricky because it combined multiple anti-bot layers:


  1. IP reputation tracking: Requests from the same network group, identified by its Autonomous System Number (ASN), were throttled.

  2. TLS fingerprinting: The site inspected the TLS “fingerprint” of each incoming connection and flagged headless Chrome sessions as synthetic.

  3. Dynamic JavaScript challenges: The site issued challenges that only a full browser could execute, and required the resulting proof tokens in subsequent HTTP requests.

  4. Behavioral heuristics analysis: Machine-like scroll intervals triggered soft bans.


In other words, the company didn’t just have a “proxy problem.” It had an unblocking problem.
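To make the symptoms above measurable, here is a minimal sketch of how a scraper might classify blocked responses and track its success rate. The status codes and CAPTCHA markers are illustrative assumptions, not the target site's actual signals, and `Outcome`, `is_blocked`, and `success_rate` are hypothetical names.

```python
from dataclasses import dataclass

# Assumed block signals: 403 comes from the article; 429 and the
# CAPTCHA markers are illustrative guesses that vary by site.
BLOCK_STATUSES = {403, 429}
CAPTCHA_MARKERS = ("captcha", "verify you are human")


@dataclass
class Outcome:
    """Result of one fetch attempt: HTTP status and response body."""
    status: int
    body: str


def is_blocked(outcome: Outcome) -> bool:
    """Return True if the response looks like a ban or CAPTCHA wall."""
    if outcome.status in BLOCK_STATUSES:
        return True
    lowered = outcome.body.lower()
    return any(marker in lowered for marker in CAPTCHA_MARKERS)


def success_rate(outcomes: list) -> float:
    """Share of attempts that returned 200 and were not blocked."""
    if not outcomes:
        return 0.0
    ok = sum(1 for o in outcomes if o.status == 200 and not is_blocked(o))
    return ok / len(outcomes)
```

Tracking this number per target site is one way to notice a drop like the one described here before customers do.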

The turning point: automating unblocking

Instead of writing another custom workaround, the team switched its scraper’s request layer to Zyte API.


Integration time: Less than 30 minutes.


Code change: Just one request.

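import json
import requests

# Send the target URL to the Zyte API extract endpoint.
response = requests.post(
    "https://api.zyte.com/v1/extract",
    headers={
        # HTTP Basic auth: base64 of "<api-key>:" (key as username, empty password).
        "Authorization": "Basic <base64-enc-key>",
        "Content-Type": "application/json",
    },
    data=json.dumps({"url": "https://target-site.com/product/1234"}),
)
print(response.text)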

That’s it. Zyte API handled everything else:


  • Full browser rendering to handle the JavaScript challenge.

  • Smart IP rotation with session awareness.

  • Automatic retries on soft bans.

  • Dynamic fingerprint randomization.


Within minutes, the team’s scraper was back online—no manual tuning required.
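The snippet in this section passes a pre-encoded Basic token and only a URL. The sketch below shows how that token can be derived from an API key, and how browser rendering might be requested explicitly. It assumes a `browserHtml` request and response field, which should be checked against the Zyte API reference; the helper names are hypothetical, and the code builds the request without sending it.

```python
import base64
import json

ZYTE_ENDPOINT = "https://api.zyte.com/v1/extract"  # endpoint used in the article


def basic_auth_header(api_key: str) -> str:
    """Build the Basic auth value: the key as username, empty password."""
    token = base64.b64encode(f"{api_key}:".encode("ascii")).decode("ascii")
    return f"Basic {token}"


def build_browser_request(api_key: str, url: str):
    """Return (headers, body) for a browser-rendered extraction request."""
    headers = {
        "Authorization": basic_auth_header(api_key),
        "Content-Type": "application/json",
    }
    # Assumption: "browserHtml": True asks the API for rendered HTML.
    body = json.dumps({"url": url, "browserHtml": True}).encode("utf-8")
    return headers, body


def extract_html(response_json: dict) -> str:
    """Pull the rendered HTML out of a parsed JSON response, if present."""
    return response_json.get("browserHtml", "")
```

The headers and body can then be handed to any HTTP client, for example `requests.post(ZYTE_ENDPOINT, headers=headers, data=body)`.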

The results

After a one-day rollout:


  • Success rates at the analytics platform jumped from 60% to 98%.

  • Latency decreased by 23%.

  • Monthly proxy costs dropped by about 41%.

Metric             Before       After Zyte API
Success rate       60%          98%
Average latency    12.1s        9.3s
Proxy costs        $2,900/mo    $1,700/mo
Maintenance time   30 hrs/wk    4 hrs/wk

This was more than an incremental improvement: it transformed the team’s infrastructure.
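For readers who want to check the relative changes themselves, here is a short sketch that computes them from the before and after figures in the table. The `percent_change` helper is a hypothetical name; the input values are taken directly from the table.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


# Before/after values as reported in the results table.
metrics = {
    "Average latency (s)": (12.1, 9.3),
    "Proxy costs ($/mo)": (2900, 1700),
    "Maintenance time (hrs/wk)": (30, 4),
}

for name, (before, after) in metrics.items():
    print(f"{name}: {percent_change(before, after):+.1f}%")
```

Success rate is left out because it is usually clearer as a difference in percentage points (60% to 98% is a gain of 38 points) than as a relative change.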

The broader impact

Beyond success rate, Zyte API changed the company’s entire operating rhythm.


  • Faster iteration: New sites could be added in hours instead of days.

  • Simpler onboarding: Junior engineers could deploy crawlers without deep proxy knowledge.

  • Predictable spend: Pricing according to site complexity (easy, medium or hard) meant no surprises.

  • Confidence: Data delivery SLAs hit 99.9% reliability for the first time.


The engineering manager summarized it perfectly:


“We stopped fighting bans. Zyte API just gets the data.”

Why it works

Zyte API’s unblocking is built for evolving defenses:


  • Adaptive routing: Chooses the best IP pool and geolocation based on target response patterns.

  • Full rendering engine: Executes JavaScript and handles cookies and dynamic tokens in a real browser context.

  • Self-healing retry logic: Recognizes ban patterns and auto-adjusts strategy.

  • Observability: Built-in telemetry on success rate, latency, and cost per request.


This combination means no static configuration ever goes stale. The system learns from every request.
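To illustrate the general idea behind retry logic that reacts to ban patterns, here is a simplified client-side sketch. It is not Zyte API's internal implementation. The ban-like status codes, the backoff schedule, and the `fetch_with_retries` name are all assumptions made for the example.

```python
import time

# Assumed ban-like statuses; a real system would use richer signals.
BAN_STATUSES = (403, 429)


def fetch_with_retries(fetch, url, max_attempts=4, base_delay=1.0,
                       sleep=time.sleep):
    """Call fetch(url) -> (status, body), retrying on ban-like statuses.

    Waits base_delay, 2*base_delay, 4*base_delay, ... between attempts
    and returns the last response if every attempt is banned.
    """
    status, body = None, None
    for attempt in range(1, max_attempts + 1):
        status, body = fetch(url)
        if status not in BAN_STATUSES:
            return status, body
        if attempt < max_attempts:
            # Exponential backoff before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    return status, body
```

A managed service can go further than this sketch, for example by changing IP pool or browser fingerprint between attempts rather than only waiting.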

The takeaway: from fragile to future-proof

If you’re still managing proxies and browser fleets manually, you’re solving yesterday’s problem.


Modern unblocking is about automation, predictability, and performance.


Zyte API gives you all three in one endpoint.

curl "https://api.zyte.com/v1/extract" \
  -u "YOUR_API_KEY:" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'

That command is the new baseline for reliable web data.

Ready to see it yourself?

Try the same test our customer ran:


  1. Pick your most difficult site.

  2. Run it through Zyte API’s free trial.

  3. Compare success rates and latency.
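The comparison in step 3 can be scripted. Below is a minimal sketch of a harness that runs any fetch function over a list of URLs and reports success rate and mean latency. The `benchmark` name and the convention that `fetch(url)` returns a truthy value on success are assumptions for illustration.

```python
import time
from statistics import mean


def benchmark(fetch, urls, clock=time.perf_counter):
    """Run fetch(url) for each URL; report success rate and mean latency.

    `fetch` should return a truthy value on success. Exceptions are
    counted as failures rather than aborting the run.
    """
    latencies = []
    successes = 0
    for url in urls:
        start = clock()
        try:
            ok = bool(fetch(url))
        except Exception:
            ok = False
        latencies.append(clock() - start)
        successes += ok
    return {
        "success_rate": successes / len(urls) if urls else 0.0,
        "mean_latency_s": mean(latencies) if latencies else 0.0,
    }
```

Running it once with an existing scraper's fetch function and once with a function that calls Zyte API, over the same URL list, gives two directly comparable result dictionaries.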


📈 Start your free Zyte API trial – Unblock 90% of your target sites in hours, not weeks.
