Reflecting on the 2022 Web Data Extraction Summit: A Memorable Experience

We held the 2022 Web Data Extraction Summit three weeks ago. I wanted to extend a huge thank you to everyone who came, especially our guest speakers, who shared some great insights throughout the day.

The summit has changed a lot over the last few years, so I thought I’d take some time to reflect and talk about some of my favorite moments. Don’t worry if you couldn’t make it along. All of our recordings are now live, so if you’re interested in anything I mention, you can learn more using the link below.

A successful summit

Despite months of planning, I was still a little nervous as the day began. Between turnout, technical glitches, and cancellations, there was plenty that could’ve gone wrong, but it was all worth it. We had a good split of in-person and virtual guests, and nearly everyone stayed until the end.

In addition to the learning aspect, the summit had a great social atmosphere. It was encouraging to see attendees get along, engage with the speakers and come along to the after-party. I even met some attendees in a coffee shop the day after and we caught up before my flight home.

"Very professional and inclusive. Thought provoking content. Very good use of my time."

We experimented with a different format from last year’s. Instead of parallel events, we delivered a more linear agenda which, I think, meant that there was plenty of variety throughout the day. One minute there was a live code demonstration, the next, our Chief Legal Officer, Sanaea Daruwalla, was walking people through the regulations and ethics of web scraping.

Overall, I think we managed to exhibit a healthy cross-section of current and future issues within data extraction. We’re still at a stage where knowledge transmission is slow, so I think attendees particularly enjoyed finding they shared the same struggles and learning new approaches.

The changing face of attendees

With things being back in-person this year, I was completely blown away by the effort that attendees made to come along. We had people travel from every corner of the globe, including the US, Europe, India, and other parts of Asia. We always wanted the Web Data Extraction Summit to be ‘the event’ for our industry, so I’m always taken aback at just how popular it’s become.

The web data extraction industry feels so much larger nowadays and I think that growth is reflected in the types of people you meet and projects you hear about. As in previous years, we had lots of software engineers, developers and the like come along. However, one thing that stood out to me this time round was that attendees were more senior. I spoke to managers, team leads, CEOs, and even whole data teams.

I also had the chance to speak with a university professor who attended the summit. They mentioned that they’d previously been teaching web data extraction to post-grads but had recently delivered sessions to undergraduate students.

Altogether, I think this suggests we’re entering a new era of web data extraction where our tools and methods will become more mainstream and accessible - something that Victor Bolu highlighted in his talk, “The future of no-code web scraping”.

Some of my favorite moments from this year

A lot of the talks were reflective of the trends I’d identified in my state of the industry address and it was interesting to see the different approaches to each topic. For example, Neil Emeigh’s session explored the use of web proxies to combat bans and how developers need to be mindful of ethical sourcing throughout.

This built on some of the compliance and scaling issues I had alluded to earlier and segued well into Sanaea Daruwalla’s discussion on the legal do’s and don’ts of web data extraction. As web extraction grows, I believe it’s important to champion these standards, especially in areas where we’re outpacing existing regulations.

Later, Glen De Cauwsemaecker gave an excellent presentation on maintaining data quality while growing your data feeds. The topic of scaling comes up each year, so it was refreshing to see how he balanced the need for actionable insights with his growth aspirations. Glen walked us through lessons from building extraction infrastructure over the last decade, much of which overlapped with James Kehoe’s session on the data maturity model.

Of course, it wouldn’t be a conference on web data extraction without numerous talks on crawling methods. We were joined by Peter Bray and Guillaume Pitel, each of whom explored how machine learning could hasten and enhance the data extraction and categorization processes. I think we’ll begin to see more of these tools as web data extraction is applied to new use cases and research questions, so make sure you catch their talks at some point.

Finally, two sessions really stood out for me personally for using web data extraction in truly novel ways. The first was Alexander Lebedev’s session on data mining amid the war in Ukraine. He gave an intimate look into a global conflict and showed how web data extraction could help him navigate fundamental questions like when to sleep in a war zone. The second was Hannes Datta’s talk on the use of web data in academic research. Similar to Alexander’s talk, Hannes outlined how web data extraction could help researchers understand human behavior and online trends.

Watch the sessions for yourself

I’m incredibly grateful to everyone who came along to our 2022 summit, so I’d like to share one last thank you; we hope you enjoyed this year’s event as much as we did. We’re already making plans for next year, so if you’d like to stay in the loop and hear updates, you can register for early access tickets.

Alternatively, if you weren’t able to make it along or you’d like to watch some of the sessions again, I’m pleased to announce that all of our recordings are now live! You can watch them all for free as many times as you like using the link here.

Read Time: 4 Mins
Posted on: October 25, 2022
Announcement
By: Shane Evans