7. Questions to ask when scaling web scraping

The questions below, grouped by who you would ask, can help you and your team plan how to scale your web scraping operation.


Questions to ask the stakeholders


  • What business problem are we trying to solve with web data?

  • How does this align with our key business objectives?

  • Are you aware of the potential challenges and risks associated with web scraping at scale?

  • Does the plan for this effort account for both the initial capital investment and the ongoing funding required?

  • What’s the budget?

  • What is the timeline of this effort?

  • What is the priority of this effort compared to other initiatives?

  • Are we currently operating in a legal and ethical manner?

  • How well does our legal team understand the complexities of web scraping? 

  • Do we have people who are experienced in managing large web scraping operations?

  • What are your expectations for return on investment for this effort?

  • How do you see web scraping at scale evolving within the business over the next five years?


Questions to ask your QA and development teams


  • What tooling is part of our web scraping tech stack?

  • How scalable is our tooling? If it’s not scalable, what would we need to do to change that?

  • Do we have people who are experienced in managing and maintaining large web scraping operations?

  • Do we have enough resources to scale?

  • What challenges have we encountered so far and how did we overcome them?

  • Do we have the infrastructure required to scale? If not, what is needed?

  • How are we currently managing website bans and how would these strategies change at scale?

  • How is the team currently managing web scraping projects?

  • What’s our current end-to-end workflow across development, testing, deployment, and maintenance?

  • How will scaling affect ongoing projects and priorities?


  • Do you trust data from targeted websites? 

  • How do you find reliable and quality web data?

  • What happens when you find conflicting data gathered from multiple sources?


Questions to ask a third party


  • What industries have you worked with?

  • What data have you been asked to extract?

  • What is your web scraping tech stack?

  • How do you handle website changes and anti-bot technology?

  • How do you ensure the quality of the data at large volumes?

  • How do you ensure legal compliance within your scraping projects?

  • What ethical guidelines do you follow?

  • Are you certified by the Ethical Web Data Collection Initiative (EWDCI)?

  • What’s your project management approach?

  • How do you communicate with clients?

  • What is your pricing structure, and are there any additional costs?

  • What is your security strategy to protect the data collected?

  • How do you handle security incidents?

  • What are your support strategies and SLAs?


Scaling a web scraping operation has a lot of moving parts. Our web scraping specialists can help you build a stronger business case and get the best results from your scaling efforts. If you are scaling in-house, your development team can sign up for a free trial of Zyte API to get hands-on experience with an end-to-end tool for crawling, unblocking, and extracting data that can help your business scale.