A few years ago, I found myself at PyCon US during the twentieth anniversary of BeautifulSoup.
It was a fascinating session where people from all corners of the community stood up to share projects they had built using this incredible library. I was captivated by the sheer variety of applications. But more than that, I was struck by the intersection of this technical tool with real-world data and, by extension, with society itself.
That experience was my first deep introduction to the world of web scraping, and it set me on a path to understanding that this practice is never just a technical endeavor.
While I’m a QA Engineer by trade and don't scrape data professionally, my passion lies at the intersection of statistics, data, and society. I believe we can use data to better understand and improve the world. This belief has led me to explore web scraping not as a mere tool for data extraction, but as a social practice: an act that carries with it a host of decisions, considerations, ethical responsibilities, and necessary compromises.
In our data-hungry world, especially with the emergence of Large Language Models (LLMs) trained on data scraped from the web on a colossal scale, everyone is now a scraper, whether directly by writing one or indirectly by using tools built on scraped data. This makes it more crucial than ever to discuss how we can do it responsibly.





