A Conversation With Julia Medina and Mihaela Popova
First, it was a three-way conversation. I wanted Julia Medina and Mihaela Popova in the same room, both engineers working on very different parts of Zyte's stack, both using LLMs deeply, both with strong opinions. When I blocked their calendars, I spent a long time finding two hours where a morning in Argentina, an afternoon in Bulgaria, and an Indonesian evening actually overlapped. They did.
Second, there aren't many women on the internet talking about web scraping. That fact has bothered me for a while. So this conversation is also a small correction.
What I didn't expect was how much Julia and Mihaela would converge. They work on different projects, use different tools, and describe different workflows. And when I asked them to finish the sentence “I'm not the same developer I was before LLMs because…” they arrived at the same answer, in different words, from different directions.
That answer, and what it means for how we build things, is what this blog is about.
Julia Medina is a software engineer on Zyte's research team. She co-built the first version of Web Scraping Copilot and currently researches how to feed HTML into LLMs more efficiently.
Mihaela Popova is a software engineer at Zyte. She runs a large-scale data feed in a specialized vertical, where LLMs are part of the extraction pipeline itself, not just the development process.
Neha Setia is a Senior Developer Advocate at Zyte.
Julia: “I'm researching how to use HTML more efficiently when passing it to LLMs. A common task in Web Scraping Copilot, and in lots of other places, is giving an LLM a web page and asking it to do something with it. But if the HTML is too long, it either blows the context window or the model starts forgetting important parts. So I've been building tooling to simplify HTML without removing the parts that actually matter. I'll be sharing the findings soon. It ended up being a harder task than I expected, but I think the results are useful across the company, not just for the extension.”
Mihaela: “I'm on a project where we extract obituary data for a customer. A lot of it uses LLMs. We use an LLM to extract the obituary itself, and we use one to check whether something is actually an obituary before we extract it. When we discover a new website, we use an LLM to check if it's genuinely a site in this vertical, and if it's for humans rather than pets. It turns out there are a lot of websites for pets, and the customer isn't interested in those. We also use Zyte API's extractor to gather discovery data for thousands of websites.”
The first time Mihaela described this to me, I stopped her to make sure I'd heard correctly. Thousands of sites, multiple times a day, in a vertical I'd never thought about as a web data category. It's the kind of project that reshapes your sense of what web scraping even means. And sitting next to Julia's work on HTML simplification, it made something obvious: the interesting problems in this field are no longer just about getting the data out. They're about what the data is for, and what parts of the pipeline an LLM is genuinely good at.
Julia: “When we built the first version, I wasn't even using LLMs to help me code. But the extension itself uses LLMs in a specific way. You give it a schema. It downloads some pages, converts them into fixtures, which is a test structure you can use in a Scrapy project, and generates expected outcomes for each page. Those expected outcomes are what ‘correct’ looks like. Then it tries to generate the code to extract those values, runs it against the fixtures, and iterates. If the code doesn't produce the expected output, we tell the LLM what went wrong and ask it to fix it.
We did the initial research iteration and then handed the project to the ecosystem team to maintain. We still get pulled in whenever they need research analysis, evaluation improvements, or new features.”
The part that stuck with me from my conversation with Adrian, and that I wanted Julia to confirm from the inside, is how much of the value sits in that second use of the LLM. Everyone focuses on LLMs generating code. Fewer people talk about using an LLM to generate the expected truth, the ground-truth fixtures the generated code is measured against. That loop is the thing that lets you trust what comes out. Without it you're just vibe-coding a scraper.
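The generate-and-check loop Julia describes can be sketched in a few lines of Python. This is an illustration under assumptions, not Web Scraping Copilot's code: `propose` is a hypothetical stand-in for the LLM call, and a fixture is reduced here to a saved HTML string paired with its expected output.

```python
from typing import Callable, Optional

# A fixture pairs a saved page with the values a correct extractor should return.
Fixture = tuple[str, dict]          # (html, expected_output)
Extractor = Callable[[str], dict]   # html -> extracted values


def failures(extract: Extractor, fixtures: list[Fixture]) -> list[str]:
    """Run the extractor on every fixture and describe each mismatch."""
    problems = []
    for i, (html, expected) in enumerate(fixtures):
        try:
            got = extract(html)
        except Exception as exc:  # a crash is also feedback for the next attempt
            problems.append(f"fixture {i}: raised {exc!r}")
            continue
        for field, want in expected.items():
            if got.get(field) != want:
                problems.append(
                    f"fixture {i}: {field!r} expected {want!r}, got {got.get(field)!r}"
                )
    return problems


def generate_until_passing(
    propose: Callable[[list[str]], Extractor],
    fixtures: list[Fixture],
    max_rounds: int = 3,
) -> Optional[Extractor]:
    """Ask for code, test it against the fixtures, feed failures back, repeat."""
    feedback: list[str] = []
    for _ in range(max_rounds):
        candidate = propose(feedback)  # hypothetical stand-in for the LLM call
        feedback = failures(candidate, fixtures)
        if not feedback:
            return candidate
    return None  # nothing passed within the budget
```

The loop never has to judge the code by reading it. The fixtures decide, and the failure messages are the only thing passed back for the next attempt.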
Julia: “Heuristics. I have rules that contribute to the likelihood that a given node or attribute in the HTML contains useful data. Most of the time those rules work fine. If you keep the nodes with high probabilities and remove everything else, you end up with HTML that contains the data you care about and none of the rest. But there are false positives and false negatives. That's what the research is for. I have an annotated dataset generated by an LLM and revised by me, where I have marked the data that should be kept and its location, and I'm measuring how often my rules agree with the annotations and adjusting accordingly. They're not 100 percent, but they work well on the vast majority of pages, which is good enough for the use case.”
I had to laugh a little, because I heard “rule-based” and my brain went back to 2015. I did my post-graduate work around the time the industry was loudly moving past rule-based approaches into machine learning. And here we are, in the middle of the most AI-saturated moment in software, and the right answer for a problem at this scale is still a set of well-designed rules.
Julia: “That's the thing. If you need to do this for millions of pages, you can't afford tools that are too complex. So you go back to the basics.”
This is something I want every web scraping engineer to internalize. The ceiling on cleverness at scale is cost. At millions of pages, the cheapest reliable thing wins. LLMs are in the mix now, but they don't remove that constraint. They just move it.
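To make the rule-based approach concrete, here is a minimal Python sketch of scoring nodes with cheap rules, keeping the high scorers, and measuring agreement with an annotated set. Everything in it is an assumption for illustration: the `Node` shape, the three rules, their weights, the base score, and the threshold are invented, and are not the rules from Julia's research.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    node_id: int
    tag: str
    text: str = ""
    attrs: tuple = ()  # (name, value) pairs


# Each rule nudges the score up or down. Rules and weights are illustrative only.
def rule_boilerplate_tag(node: Node) -> float:
    return -0.6 if node.tag in {"script", "style", "nav", "footer"} else 0.0


def rule_has_text(node: Node) -> float:
    return 0.4 if node.text.strip() else 0.0


def rule_semantic_attr(node: Node) -> float:
    return 0.3 if any(name in {"itemprop", "datetime"} for name, _ in node.attrs) else 0.0


RULES = [rule_boilerplate_tag, rule_has_text, rule_semantic_attr]


def score(node: Node) -> float:
    """Combine rule contributions into a rough 0..1 likelihood of useful data."""
    return max(0.0, min(1.0, 0.3 + sum(rule(node) for rule in RULES)))


def keep(nodes: list[Node], threshold: float = 0.5) -> set[int]:
    """Ids of the nodes that survive simplification."""
    return {n.node_id for n in nodes if score(n) >= threshold}


def agreement(kept: set[int], annotated: set[int]) -> tuple[float, float]:
    """Precision and recall of the kept nodes against the annotations."""
    true_positives = len(kept & annotated)
    precision = true_positives / len(kept) if kept else 1.0
    recall = true_positives / len(annotated) if annotated else 1.0
    return precision, recall
```

In this framing, a false positive lowers precision (a node was kept that the annotations did not ask for) and a false negative lowers recall (annotated data was dropped). Each rule is a handful of comparisons per node, which is the kind of cost that still works at millions of pages.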
Mihaela: “Because now I think more about design than about what I need to implement. I have to figure out exactly what I want the code to do before I start implementing it. Before, I might start implementing something and then fix it as I went. Now, because I'm relying on the tool, I need to have figured out what happens at position A, what happens at position B, and so on. It's more about design than pure coding.”
Julia: “I completely agree. For me, I spend almost no time coding now. I start by trying to understand what I want and gathering all the requirements. Sometimes I use LLMs for that too, chatting with them to boil down the requirements, check if I'm missing anything, see what approaches I could take. Then I plan how to implement it. And the implementation I'll leave entirely to an agent. I'll review it later. I still do manual QA. If I spot issues, I'll explain them to the agent and sometimes let it debug. Right now we're more like leaders. We have this junior developer, sometimes with more seniority, in an agent that's doing everything we told it to do. We're the ones guiding and reviewing that work.”
Mihaela: “It is basically a junior developer that needs its hand held. But at the same time, it is an eager junior developer. So sometimes it writes so much code, and once you review it, it's not what you would expect. It can be simplified.”
The first time I used an agent to write something simple with Zyte API, it produced a small novel of code for what should have been a short snippet. I thought maybe I'd prompted it badly. Then I heard every developer I've spoken to describe the same pattern. This is what Mihaela and Julia are both pointing at. The bottleneck moved. The cost of writing is now close to zero. The cost of wanting the wrong thing is much higher than it used to be.
Julia: “For me, the evaluation part is the most important thing. For Web Scraping Copilot we have a fixed set of frozen HTML pages and annotated expected outputs. We run the extension, calculate how close the results are to the annotations, and that's our metric. For my HTML simplification research, I have the same kind of dataset, and I measure whether the nodes I expected to keep are still there.
For code itself, the criteria depend on what the code is for. For the extension we have metrics that favor simple code, code that doesn't branch a lot, code that doesn't have unnecessary lines. But if I'm doing a proof of concept and I just want to verify something works, I don't worry about best practices. I'll ask the agent to add more documentation if something is hard to read, but that's it. Results matter more than the code in that case.”
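One way to put a number on “code that doesn't branch a lot” is to count branching constructs in the generated source. The sketch below uses Python's standard `ast` module. Which node types count as a branch, and the name `simplicity_report`, are assumptions for illustration, not the metrics the extension actually uses.

```python
import ast

# Constructs treated as "branching" here. The selection is an assumption.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.IfExp, ast.BoolOp)


def simplicity_report(source: str) -> dict:
    """Count branching constructs and non-blank, non-comment lines in Python source."""
    tree = ast.parse(source)
    branches = sum(isinstance(node, BRANCH_NODES) for node in ast.walk(tree))
    code_lines = sum(
        1
        for line in source.splitlines()
        if line.strip() and not line.strip().startswith("#")
    )
    return {"branches": branches, "code_lines": code_lines}
```

Two candidate extractors that both pass the fixtures can then be ranked by this report, with the lower counts preferred.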
Mihaela: “Validation depends on the project. Mine is web scraping plus a Django platform, so we have unit tests, but there are also things I manually test to make sure the behavior is what I expect. And then I review the code. It's almost always longer than it needs to be, so part of my review is simplifying what the agent wrote.”
What Julia is describing, with the evaluation harness, is something I'd encourage any team using LLMs in production to copy. A habit of writing the ground-truth dataset before you let the model near the problem. Not because it makes the LLM better. Because it's the only way you'll know if it did.
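A ground-truth-first harness can be very small. The Python sketch below assumes a frozen dataset of (HTML, expected output) pairs and reports, per field, the share of pages where the extractor matched the annotation. The two pages, the field names, and `toy_extract` are made up for the example, and exact-match comparison is a simplification of “how close the results are”.

```python
import re
from collections import defaultdict

# Frozen pages and their annotated expected outputs (invented example data).
DATASET = [
    ("<h1>Ana</h1><time>2024-01-02</time>", {"name": "Ana", "date": "2024-01-02"}),
    ("<h1>Ben</h1><time>3 May</time>", {"name": "Ben", "date": "2024-05-03"}),
]


def per_field_accuracy(extract, dataset) -> dict:
    """Return {field: share of pages where the extracted value matched the annotation}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for html, expected in dataset:
        predicted = extract(html)
        for field, want in expected.items():
            totals[field] += 1
            hits[field] += int(predicted.get(field) == want)
    return {field: hits[field] / totals[field] for field in totals}


def toy_extract(html: str) -> dict:
    """Stand-in extractor: copies tag contents without normalizing dates."""
    return {
        "name": re.search(r"<h1>(.*?)</h1>", html).group(1),
        "date": re.search(r"<time>(.*?)</time>", html).group(1),
    }


report = per_field_accuracy(toy_extract, DATASET)
```

Here `report` shows the extractor getting every name right but only half of the dates, because it never normalizes “3 May”. That is a gap the annotations expose and reading the code might not.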
Julia: “I think I agree on DRY. Agents and LLMs sometimes need context repeated to them, especially in large projects or long chat histories. Duplicated code can actually be easier for them to work with, because they don't have to trace an abstraction layer to another file. It's harder to maintain for humans, but if we're leaving maintenance to the agent, we can afford it.”
Mihaela: “Same. It would be easier for the AI to have less context to deal with if we duplicate. So it makes sense.”
Julia: “YAGNI gets more interesting. It used to be enforced by effort. If adding a feature took a lot of time, you naturally dropped the ones that weren't worth it. Now that adding a feature is cheap, you need real criteria for what to implement and what not to. ‘Because we can do it, we must do it’ is a bad principle. We have to supply for ourselves the filter that effort used to provide.”
This is the point I wish more people building with agents made. The value of YAGNI was never the principle. It was the constraint that forced it. With agents, the constraint is gone, so the principle has to become conscious. Otherwise your codebase fills with features nobody asked for, because the marginal cost of asking for them is almost zero.
Julia: “I'd go back to writing code and managing processes the way we did before. But I'd keep the planning habit. Agents taught me that having a proper plan, a real design, an implementation roadmap, makes everything downstream easier. I wasn't doing that before. I'd have a rough idea in my head and start. Now I'd keep the structure even without the agents.”
Mihaela: “I'd keep planning too. But honestly, I'd be a little scared. At this point we've gotten used to being able to produce good quality code very fast. Without agents, I'd have to readjust my expectations about speed.”
Mihaela's honesty here is what I wanted to sit with. The story we tell about AI tooling is usually about capability. What it adds. Fewer people talk about what it's already rewritten in your expectations about yourself. Speed, in particular. You can't quietly unlearn that.
Mihaela: “It might be harder. One of the first projects I had when I joined, I was given a list of websites and twenty-four hours to build a spider. That taught me a lot about how to think under pressure and what can be sacrificed to ship on time. With an LLM doing a lot of the code now, a new developer might not feel that same pressure, and I think some of that pressure is useful for building judgment.”
Julia: “Building judgment is harder, but still possible. A lot of it comes from seeing many projects with many different challenges. Problems you haven't experienced firsthand are harder to imagine as possibilities. The learning process will just go differently now. New developers will use LLMs for most tasks. The important thing is understanding where those tools fall short, and the bits you still have to do manually. Anti-ban measures, scaling, the dynamics of websites trying to be more agent-friendly. Web scraping has so many surfaces that keep changing. Anyone who enjoys that kind of challenge will do well in this field.”
This is the question I keep coming back to. Every senior engineer I've spoken to built their judgment on some version of the twenty-four-hour spider. I can't tell if that kind of trial will survive the next generation of tools. What I do think, listening to Julia, is that the surface area of web scraping is the thing that saves it. There's always another hard layer the LLM can't handle cleanly. New developers will meet their own twenty-four-hour spider. It just won't look like ours.
Mihaela: “Design. Thinking about what we're going to implement.”
Julia: “Fundamentals. Understanding what you're building, how you want to build it, and your criteria for success.”
I notice this pair of answers is already the blog. Everything we talked about, the evaluation datasets, the planning habit, the honest judgment about when to simplify, when to add tests, when to trust results over code, it all sits on top of those two words. Design. Fundamentals.
The day I can delegate those to an agent is the day this job genuinely changes. I don't think that day is soon. And even if it arrives, I suspect someone will still have to tell the agent what good looks like. That's still the design. That's still fundamentals.
Mihaela: “I'd open a bar. And probably try to write a romance novel. I've been trying since high school, which is fifteen years now. I have new ideas every few months. I just haven't finished one yet. But the cocktails I can do. Sex on the Beach is my best one. People at parties tell me mine is better than what they get in bars, and I believe them.”
Julia: “Something without computers. Electrician, plumber. But if I was dreaming, I'd have a bakery. I'd love to be around pastries and have a customer-facing job, chatting with people. That's what I'd actually want.”
I asked this as a joke. They answered for real. Mihaela's Sex on the Beach is apparently better than most bars. Julia wants a bakery. We laughed for a while. And I sat there thinking how much of this work is heads-down at a desk, and how fast the humans show up the moment you ask about anything else.
Most of my interviews in this series have been with men. Not by design, just by who's around. I wanted to do one with two women for a change, and Julia and Mihaela were the obvious choice. They're both excellent at what they do, and their perspectives on LLMs are some of the sharpest I've heard. That's really all I want to say about it.
The full recorded conversation is coming soon. Julia and Mihaela will be back for the video version, and we'll go deeper on a few of the questions that opened up here, including the one I couldn't let go of: why not delegate the design itself to the agent?
Until then, the short answer is the one Mihaela and Julia already gave. Because something still has to know what good looks like.
Try Web Scraping Copilot, the VS Code extension Julia worked on.
Read Adrian Chaves's interview about the same extension here.
Join the Extract Data Discord and bring a question Julia or Mihaela didn't answer. I'll pass it on for the recorded follow-up.
Start a free trial of Zyte API if you want to try the extractor Mihaela uses.