DebunkEU.org is using Zyte’s Automatic Extraction service to monitor and expose disinformation campaigns spread across media outlets in the Baltic region and further afield. To achieve this, Debunk EU is currently scraping news websites worldwide in over 40 languages, including Russian, Chinese, Iranian, Arabic, German, French, Ukrainian, Georgian, Balkan and Baltic languages. With the help of our easy-to-use Automatic Extraction API, plus friendly technical support from the Zyte team, Debunk EU is scraping around 1.5 million news articles every month from thousands of news sources.
DebunkEU.org is an independently funded think tank and non-governmental organization that tracks disinformation and misinformation campaigns across media outlets in the Baltic countries and Poland, as well as in the United States and North Macedonia.
Its team of over 50 analysts and active volunteers conducts detailed fact-checking and research into disinformation concerns in the Baltic countries and Poland. The think tank reports on topics including misinformation about COVID-19 and vaccines, political turmoil in Belarus and Russia, and attempts to target NATO activities.
Debunk EU publishes over 100 reports per year and runs a programme of educational media literacy campaigns. It also works closely with national institutions in partner countries, which provide further valuable insights into the situation in the Baltics.
Debunk EU aims to counter disinformation and information campaigns, with the goal of providing insights into complex issues in a concise, understandable and informative way.
In 2017, Debunk EU started exploring the options for collecting news articles from various sources. “At that time all the commercial options were really expensive, so we developed our own extraction solution based on Scrapy”, explains Debunk EU CTO Girius Merkys. “It was OK, but we had something like 200 domains to monitor and it required a lot of maintenance.”
As time passed, Debunk EU faced the growing challenge of monitoring more and more domains. “Some small countries that we’re interested in might have over a thousand news outlets”, states Girius. “In the disinformation space it’s common to see lots of simple WordPress-based websites controlled by one entity, all running the same story to give the impression that ‘it must be true’”.
Girius also notes that the process of debunking false or misleading content online can be both costly and time-consuming. “It’s difficult to fact-check a piece of information if you do not know where to start. What’s more, debunking disinformation costs way more than creating it.”
To deal with the rapidly growing scale and complexity of extracting millions of news articles, Debunk EU approached Zyte to provide a cost-effective and easy-to-use automated article extraction solution that would minimize development overheads for the busy Debunk EU team.
With the help of Zyte’s Automatic Extraction API, Debunk EU is able to track the evolution of disinformation campaigns by monitoring over 1.5 million online articles every month.
“As we’ve scaled up we didn’t want the hassle of having to keep maintaining Scrapy”, says Girius. “Also, because we are a non-commercial NGO we needed an affordable solution – and that’s something Zyte has been able to offer us, plus technical assistance because of the sheer volume of requests we have every month.”
As well as the quality and reliability of article extraction, Girius also welcomes the efficient support offered by the Zyte team: “We’re very happy with the help we get. Without it we wouldn’t be able to do our work and publish more than 100 reports every year. I really like the article list service. It really just makes everything much easier for us. We just give the link of the domain, then we get the article list and we just scrape it with your API. It’s automatic and it’s really convenient.”
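The two-step workflow Girius describes (request the article list for a domain, then extract each listed article) can be sketched roughly as follows. This is an illustration only, not Debunk EU’s code or the documented Zyte client: the `extract` callable, the `"articleList"` and `"article"` page-type strings, and the `articles`, `url` and `headline` field names are all assumptions made for the example, and `fake_extract` is a stand-in so the sketch runs without network access.

```python
from typing import Callable, Dict, List

# Hypothetical client signature: (url, page_type) -> extracted data as a dict.
Extractor = Callable[[str, str], Dict]


def collect_articles(domain_url: str, extract: Extractor) -> List[Dict]:
    """Fetch the article list for a domain, then extract each listed article."""
    # Step 1: ask for the list of articles found on the domain's page.
    listing = extract(domain_url, "articleList")

    articles: List[Dict] = []
    seen = set()
    # Step 2: extract every listed article exactly once.
    for item in listing.get("articles", []):
        url = item.get("url")
        if not url or url in seen:
            continue  # skip entries without a link and repeated links
        seen.add(url)
        articles.append(extract(url, "article"))
    return articles


def fake_extract(url: str, page_type: str) -> Dict:
    """Offline stand-in for a real extraction client (assumed response shape)."""
    if page_type == "articleList":
        return {
            "articles": [
                {"url": url + "/story-1"},
                {"url": url + "/story-2"},
                {"url": url + "/story-1"},  # the same link listed twice
            ]
        }
    return {"url": url, "headline": "Headline for " + url}


if __name__ == "__main__":
    for article in collect_articles("https://news.example", fake_extract):
        print(article["url"])
```

Passing the client in as a function keeps the sketch independent of any particular API version. In practice `fake_extract` would be replaced by a call to the real extraction service, with response handling adjusted to its actual schema.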
With help from Zyte’s Automatic Extraction API, Debunk EU is able to access millions of news articles every year – with the capacity to grow smoothly as it monitors a greater range of media outlets in more territories.