News sources today are abundant and available at readers' fingertips, and readers have even become publishers themselves. The way people consume news is changing, with users seeking alternatives to noise, privacy issues, and social media. There is demand for information that fits readers' interests and for trusted journalism. According to the Digital News Report 2019, 41% of readers have checked accuracy by comparing multiple sources, and 24% said they stopped using news sources with poor reputations altogether.
As more people access the news through smart devices and the appetite for personalized news content grows, categorizing the colossal amount of news data produced each day becomes a problem.
Kinzen is a technology company that helps readers engage with the publishers who inform, inspire and empower them. They build tools for individuals and publishers to access and present personalized and trusted news. These tools provide structured news data for personalized feeds, backed by artificial intelligence and algorithms, based on a user's preferences and on what experts are interested in.
Using multiple APIs that focus on user signals, Kinzen powers newsletters on behalf of publishers, personalizing the connection between publisher and reader. This gives readers the content they want and helps publishers better connect with their subscribers.
In this case study, learn how Kinzen partnered with Zyte, whose Automatic Extraction became a crucial part of building and maintaining Kinzen's news data pipeline. Zyte's recently launched Automatic Extraction provides customers with AI-enabled, automated web data extraction at scale. Using machine learning, Automatic Extraction can extract millions of news articles in a fraction of the time it would take a developer to do manually.
Due to the nature of its business, Kinzen needs to deliver quality sources of information that the reader can trust and to help their publisher partners better connect with their readers. This requires gathering a lot of news data from thousands of different sources across the web. Kinzen's data engineers are responsible for maintaining the quality data pipelines that their APIs need to run effectively. Kinzen must collect millions of pieces of news data daily, which requires extracting the world's news data accurately and reliably.
For their product and APIs to succeed, quick access to news data was paramount for Kinzen. The key challenges facing them were therefore speed and scale. They also had to account for the fact that the list of sources in their directory is constantly growing and evolving. News publishers' websites are updated constantly in real time, and a wide range of measures is needed to wrangle the data into a consistent, uniform format.
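As an illustration of what wrangling news data into a uniform format can involve, here is a minimal Python sketch. It is not Kinzen's or Zyte's implementation: the `Article` record, the raw field names (`title`, `headline`, `published`, `datePublished`) and the rule that timestamps without a timezone are treated as UTC are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Article:
    """Hypothetical uniform record that every source is mapped into."""
    url: str
    headline: str
    published_at: Optional[datetime]  # always UTC when present


def parse_timestamp(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 string into a UTC datetime, or return None."""
    if not value:
        return None
    try:
        # Older Python versions reject a trailing "Z" in fromisoformat.
        parsed = datetime.fromisoformat(value.strip().replace("Z", "+00:00"))
    except ValueError:
        return None
    if parsed.tzinfo is None:
        # Assumption: timestamps without a timezone are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def normalize(raw: dict) -> Article:
    """Map one source-specific record (hypothetical keys) to an Article."""
    headline = raw.get("headline") or raw.get("title") or ""
    published = raw.get("datePublished") or raw.get("published")
    return Article(
        url=(raw.get("url") or "").strip(),
        # Collapse runs of whitespace left over from page markup.
        headline=" ".join(headline.split()),
        published_at=parse_timestamp(published),
    )
```

Mapping every source into one record type means downstream code only has to handle a single shape, however many publishers are added to the directory.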
To do this, Kinzen had a choice: hire a dedicated team of web scraping experts internally, or seek out a third-party provider.
After considering the costs of hiring an internal team, including training, onboarding, and setup, they decided on the latter and began to assess the types of businesses that could provide the data extraction capability they needed. Zyte was the clear choice based on our ability to provide data extraction on a grand scale, the accessibility and quality of the data, and the speed at which the data was received.
Zyte provided Kinzen with an easily manageable solution that let them identify sources and maintain a consistently high level of data quality when collecting news data. It also allowed Kinzen to precisely tweak the kind of data they required, at scale, efficiently and effectively.
With Zyte, Kinzen was immediately able to scale their data extraction efforts to match the quantity of news content produced daily, providing their data engineers with millions of extracted news articles for their APIs to process.
Offloading the collection and maintenance of news data to Automatic Extraction allowed Kinzen's team to focus on product development and business strategy. The provision of such data has allowed Kinzen to accelerate their product development process without worrying about the maintenance of their data pipeline. This has enabled them to become one of the most innovative technology companies today.
Using Zyte Automatic Extraction's news data APIs, Kinzen is able to reduce the refresh time of all their publisher partners' latest content to 1 minute. If content is amended or removed on a publisher's website, Kinzen's content capture is accurate within 1 minute of publication. This gives Kinzen's publishing partners more breathing room and gives readers the most up-to-date newsletters.
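To make the idea of catching amendments and removals between refreshes concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Kinzen's pipeline or part of Zyte's API: each refresh is assumed to yield a mapping from article URL to a content fingerprint, and the names `fingerprint`, `diff_snapshots` and `Changes` are hypothetical.

```python
import hashlib
from typing import Dict, List, NamedTuple


class Changes(NamedTuple):
    """URLs that differ between two refreshes (hypothetical structure)."""
    added: List[str]
    amended: List[str]
    removed: List[str]


def fingerprint(text: str) -> str:
    """Return a stable SHA-256 hex digest of an article body."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def diff_snapshots(previous: Dict[str, str], current: Dict[str, str]) -> Changes:
    """Compare two url -> fingerprint maps taken at consecutive refreshes."""
    added = sorted(url for url in current if url not in previous)
    removed = sorted(url for url in previous if url not in current)
    amended = sorted(
        url for url in current
        if url in previous and current[url] != previous[url]
    )
    return Changes(added, amended, removed)
```

Running such a comparison on every refresh cycle would flag which stored articles need to be updated or withdrawn, without reprocessing content that has not changed.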
Kinzen chose an external provider with expertise in data extraction: Zyte's Automatic Extraction API. It provides data extraction on a grand scale, accessible and high-quality data, and the speed that Kinzen needed.