Field notes from the world of data extraction.

Articles, interviews and analysis on how data is gathered, used and fought over — written by the people closest to it.

The rise of web data in hedge fund decision making

How To Leverage Alternative Data In Asset Management

As big data continues to disrupt the investment research market, learn how harnessing web-extracted alternative data can give you an edge.

Marie Moynihan | 4 min read | January 9, 2020

Proxies

Backconnect proxies explained: How to use them in a scraping project?

Scaling up your web scraping project without a reliable backconnect rotating proxy service is difficult. Learn how a backconnect proxy works and when to use it.

Attila Toth | 4 min read | December 10, 2019

Use case

4 Sectors That Benefited Most From Business Intelligence Software

Business intelligence software can transform data into gold. Learn more about the use cases where business intelligence software can be leveraged.

Himanshi Bhatt | 5 min read | December 4, 2019

Open-source

How to use Zyte Smart Proxy Manager with Scrapy

Zyte Smart Proxy Manager is designed specifically for web scraping. In this article, learn how to use Zyte Smart Proxy Manager inside your Scrapy spider.

Attila Toth | 2 min read | November 14, 2019

Building Spiders Made Easy | GUI For Scrapy Shell

Use case

Scrapy, Matplotlib, MySQL: Real Estate Data Analysis

Extract real estate data from one of the biggest real estate sites and then analyze the data using Python, Matplotlib and MySQL.

Attila Toth | 7 min read | November 7, 2019

From the creators of Scrapy: Automatic data extraction API

Announcement

Web scraping and how to leverage machine learning

Extract Summit Q&A part 2. Read this post to get answers on web scraping infrastructure and how machine learning can be used in web scraping.

Himanshi Bhatt | 5 min read | October 17, 2019

Proxy management: In-house or off-the-shelf proxy solutions?

Announcement

QA: Web scraping at scale, anti-ban and legal compliance

We gathered the questions on web scraping and data extraction asked at the Extract Summit. Read this post to get answers on bans, proxies and GDPR in web scraping.

Himanshi Bhatt | 5 min read | October 10, 2019

Use case

Price Intelligence With Python: Scrapy, SQL, And Pandas

Smart price intelligence is becoming increasingly important for retailers. In this article, we extract product data and then try to draw insights from it.

Attila Toth | 7 min read | October 8, 2019

Developer interest

Summary: The Web Data Extraction Summit 2019

Zyte hosted the first-ever event dedicated to web scraping and data extraction: the Extract Summit. Read our articles to see who joined and what we discussed.

Himanshi Bhatt | 4 min read | September 26, 2019

Use case

Get News data extraction at scale | Zyte Automatic Extract

In this article, we take you through how to extract data from two popular news sites and perform basic exploratory analysis. Best of all, you can try it for free!

Attila Toth | 6 min read | September 17, 2019

Use case

Gain a competitive edge with product data

Web-extracted product data can help companies gain a competitive advantage through brand and price monitoring, and can help them implement MAP compliance. Read more about it.

Himanshi Bhatt | 3 min read | September 12, 2019

Use case

Four popular use cases for online public sentiment data

Online public sentiment analysis can transform public emotions into quantitative insight that can be used to drive change. See the applications here.

Himanshi Bhatt | 2 min read | September 5, 2019