The New York Times

New York Times Archived News Search

The New York Times has a vast collection of archived news assets dating back to the mid-19th century. The Archived News project was intended to use search technology to monetize these assets by making archived content readily accessible to historians and researchers.

FAST ESP was used to categorize news articles by publication date and news column name, and to process this large volume of content and automatically extract searchable entities from the articles, such as topic keywords, person names, company names, and location names.

Users can now search this archived news collection by any keyword and obtain a chronologically ordered list of matching past headlines.
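As a rough illustration of keyword search over extracted entities, the following minimal sketch builds an in-memory inverted index and returns matching headlines in chronological order. It is not FAST ESP's API or the project's actual implementation; the `Article` class, the `build_index` and `search` functions, and the placeholder headlines are all hypothetical, and the entities are assumed to have been extracted already.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Article:
    headline: str
    published: date
    entities: frozenset  # hypothetical: entities already extracted, lowercased


def build_index(articles):
    """Map each extracted entity to the articles that mention it."""
    index = defaultdict(list)
    for article in articles:
        for entity in article.entities:
            index[entity].append(article)
    return index


def search(index, keyword):
    """Return headlines matching the keyword, oldest first."""
    hits = index.get(keyword.lower(), [])
    return [a.headline for a in sorted(hits, key=lambda a: a.published)]


# Placeholder data, not real archive content.
articles = [
    Article("Placeholder headline C", date(1975, 3, 1), frozenset({"apollo", "nasa"})),
    Article("Placeholder headline A", date(1961, 5, 26), frozenset({"nasa"})),
    Article("Placeholder headline B", date(1969, 7, 21), frozenset({"apollo"})),
]
index = build_index(articles)
results = search(index, "NASA")
```

A production search engine would add tokenization, ranking, and on-disk index structures; the sketch only shows the lookup and the date ordering.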

An additional technical objective of this project was to serve more than 120 queries per second (about 432,000 queries per hour) while using the smallest possible server footprint.
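The throughput target can be checked with simple arithmetic, and the same numbers give a first-order estimate of server count. The sketch below is illustrative only: `servers_needed` is a hypothetical helper, and the per-server capacity of 50 queries per second is an assumed value, not a figure from the project.

```python
import math

TARGET_QPS = 120          # target from the project description
SECONDS_PER_HOUR = 3600

# 120 queries/second sustained for one hour.
queries_per_hour = TARGET_QPS * SECONDS_PER_HOUR


def servers_needed(target_qps, per_server_qps):
    """Smallest whole number of servers that covers the target load."""
    return math.ceil(target_qps / per_server_qps)


# Assumed capacity of 50 queries/second per server (hypothetical).
estimate = servers_needed(TARGET_QPS, 50)
```

This ignores redundancy and peak-load headroom, which a real capacity plan would add on top.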