This is a big data resource, and more: it is also a case study in anonymization gone wrong. The reaction to the bad anonymization is linked at the end of this post.
The data set is 20GB uncompressed and comprises more than 173 million individual trips. Each trip record includes the pickup and dropoff location and time, anonymized hack licence number and medallion number (i.e. the taxi’s unique id number, 3F38, in my photo above), and other metadata.
Before the link to the data, here’s an analysis based on similar data:
Why New Yorkers Can’t Find a Taxi When It Rains
Eric Jaffe | City Lab Blog
October 20, 2014
Provides a nice synopsis of some research using taxi cab rides. Read it for the links to the formal research papers.
New York City Taxi Cab Trips [in small chunks]
FOILing NYC’s Taxi Trip Data
Chris Whong | personal website of an Urbanist, Mapmaker, Data Junkie
March 18, 2014
A synopsis of how he got the data via a FOIL (New York Freedom of Information Law) request, plus a link to the rides and fares data as single files, instead of the chunked version above.
And the story of how the taxicab medallion IDs were improperly anonymized:
Poorly anonymized logs reveal NYC cab drivers’ detailed whereabouts
Dan Goodin | Ars Technica
June 23, 2014
On Taxis and Rainbows: Lessons from NYC’s improperly anonymized taxi logs
Vijay Pandurangan | Medium blog
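For readers wondering why hashing alone did not protect the medallion numbers, here is a minimal sketch of the idea. It assumes the IDs were hashed with unsalted MD5, as the two write-ups above describe, and it covers only the one medallion pattern that matches the 3F38 example (digit, letter, digit, digit); other medallion formats are not modeled. The function names are hypothetical and for illustration only.

```python
import hashlib
from itertools import product
from string import ascii_uppercase, digits


def md5_hex(text):
    """Return the hex MD5 digest of an ASCII string (assumed hashing scheme)."""
    return hashlib.md5(text.encode("ascii")).hexdigest()


def build_lookup():
    """Map hash -> medallion for every ID of the form digit-letter-digit-digit.

    Only this single pattern (e.g. 3F38) is enumerated; it is an assumption
    made for illustration, not the full set of real medallion formats.
    """
    table = {}
    for d1, letter, d2, d3 in product(digits, ascii_uppercase, digits, digits):
        medallion = f"{d1}{letter}{d2}{d3}"
        table[md5_hex(medallion)] = medallion
    return table


lookup = build_lookup()

# Pretend this hash came from a trip record, then reverse it with the table.
hashed = md5_hex("3F38")
print(len(lookup), lookup[hashed])
```

The candidate space for this pattern is only 10 x 26 x 10 x 10 = 26,000 values, so the whole table builds almost instantly. A hash hides its input only when the input is hard to guess, which a short, structured ID is not.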