Cautionary Tale about Big Data Sampling

Is the Sample Good Enough? Comparing Data from Twitter’s Streaming API with Twitter’s Firehose
F. Morstatter, J. Pfeffer, H. Liu, K. Carley
June 2013
[Abstract] [Paper]

The authors compare metrics computed from the data available through Twitter’s free Streaming API against the same metrics computed from the full universe of tweets (the Firehose) and from samples drawn from the Firehose. As an added bonus, the paper offers an excellent supply of references in the emerging field of big data and real-time data.
