Cautionary Tale about Big Data Sampling

Is the Sample Good Enough? Comparing Data from Twitter’s Streaming API with Twitter’s Firehose
F. Morstatter, J. Pfeffer, H. Liu, K. Carley | arXiv.org
June 2013
[Abstract] [Paper]

The authors compare metrics computed from the data available through Twitter’s free Streaming API against the same metrics computed from the full universe of tweets (the Firehose) and from samples drawn from the Firehose. As an added bonus, the paper offers an excellent supply of references in the emerging field of big data and real-time data.
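
To make the idea of comparing a sample against the full universe concrete, here is a minimal sketch in Python. It is not the authors' method or code: the data is synthetic, the tag names are made up, and the helper names `top_k` and `top_k_overlap` are hypothetical. It only shows one simple way to ask how well a uniform random sample reproduces a "top 10" list computed from the full data.

```python
import random
from collections import Counter


def top_k(items, k):
    """Return the set of the k most frequent items."""
    return {item for item, _ in Counter(items).most_common(k)}


def top_k_overlap(full, sample, k):
    """Fraction of the full data's top-k items that also appear in the sample's top-k."""
    return len(top_k(full, k) & top_k(sample, k)) / k


# Synthetic stand-in for a full stream: made-up tags with a skewed
# frequency distribution (an assumption, not real Twitter data).
rng = random.Random(0)
tags = [f"tag{i}" for i in range(200)]
weights = [1 / (i + 1) for i in range(200)]
full = rng.choices(tags, weights=weights, k=100_000)

# A uniform 1% random sample drawn from the full data.
sample = rng.sample(full, k=1_000)

print("overlap of full data with itself:", top_k_overlap(full, full, 10))
print("overlap of 1% sample with full data:", top_k_overlap(full, sample, 10))
```

A uniform random sample is the friendly case. The caution in the title applies when the sample is drawn by a process you do not control, since the same comparison can then come out quite differently.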
