Sunday, February 22, 2015
The Big, the Rich, and the Good (Data)
Nate Silver has written a fascinating article about the stellar success and rapid progress of baseball analytics versus the less-than-rapid progress of data analytics in other areas (e.g., economic and earthquake forecasts).
Putting the baseball angle aside (and of course I like that angle very much), one thing I really like is Silver's concise commentary on "big data" versus what he calls "rich data", and on why sports analytics has genuinely improved the answers to many questions that matter within sports, whereas other fields still struggle immensely to turn their data into genuinely large gains in understanding.
Note that I have sometimes used the term "good data" as a contrast to "big data". Importantly, good data can be either big or small, whereas I think that Silver is treating "rich data" strictly as a subset of "big data". See this blog entry of mine as well as additional blog entries referenced therein.
One could also, I suppose, ask whether these other systems are "more complex" than sports competitions, but I'm not sure (a) whether that's actually true and (b) how to quantify it in a way that goes beyond "Ooh, it's complex." Of course, we have measures of information content, but I expect there would be a lot of assumptions involved in crunching such numbers in these cases. Anyway, it's a thought I had, so I figured that I should at least bring it up.
Labels:
articles,
baseball,
Big Data,
complex systems,
data,
data analytics,
data mining,
data science,
sports