The two reading pieces I have sitting on my desk are The Origin of Wealth and the latest Wired magazine. Interesting to be reading them simultaneously ...
From Wired:
"Scientists are trained to recognize that correlation is not causation, that no conclusions should be drawn simply on the basis of correlation between X and Y (it could just be coincidence). Instead, you must understand the underlying mechanisms that connect the two. Once you have a model, you can connect the data sets with confidence. Data without a model is just noise.
But faced with massive data, this approach to science - hypothesize, model, test - is becoming obsolete.
There is now a better way. Petabytes allow us to say: 'Correlation is enough.' We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot."
From Origin of Wealth:
"... not only is there a problem with data that contradicts Traditional [Economic] theories, but many theories have simply never been properly tested. One branch of economics, called econometrics, deals with data analysis. Rather than testing theoretical models, however, much econometric work is devoted to finding statistical relationships between variables (often for public policy or other applied purposes). Unfortunately, statistical correlations don't provide a causal explanation of the phenomena. Furthermore, as many economists would point out, there is often a lack of readily available data to test theories with, and even data that is available is frequently noisy or otherwise problematic."
Should be an interesting week as these seemingly conflicting ideas bounce around in my head.