Spreading around the ol’ blog-o-sphere is Google’s Flu Trends. In essence:
We have found a close relationship between how many people search for flu-related topics and how many people actually have flu symptoms. Of course, not every person who searches for "flu" is actually sick, but a pattern emerges when all the flu-related search queries from each state and region are added together.
Pretty cool. Really cool, actually. What Google is doing here is combining a couple of dimensions of data, where you are geographically and which flu-related queries you search for, to determine how the flu virus is spreading.
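To make that concrete, here is a minimal Python sketch of the idea. It is a toy, not Google's actual method: the keyword list, the `flu_query_share` function, and the sample log are all made up for illustration. It simply counts, per region, what share of searches mention a flu-related term.

```python
from collections import Counter

# Hypothetical keyword list; the terms Google actually tracks are not given here.
FLU_TERMS = {"flu", "fever", "cough"}


def flu_query_share(query_log):
    """Return, per region, the fraction of queries that mention a flu term."""
    totals = Counter()
    flu_hits = Counter()
    for region, query in query_log:
        totals[region] += 1
        # A query counts as flu-related if any of its words is in FLU_TERMS.
        if FLU_TERMS & set(query.lower().split()):
            flu_hits[region] += 1
    return {region: flu_hits[region] / totals[region] for region in totals}


# Made-up log of (region, query) pairs, purely for illustration.
sample_log = [
    ("Ohio", "flu symptoms"),
    ("Ohio", "cheap flights"),
    ("Ohio", "fever and cough remedies"),
    ("Ohio", "pizza near me"),
    ("Texas", "football scores"),
    ("Texas", "flu shot locations"),
    ("Texas", "weather"),
    ("Texas", "movie times"),
]

shares = flu_query_share(sample_log)
print(shares)
```

No single search says much, but the per-region share is the kind of aggregate signal the quote above describes. A real system would presumably track that share over time and calibrate it against reported flu cases.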
A few years ago, I talked about Google Base and what I then called the “Incidental Semantic Web.” In short, what I was talking about then is how the monitoring or mining of data artifacts out of selfish behavior can lead to some really interesting insights. That’s exactly what’s happening here.
We search for flu remedies and symptoms on Google because we’re selfishly motivated to learn more. We have no interest in contributing to some database that reports on the spread of the flu. That selfish motivation is precisely why Google can trust (relatively speaking) the data coming in. In other words, this ability to predict the spread of the flu is an incidental byproduct of millions of discrete, selfish acts.
I think this is just the beginning. Imagine synthesizing results not only from search queries, but also from eating habits (via “smart” refrigerators), drug interactions (RFID is making its way onto prescription bottles) and many other “sources” of data. For example, imagine finding a far lower likelihood of diabetes in cultures that eat extraordinary amounts of cauliflower.
Today, we take a guess about the correlation of different factors (Vitamin X reduces Disease Y) and kick off a study where we then decide to watch and gather data. Tomorrow, we’ll just check the data that comes out of our everyday lives.