Sunday, February 03, 2013

Data Mining isn't Like Looking for Gold

What's wrong with comparing data mining to gold mining?

Fire River Gold in Alaska goes through about half a ton of dirt at the Nixon Fork mine to produce enough gold for just one ring. The productivity of gold mines is measured in GPT (grams per tonne), and the whole process is focused on maximizing that yield as efficiently as possible, because gold mining comes with a very high fixed-cost business model.

Working with data, whether digital or transactional, has been likened to gold mining because a lot of raw material has to be processed and we're looking for nuggets. But the similarity stops there. In mining and prospecting for gold, we know precisely what we're looking for and are trying to eliminate areas with low value or probability as quickly as possible. In data mining the converse is true: we're trying to find the smaller pockets of insight that can be leveraged in our relationship with consumers.

Data mining creates gold, it does not find it. 

(Actually it creates understanding, and that is much more scalable than creating a mountain.)

And the bigger difference is not in the volume or even variety of data, but rather in the process that turns data into marketing recommendations. In gold mining, the excavation and smelting processes remove all the impurities and uncertainty until we are left with a known result -- Au.

Marketing analytics is quite different since we are adding uncertainty along the way to create those useful nuggets of understanding.  Using digital data as an example, we have...
  • logs of actions that we need to classify into relevant business events, e.g. turning a click into a publication view. But there remain anomalies like bounces, uncompleted events, etc. that require business rules to clean up.
  • a variety of methods that allow us to guess at a visitor's identity; certainly credentials (email) or recognition (cookies) make this easier, but there remains a gap between inference and knowledge.
  • uncertainty about the value of what we may find. The ROI of gold can be computed beforehand (grams produced * $ per gram, less the cost of production). Data mining has no spreadsheet or planning formula.
So, data-driven marketers should think of data mining as an ongoing business process rather than as a discrete manufacturing site.
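
To make the contrast concrete, here is a minimal sketch of the gold-side arithmetic from the last bullet: grams produced times price per gram, less the cost of production. The function name, parameter names, and every number in it are hypothetical placeholders chosen for illustration, not Nixon Fork figures. Strictly speaking the formula gives a margin rather than a return ratio, so that is what the sketch calls it.

```python
def gold_margin(tonnes_processed, grams_per_tonne, price_per_gram, production_cost):
    """Return (grams_produced, margin) for a gold operation.

    All inputs are assumed to be known before digging starts,
    which is exactly what makes this a planning formula.
    """
    grams_produced = tonnes_processed * grams_per_tonne
    revenue = grams_produced * price_per_gram
    margin = revenue - production_cost
    return grams_produced, margin


# Hypothetical inputs, for illustration only.
grams, margin = gold_margin(
    tonnes_processed=1000,
    grams_per_tonne=10,
    price_per_gram=50,
    production_cost=300000,
)
print(grams, margin)
```

Every argument here can be estimated from an assay and a price quote. There is no equivalent set of inputs for a data mining project, because the value of an insight is not known until after it has been found and acted on.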
