As marketers we've been conditioned to do the following:
- To ask only really important questions because the half life of the others is much shorter than the time it takes to get answers.
- To view most data as having little value because we can't adequately see or filter it into anything useful.
- To accept answers to questions that technology can support based on yesterday's needs rather than what the business requires today.
An unintended consequence was that we tended to focus on structured data - things that could be represented in rows and columns like customer transactions - rather than the amorphous data of comments and images whose value was unclear. Because the marginal cost of implementing changes to data and reports was high we were left with a situation where the addition of any new data source or question would be delayed by the process of getting funding.
Well, that approach won't work anymore.
Digital interaction spins off data in real time that we have to leverage in real time if we are to respond to consumer interests and actions. Businesses need to consider that...
- Self-expression leaves a myriad of opinions and objects to be shared
- The infinite paths-to-purchase leave breadcrumbs at each and every step
- Human interaction influences opinion and choice more than the paid placement of messages
In the past we could plan, execute and track; today we have to track, execute and plan.
It would probably be insane to try to implement a consolidated system that captured and stored all of those events with a customer loyalty program. So, if we can't build an uber data warehouse, what are we to do? Well the folks at Google and Yahoo! solved that problem by distributing the question rather than centralizing the data in order to improve the indexing of web sites for search. And out of that comes Hadoop a framework for data-intensive applications that leverages distributed processing designed to handle the type of data generated above.
Consider the simple idea of presenting a customer with the next best product while she browses your web site. This requires he tight integration of real time events (what are you looking at), historic transaction data (what have you bought) and predictive analytics (what should we recommend).
Hadoop is not a replacement for existing infrastructure but represents a way to think about three key marketing needs.
- How do we transform digital self expression in order to leverage it with historic transactions?
- How do we do scale real time, event based analysis and deliver them where the consumer is right now?
- How can we overlay content consumption habits with our traditional segmentation schemes to deliver a better experience?