How to allocate sales to tactics?
A while back Kevin Hillstrom posted a challenge to figure out how multiple tactics worked in a multi-channel world. A set of catalog and online sales data was provided for 10,000 customers along with which marketing tactics they received, e.g. catalog, email, postcard, etc. He recently posted an answer.
The most important conclusion is that influence is multi-channel, just as "advertising drives search."
Some thoughts on the solution provided by Grigorios Tsoumakas.
1. It is based on the correct premise that the problem is one of allocation, not prediction. This makes the answer simpler both to create and to explain. "Sales are allocated to individual tactics in accordance to their influence across all possible combinations of tactics."
2. The solution assumes that one tactic depended on a previous one. In this case the second email was assumed to be delivered to the same customers as email 1. The data suggest that this is valid, since everyone who received email 2 also received email 1.
3. Organic sales were those among customers who received no marketing. Simple, but probably understated. It is close to saying that the level of sales would be $2000 if advertising were withheld. "Unattributed" might be a better label in this case.
4. This seems to be along the lines of an 'observed vs expected' problem. I wonder if there is an extension there to include costs to get to financial effectiveness.
5. No catalog sales were attributed to email. Given the large amount of sales that are attributed to the catalog, I'd 'expect' some. But the numbers don't lie.
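To make "influence across all possible combinations of tactics" concrete, here is a minimal sketch of one standard way to do that kind of allocation: a Shapley-style average of each tactic's marginal lift over every combination of the other tactics. I am assuming this is in the spirit of the posted solution, not claiming it is the exact method used. The function name and the sales figures are hypothetical and are not taken from the challenge data.

```python
from itertools import combinations
from math import factorial


def shapley_allocation(tactics, value):
    """Allocate sales lift to tactics by averaging marginal contributions.

    `value` maps a frozenset of tactics to the average sales per customer
    observed for customers who received exactly that combination.
    """
    n = len(tactics)
    shares = {}
    for t in tactics:
        others = [x for x in tactics if x != t]
        total = 0.0
        for k in range(n):
            for combo in combinations(others, k):
                s = frozenset(combo)
                # Standard Shapley weight for a coalition of size k.
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                # Marginal lift from adding tactic t to combination s.
                total += weight * (value[s | {t}] - value[s])
        shares[t] = total
    return shares


# Hypothetical average sales per customer for each combination.
value = {
    frozenset(): 2.0,                       # no marketing: "organic"
    frozenset({"catalog"}): 8.0,
    frozenset({"email"}): 3.0,
    frozenset({"catalog", "email"}): 10.0,
}

shares = shapley_allocation(["catalog", "email"], value)
organic = value[frozenset()]
print(organic, shares)
```

With these made-up numbers the organic baseline plus the two shares adds back up to the sales of customers who received both tactics, which is the property that makes this an allocation and not a prediction.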
So what would regression have told us? (And yes, I was one of the 500 who downloaded the spreadsheet.)
1. It is fairly easy to distinguish between online and catalog buyers. If they shopped in one channel, they weren't very likely to shop in the other.
2. A large number of customers did not purchase in the month of interest, but were marketed to. Including these '0' sales will bias most models tremendously. So, should we look only at the sales that occurred in a given channel, and include only the customers who made them? "Yes," since the problem is one of allocation, not prediction.
3. Mathematically, the role of the 'catalog' was interesting after segmenting the data. For online sales, the presence of a catalog depressed sales. Okay, you're expecting that. For catalog sales, the presence of a catalog also depressed sales. Welcome to the vagaries of regression.
The math is about minimizing the error in the model, not about explaining it to your boss.
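The points about zero-sale customers and the sign of the catalog coefficient can be illustrated with a small sketch. The ten customers below are invented for illustration and are not the challenge data; `simple_ols` is a hypothetical helper that fits sales against a single 0/1 catalog flag. It shows one way a coefficient can flip sign depending on whether non-buyers are included. It is not a claim about why it happened in the actual spreadsheet.

```python
def simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical customers: 1 = received the catalog, 0 = did not.
catalog = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
sales = [80, 90, 100, 0, 0, 0, 120, 0, 0, 0]

# Model 1: every customer, including those with zero sales.
all_intercept, all_slope = simple_ols(catalog, sales)

# Model 2: buyers only, as suggested for an allocation problem.
buyers = [(c, s) for c, s in zip(catalog, sales) if s > 0]
buyer_flags = [c for c, _ in buyers]
buyer_sales = [s for _, s in buyers]
buy_intercept, buy_slope = simple_ols(buyer_flags, buyer_sales)

print("all customers:", all_intercept, all_slope)
print("buyers only:  ", buy_intercept, buy_slope)
```

In this toy data, catalog recipients buy more often but place smaller orders, so the catalog coefficient is positive across all customers and negative among buyers only. Both fits minimize error; neither is easy to explain to your boss without the context.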