Wednesday, February 07, 2007

Advertising Effectiveness

How to allocate sales to tactics?

A while back Kevin Hillstrom posted a challenge to figure out how multiple tactics worked in a multi-channel world. He provided catalog and online sales data for 10,000 customers, along with the marketing tactics each customer received (catalog, email, postcard and so on). He recently posted an answer.

The most important conclusion is that influence is multi-channel, just as "advertising drives search."

Some thoughts on the solution provided by Grigorios Tsoumakas.
1. It is based on the correct premise that the problem is one of allocation, not prediction. That makes an answer simpler both to create and to explain. "Sales are allocated to individual tactics in accordance to their influence across all possible combinations of tactics."

2. The solution assumes that one tactic depended on a previous one. In this case the second email was assumed to be delivered to the same customers as email 1. The data suggest that this is valid, since everyone who received email 2 also received email 1.

3. Organic sales were those among customers who received no marketing. Simple, but probably understated. It is close to saying that the level of sales would be $2000 if advertising were withheld. "Unattributed" might be a better label in this case.

4. This seems to be along the lines of an 'observed vs expected' problem. I wonder if there is an extension there to include costs to get to financial effectiveness.

5. No catalog sales were attributed to email. Given the large amount of sales that are attributed to the catalog, I'd 'expect' some. But the numbers don't lie.
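
One way to read the quoted rule in point 1, "influence across all possible combinations of tactics", is as a Shapley-style average of marginal contributions. That reading is my assumption, not necessarily what Tsoumakas implemented, and the numbers below are invented for illustration rather than taken from the challenge spreadsheet. A minimal sketch in Python, where shapley_allocation and avg_sales are hypothetical names:

```python
from itertools import combinations
from math import factorial

def shapley_allocation(tactics, value):
    """Split sales among tactics by averaging each tactic's marginal
    contribution over every combination of the other tactics.

    value maps a frozenset of tactics to the average sales per customer
    observed for customers who received exactly that combination.
    """
    n = len(tactics)
    shares = {}
    for t in tactics:
        others = [x for x in tactics if x != t]
        total = 0.0
        for k in range(len(others) + 1):
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Standard Shapley weight for a coalition of size k
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (value[s | {t}] - value[s])
        shares[t] = total
    return shares

# Hypothetical average sales per customer for each combination (made up)
avg_sales = {
    frozenset(): 2.0,                       # no marketing: "organic"
    frozenset({"catalog"}): 10.0,
    frozenset({"email"}): 4.0,
    frozenset({"catalog", "email"}): 14.0,
}

shares = shapley_allocation(["catalog", "email"], avg_sales)
organic = avg_sales[frozenset()]
print(shares, organic)
```

With these made-up inputs the tactic shares plus the organic baseline add up to the sales of customers who received everything, which is what makes this an allocation rather than a prediction. A combination that never occurs in the data, such as email 2 without email 1 in point 2, has no observed value and would need special handling.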

So what would regression have told us? (And yes, I was one of the 500 who downloaded the spreadsheet.)

1. It is fairly easy to distinguish between online and catalog buyers. If they shopped in one channel, they weren't very likely to shop in the other.

2. A large number of customers did not purchase in the month of interest, but were marketed to. Including these '0' sales will bias most models tremendously. So, should we look only at the sales that occurred in a given channel, and include only the customers who made them? Yes, since the problem is one of allocation, not prediction.

3. Mathematically, the role of the 'catalog' was interesting after segmenting the data. For online sales, the presence of a catalog depressed sales. Okay, you might expect that. For catalog sales, the presence of a catalog also depressed sales. Welcome to the vagaries of regression.
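
One possible mechanism for that last result, and it is only a guess on my part, is that restricting the model to buyers measures order size rather than total sales. With a single 0/1 regressor, the least-squares slope is exactly the difference in mean sales between the two groups, so a catalog that brings in many smaller orders gets a negative coefficient even while it adds sales overall. The sketch below uses invented numbers, not the challenge data, and binary_slope is a hypothetical helper:

```python
from statistics import mean

# Hypothetical catalog-channel buyers: (received_catalog, sales). Made-up data.
buyers = [
    (0, 120.0), (0, 100.0),
    (1, 60.0), (1, 80.0), (1, 70.0), (1, 90.0), (1, 50.0), (1, 70.0),
]

def binary_slope(rows):
    """Ordinary least-squares slope of sales on a single 0/1 regressor."""
    x = [r[0] for r in rows]
    y = [r[1] for r in rows]
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

with_cat = [s for c, s in buyers if c == 1]
without_cat = [s for c, s in buyers if c == 0]

slope = binary_slope(buyers)
diff_in_means = mean(with_cat) - mean(without_cat)

print("slope:", slope)                          # negative: smaller average order
print("difference in means:", diff_in_means)    # same quantity as the slope
print("total sales with / without catalog:", sum(with_cat), sum(without_cat))
```

In this toy example the coefficient on the catalog is negative, yet customers who received the catalog account for most of the sales. The regression is answering "how big is the average order?", which is not the allocation question.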

The math is about minimizing the error in the model, not about explaining it to your boss.

Kevin Hillstrom said...

Thank you for referencing the challenge I gave to my readers.

There were some interesting responses from folks who attempted to allocate sales to the four marketing activities, and to 'organic' sales ... sales that occur without being caused by advertising.

For the most part, catalog sales could not happen 'organically' --- you have to mail a catalog to somebody to get the customer to order via a 1-800 number.

However, an e-mail or a postcard can remind a person to look at their catalog, causing the customer to purchase via a 1-800 number. So there can be interactions between marketing activities.

The dataset was designed to have odd things happen when using regression (as you noted).

I received a response where the analyst told me that executives should stop trying to allocate the sales to each activity, since mathematically, there are too many problems with multicollinearity.

As we all know, while the analyst may be right, the executive still needs the question answered to her specifications. Therefore, my industry needs to create the tools necessary to do this right. The answer that I published is a good start.