Friday, September 29, 2006

Type Casting Errors

What's in a phrase?

A recent post in the Direct Newsline recapped a press release from Lyris saying that email "False Positives" were still a problem. The overall point is that there is too much junk mail, but I wondered what exactly the phrase means.

Had to go dust off a few things to figure that out.

Consider the question: “Is this email spam?” If the answer is yes, then I will block the email.

In reality the answer is either True or False: the email either is (true) or isn't (false) spam. The test, however, can be wrong in one of two ways.

  • I let it pass when I should have blocked it. This is a False Negative: a spam email I have to suffer with.
  • I block it when I should have let it through. This is a False Positive: a good email lost.


Note that the question could be asked the other way: “Is this email safe to pass on?” In this case ‘truth’ is associated with passing it along.

  • I let it pass when I should have blocked it. This is a False Positive: more junk email.
  • I block it when I should have let it through. This is a False Negative: less of the email I requested.

So,

False Positive means saying yes when you should have said no.

False Negative means saying no when you should have said yes.
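To make the flip concrete, here is a minimal Python sketch of the two framings. The function name `error_type` and the labels `"is_spam"` and `"is_safe"` are my own inventions for illustration, not anything from the Lyris release or a real filtering product.

```python
def error_type(is_spam, blocked, question):
    """Label one filtering decision under one of the two framings.

    is_spam:  what the email really is
    blocked:  what the filter did
    question: "is_spam" (yes means block) or "is_safe" (yes means pass)
    """
    if question == "is_spam":
        said_yes, truth_is_yes = blocked, is_spam
    elif question == "is_safe":
        said_yes, truth_is_yes = not blocked, not is_spam
    else:
        raise ValueError("question must be 'is_spam' or 'is_safe'")

    if said_yes == truth_is_yes:
        return "Correct"
    # Said yes when the truth was no: False Positive.
    # Said no when the truth was yes: False Negative.
    return "False Positive" if said_yes else "False Negative"


# The same mistake (a good email gets blocked) under each question:
print(error_type(is_spam=False, blocked=True, question="is_spam"))
print(error_type(is_spam=False, blocked=True, question="is_safe"))
```

The same blocked good email comes out as a False Positive under the first question and a False Negative under the second, which is exactly why the phrase means nothing until the question is stated.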


Confusion over a phrase like "False Positives" arises because its meaning depends on whether the question is "is this spam" or "is this from a requested sender." In the first case a higher false positive rate means losing good email; in the second it means more spam.

The consequences of being wrong are often very different depending on which side of the question you land. The American judicial system knows this well -- the emphasis is to err on the side of innocence, not guilt. Consider business email vs. personal email at home, where the need for filtering may be different. At work, I'll take a little garbage to make sure I get all of what I need. At home, I'm more likely to eliminate all garbage even at the risk of losing something that might be of interest.

This leaves us with two specific questions with respect to managing the cases where we are wrong.

1. What specific question are we answering?

2. Which type of error is more critical to the business and customer relationship? Blocking too much or allowing too much?

Both questions are important and should be considered in deciding how to filter email as well as any other activity where decision rules are put in place.
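One way to put the second question into numbers is to attach a cost to each kind of error and see which filter setting comes out cheaper. The sketch below is only an illustration: the function names, the error counts, and the cost figures are all made up, and real values would have to come from your own mail and your own business.

```python
def total_cost(good_blocked, spam_passed, cost_good_blocked, cost_spam_passed):
    """Total cost of a filter's mistakes, given a cost per mistake of each kind."""
    return good_blocked * cost_good_blocked + spam_passed * cost_spam_passed


# Hypothetical filter settings: (good emails blocked, spam emails passed)
FILTERS = {
    "lenient": (1, 20),
    "strict": (5, 2),
}


def cheaper_filter(cost_good_blocked, cost_spam_passed):
    """Return the name of the hypothetical filter with the lowest total cost."""
    return min(
        FILTERS,
        key=lambda name: total_cost(*FILTERS[name], cost_good_blocked, cost_spam_passed),
    )


# At work, losing a good email is assumed to hurt far more than reading spam.
print(cheaper_filter(cost_good_blocked=50, cost_spam_passed=1))  # lenient
# At home, the two mistakes are assumed to be much closer in cost.
print(cheaper_filter(cost_good_blocked=2, cost_spam_passed=1))   # strict
```

With these invented numbers the lenient filter wins at work (70 vs. 252) and the strict one wins at home (12 vs. 22). The filters did not change; only the relative cost of each error did.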
