Data-mining for terrorists doesn’t work

From Ben Goldacre’s Bad Science, 2009: Data-mining for terrorists would be lovely if it worked.

If you have 10 people, and you know that 1 is a suspect, and you assess them all with this test [one that, going by the figures quoted here, wrongly flags about 1 innocent person in 10 and misses about 2 true suspects in 10], then you will correctly get your one true positive and – on average – 1 false positive. If you have 100 people, and you know that 1 is a suspect, you will get your one true positive and, on average, 10 false positives. If you’re looking for one suspect among 1000 people, you will get your suspect, and 100 false positives. Once your false positives begin to dwarf your true positives, a positive result from the test becomes pretty unhelpful.
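
The arithmetic in that passage can be checked with a short sketch. It assumes a false positive rate of 1 in 10, which is what the quoted figures imply; the function name is hypothetical and Python is used purely for illustration.

```python
def expected_false_positives(population, suspects=1, false_positive_rate=0.1):
    """Expected number of innocent people wrongly flagged by the test.

    false_positive_rate=0.1 is an assumption inferred from the quoted
    figures (roughly 1 false positive per 10 people screened).
    """
    innocents = population - suspects
    return innocents * false_positive_rate


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # Round for display; the quoted 1, 10 and 100 are these values rounded up.
        print(n, round(expected_false_positives(n), 1))
```

The number of true suspects stays fixed at one while the expected false positives grow in step with the population, which is why a positive result tells you less and less as the group gets bigger.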

Remember this is a screening tool, for assessing dodgy behaviour, spotting dodgy patterns, in a general population. We are invited to accept that everybody’s data will be surveyed and processed, because MI5 have clever algorithms to identify people who were never previously suspected. There are 60 million people in the UK, with, let’s say, 10,000 true suspects. Using your unrealistically accurate imaginary screening test, you get 6 million false positives. At the same time, of your 10,000 true suspects, you miss 2,000.
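
To see how little a positive result means at national scale, the same sums can be run on the UK figures. This is a rough sketch, not a model of any real system: the 10% false positive rate and 20% miss rate are assumptions inferred from the numbers quoted above, and the function and key names are hypothetical.

```python
def screening_outcome(population, true_suspects,
                      false_positive_rate=0.1, miss_rate=0.2):
    """Expected results of screening a whole population.

    Both rates are assumptions inferred from the quoted figures:
    6 million false positives in 60 million people, and 2,000 of
    10,000 true suspects missed.
    """
    innocents = population - true_suspects
    false_positives = innocents * false_positive_rate
    true_positives = true_suspects * (1 - miss_rate)
    missed = true_suspects * miss_rate
    # Share of flagged people who really are suspects.
    share_real = true_positives / (true_positives + false_positives)
    return {
        "false_positives": false_positives,
        "true_positives": true_positives,
        "missed": missed,
        "share_of_flags_that_are_real": share_real,
    }


if __name__ == "__main__":
    result = screening_outcome(60_000_000, 10_000)
    print(f"false positives: {result['false_positives']:,.0f}")
    print(f"true positives:  {result['true_positives']:,.0f}")
    print(f"missed suspects: {result['missed']:,.0f}")
    print(f"flags that are real: {result['share_of_flags_that_are_real']:.2%}")
```

Under these assumptions only about 0.13% of the people flagged are true suspects, roughly 1 in 750, and 2,000 suspects are still missed.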

This is simply unworkable. But that probably won’t stop governments from spending billions, and invading everyone’s privacy, in an attempt to do the impossible.
