Juking the stats

Never underestimate people's desire to cover their asses.

I finished reading Super Crunchers, a provocative book by Ian Ayres. He argues that improvements in statistical techniques are transforming the ways that we make decisions in fields ranging from business to medicine to government policy.

Nowadays we have access to much more data, thanks to technological enhancements such as cheaper data storage and online survey tools, and with all that data to "crunch," computers can often arrive at judgments faster and more accurately than human experts can. As a result, statistical models have recently been branching out into new, unfamiliar terrain. They’re diagnosing illnesses better than doctors and predicting the quality of wine better than wine critics, and in doing so, Ayres argues, they’re also diminishing the authority of experts and the reliability of intuition rooted in life experience.

A particularly bizarre example from the book is how one company is using the characteristics of Hollywood scripts — for example, how many production sites the film has, or how many big-name actors are in the cast — to predict which movies will be blockbusters. Apparently, their model is beating out the studios’ own predictions.

Another somewhat discomfiting trend is how companies are collecting data on individual customers — for instance, by tracking purchases made using those "reward" cards you carry on your keychain — and then using that information to figure out not only what sells and what doesn’t but also how profitable a customer you are. The company can then turn around and target promotions at the spendthrifts and not the coupon clippers.

My biggest problem with the book is that it doesn’t spend enough time talking about the limitations of these statistical models. It’s important to consider carefully how the data are collected and whether you can actually measure what you’re trying to measure. There’s plenty of debate in social science circles about these topics, but for an opposing, non-academic viewpoint you might turn to the third and fourth seasons of The Wire. The show’s creators, David Simon and Ed Burns, savage the growing popularity of using statistics to track the performance of schools (No Child Left Behind), police departments (CompStat), and other institutions. The people working for those institutions want to keep their jobs, Simon and Burns argue, so everyone starts juking the stats: downgrading aggravated assaults to lesser crimes, marking students down as proficient when they’re way below grade level.

As far as the war on drugs is concerned, jacking up the arrest numbers will also not get at the root causes, Simon and Burns suggest, because the drug dealers just get better at operating outside the spotlight. Shutting down their networks requires a kind of police work that’s more subtle and involved than just rounding up bodies, but that strategy gets short shrift in an environment that prizes quantity, not quality. In other words, the stats we’re using to evaluate success may not be the right ones to be measuring.

As "data-based decision making" becomes more popular, expect a sharp increase in the fudging of data and the political maneuvering on behalf of self-serving measurements of performance. Never underestimate people’s desire to cover their asses.

Victor Tan Chen is In The Fray's editor in chief and the author of Cut Loose: Jobless and Hopeless in an Unfair Economy. Site: victortanchen.com | Facebook | Twitter: @victortanchen