Statistics: the why behind the what [post 41/100]

It’s all about the numbers, right? This is how we are supposed to make decisions or justify them, how we evaluate our success, how we understand the world around us. Everywhere you look these days, there are sexy new infographics showing, e.g., what percentage of us are jealous of our parents’ lifestyles, or prefer 70s retro to 80s retro, or open marketing emails from fashion brands we’ve never heard of. Big Data remains the Big Thing we’re all meant to have, or want, or worship.

I have a deep and healthy respect for science, and so I am a fan of evidence-based decision making. But there’s a problem with the way we’re doing it these days – we don’t necessarily know what the evidence means, but we’re making decisions based on it anyway.

Last week at MIPTV, there was (as usual) a lot of talk about people’s short attention spans. Bert Baker said, ‘People get bored in 2.5 minutes unless something crazy happens.’ And Rene Rechtman of Maker Studios claimed that ‘87% of Netflix viewing sessions are under 10 minutes.’

87%.

Even if that’s true, and I am not entirely convinced it is, what does it mean? Some people seem to be citing these numbers as support for more and more short-form (and usually digital-first) content. But what about the other statistics? Peter Jackson’s Tolkien trilogies have been among the biggest box office successes in history, and the shortest of the films clocked in at 144 minutes (the longest was over 200). I’ve seen them all and I’m pretty sure that well over 2.5 minutes went by without something crazy happening, a fair few times.

And what about all the binge watching on Netflix? As of the end of 2013, over 60% of its users were watching 2-6 episodes of a programme in a single sitting.

What I’m getting at – not for the first time – is that knowing what happened is not particularly useful unless you know why it happened. For example, what if (as we posited during the closing panel) that 87% of sessions on Netflix simply reflects the platform’s version of channel surfing? When you channel-surf, do you watch every channel for 10 minutes or more? Of course not. And going back to the infographic examples, maybe people aren’t really opening those marketing emails; maybe their email clients simply default to preview mode and download the graphics.
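
To make the channel-surfing point concrete, here is a minimal sketch in Python. The session lengths are invented purely for illustration (they are not Netflix data), and the helper name short_session_shares is my own. It shows how the same set of sessions can give two very different numbers depending on whether you count sessions or minutes watched.

```python
def short_session_shares(session_minutes, cutoff=10):
    """Return (share of sessions under cutoff, share of total minutes in those sessions)."""
    short = [m for m in session_minutes if m < cutoff]
    share_of_sessions = len(short) / len(session_minutes)
    share_of_minutes = sum(short) / sum(session_minutes)
    return share_of_sessions, share_of_minutes


# Hypothetical data: 87 two-minute "surfing" sessions and 13 ninety-minute sittings.
# These figures are assumptions chosen to match the 87% headline, nothing more.
sessions = [2] * 87 + [90] * 13

by_count, by_minutes = short_session_shares(sessions)
print(f"Sessions under 10 minutes: {by_count:.0%}")
print(f"Viewing time spent in those sessions: {by_minutes:.1%}")
```

With these made-up numbers, 87% of sessions are under 10 minutes, yet they account for only about 13% of the minutes watched. Whether real viewing looks anything like this is exactly the question the headline figure can’t answer.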

If you don’t know why people are doing a thing, you don’t know how best to respond. Any academic will tell you that if you’ve got enough data, you can find statistics to support pretty much any hypothesis you care to concoct – but that doesn’t mean you’ll be right, or that the action you take based on that hypothesis will have the desired outcome.

Getting to the why shows us not only what’s happened but what we can do about it. And understanding the why usually requires observing the people engaging in the behaviour. I’m not saying this can’t be done in a statistically rigorous way – I’m simply suggesting that a single number shows only one facet of behaviour, and one facet can be as misleading as it is fascinating.