Late last year I was sitting in a little bar in Dotonbori, Osaka, watching the presidential election results come in. The television was tuned to the Japanese national broadcaster NHK, and from time to time new results would appear. Obama kept piling up wins across the country until, at around 1:30pm Japanese time, CNN called the election for Obama. This result, like most of the earlier Electoral College results, was met with a muted response: a mild round of polite applause, then back to eating, drinking and text messaging friends. The relative indifference to the result couldn’t have been further from the hype my friends were experiencing first hand as they watched in New York and San Francisco. But as I watched the results, my emotions mirrored those of the Japanese salaryman next to me in the bar – I felt very little in the way of anticipation or surprise. I had followed Nate Silver’s models over the previous 6 months, and as Election Day drew closer the models became more and more accurate. Checking the website fivethirtyeight.com on the Friday before the election, it was clear from the data and models that Silver had built that Obama had an estimated 86% chance of winning. As the results came through, the models proved to be correct — there was simply nothing to be surprised about. It was the political equivalent of being up by 6 runs at the top of the 9th at a Giants baseball game — time to go home and beat the traffic.
The truth is that we are getting very good at predicting things like elections, and the closer we get to election day, the more accurate the predictions we can make. The models that power these predictions have become increasingly complex over the past decade, and for someone like Nate Silver this improvement process has involved thousands of hours of work to get to what could be considered a v3.0 of his election model. Add to this an exponential growth in available data and on-demand distributed cloud computing, and the predictive power of these models starts to become impressive. Three days out from the presidential election in the United States we can predict the likelihood* of either candidate winning — we have the technology. Get over it. Predicting election results 72 hours out will be as routine** as predicting the weather — and just as with the weather, we will get much better at it over time. It’s fair to say that with electoral prediction we are only at the beginning of this upward curve.
For those of us in the data science field, the ability to predict election results comes as no surprise. As a group we spend our days trying to predict everything from which ads a Facebook user will click on, to which way the stock market will move in the next tenth of a second, to the perfect movie to watch on a Wednesday evening. And that is to say nothing of our collective efforts to predict the evolution of conflicts, the daily movement of people and the outbreak of new viruses. Over the last 10 years we have gotten very good at predicting human behavior. Over the next 10 years the rest of the world will realize the implications of this — and a lack of election surprises is one of them.
Which brings us to the first realization: short-term prediction is in many ways a cheap (albeit impressive) statistical trick. Kind of like counting cards at a blackjack table. In predicting the election result you are basically using a sampling technique to measure the preferences of the population, blending the results from different polls, and accounting for systematic biases in the final weighted result. Indeed, Nate Silver was far from the only person to do this; many statisticians claimed similar predictive results using similar techniques. But what if, instead of predicting that Mitt Romney would lose the election, you were able to use data, algorithms and computational models to engineer a different result? What if you could use data to change the course of the election? You see, it is one thing to tell the Romney camp that they are going to lose; it is another to tell them how to manipulate public opinion in a way that would allow them to win. Don’t like the way an election is unfolding? Then alter your behavior in a way that puts you outside the predictive reach of the models forecasting your loss. In the same way that Kasparov tried to outsmart Deep Blue by playing obscure openings to stay outside the machine’s preparation, this idea could be applied to political campaigns. This is the next frontier of data science – we are slowly moving from “prediction engines” to what we will start to call “persuasion engines” – or what others with a more skeptical perspective might call “manipulation engines”.
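The blending step described above can be sketched as a weighted average of polls with a per-pollster bias correction. Everything below — the poll numbers, the "house effects", and the square-root weighting — is invented for illustration; real models like Silver's are far more elaborate.

```python
# Minimal sketch of poll aggregation with bias correction.
# All poll figures and house effects are hypothetical.

def aggregate_polls(polls):
    """Blend polls into one estimate of candidate support.

    Each poll is (support_pct, sample_size, house_effect), where
    house_effect is the pollster's estimated systematic lean in
    percentage points (positive = tends to overstate the candidate).
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for support, n, house_effect in polls:
        adjusted = support - house_effect   # strip the systematic bias
        weight = n ** 0.5                   # weight by sqrt(sample size)
        weighted_sum += adjusted * weight
        total_weight += weight
    return weighted_sum / total_weight

polls = [
    (51.0, 800, +1.5),   # pollster that historically leans +1.5 pts
    (49.5, 1200, -0.5),  # pollster that leans -0.5 pts
    (50.2, 600, 0.0),    # pollster with no measured lean
]
print(round(aggregate_polls(polls), 2))
```

The same structure underlies far richer aggregators; the "trick" is that sampling error shrinks predictably as you pool more polls, while the house-effect terms absorb the systematic biases.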
We can think of prediction as holding up a high-resolution mirror to the current world — it allows you to see where things are and where they are likely to go, assuming no major changes — and persuasion/manipulation as understanding what happens to the landscape if you were to shatter that mirror. Because if you don’t like what the predictive models are telling you, the first thing you are going to want to do is change your behavior. To accurately change the behavior of people, you need to build models of the individual voters: understanding what drives them, the influence of global media, personal narratives and social network effects. This is the world of Agent Based Modeling, where voters are represented as autonomous agents within massive computer simulations.
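A toy version of such an agent-based simulation might look like the following. The network structure (a simple ring of neighbors), the influence parameters and the global media signal are all assumptions made purely for illustration; real agent-based models of electorates are vastly larger and richer.

```python
# Toy agent-based model of opinion dynamics. The ring network,
# influence strengths and media signal are illustrative assumptions.
import random

def simulate(n_voters=100, steps=50, influence=0.1, media_pull=0.02, seed=42):
    """Each voter holds an opinion in [-1, 1]. At every step a voter
    drifts toward the average opinion of its two ring neighbors
    (social influence) and slightly toward a global media signal."""
    rng = random.Random(seed)
    opinions = [rng.uniform(-1, 1) for _ in range(n_voters)]
    media_signal = 0.5  # broadcast message favoring one candidate
    for _ in range(steps):
        updated = []
        for i, op in enumerate(opinions):
            neighbor_avg = (opinions[i - 1] + opinions[(i + 1) % n_voters]) / 2
            op += influence * (neighbor_avg - op)   # peer effect
            op += media_pull * (media_signal - op)  # media effect
            updated.append(max(-1.0, min(1.0, op)))
        opinions = updated
    return opinions

final = simulate()
print(sum(1 for op in final if op > 0), "of 100 simulated voters lean positive")
```

Even this toy shows the appeal of the approach: you can re-run the simulation with a different media signal or network and watch the aggregate opinion shift, which is exactly the "what if we shatter the mirror" question.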
The 2012 election showed us the first hints of this persuasion engine being put to use. While all the media attention was focused on Nate Silver and his predictions, Obama was quietly assembling a team of data science “rock stars” to run his campaign strategy. These statisticians were not there to predict the results (although they were very good at that too); they were employed to crunch data to change the course of the election — and they did this with a new algorithm used to calculate a voter’s “persuasion score”: a score that tried to capture not just a voter’s current opinion, but how that opinion was likely to change after interactions with the campaign. It turns out that under the right circumstances, personal political beliefs can be quite malleable. Republicans can become Democrats, and Democrats can become Republicans.
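One plausible way to think about a persuasion score is as an uplift estimate: the predicted probability a voter supports you if contacted, minus the probability if left alone. The campaign's actual algorithm has not been published, so the model form, the features and every coefficient below are entirely hypothetical.

```python
# Hypothetical sketch of a persuasion score as an uplift estimate.
# Features and coefficients are invented; this is not the campaign's model.
import math

def support_prob(features, contacted):
    """Toy logistic model of the probability that a voter supports the
    candidate. features = (age_scaled, partisanship), both in [-1, 1]."""
    age, partisanship = features
    score = 0.2 * age + 1.5 * partisanship
    if contacted:
        # Assumed effect: contact moves moderates more than partisans.
        score += 0.4 * (1 - abs(partisanship))
    return 1.0 / (1.0 + math.exp(-score))

def persuasion_score(features):
    """Expected gain in support probability from one campaign contact."""
    return support_prob(features, True) - support_prob(features, False)

moderate = (0.1, 0.0)   # weakly attached voter
partisan = (0.1, 0.9)   # strongly committed voter
print(persuasion_score(moderate), persuasion_score(partisan))
```

The key design idea is that the score ranks voters not by how likely they are to support you, but by how much a contact is expected to move them — which is why, in this sketch, the moderate scores far higher than the committed partisan.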
Some of the early implementations of this persuasion technology came into play on the night of the first debate. When the President stumbled and the predictive models swung back in Romney’s favor, the race was once again tight and the Obama data science team was about to have its skills pushed to the limit. For the next 48 hours they ran models and crunched numbers to figure out how to change the course of the election — how to change the minds of voters. Which voters should they target? What version of the Obama story should they tell them? Which friends would be the best people to deliver the message? How frequently did the message need to be repeated? The data was fed into their models, which started spitting out answers to these questions and values for these variables: new target groups and new messages to tell them. They found groups that the Romney team didn’t know existed and told them stories they hadn’t heard before. Stories their models had said would have the right narrative structure to create a shift in beliefs. They hit these groups hard through Facebook and robo-calls, and Romney simply had no way to respond — it was a move that knocked the predictive models out of the world they had been trained in. Confronted with a prediction they didn’t like, the Obama data science team moved into persuasion/manipulation mode, and the course of the election changed.
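The targeting questions above — whom to contact, with what message, over which channel — can be caricatured as a budgeted selection problem. The contact options, expected opinion shifts and costs below are invented, and this greedy sketch merely stands in for whatever optimization the campaign actually ran.

```python
# Hypothetical sketch of budgeted contact targeting. All numbers invented.

def plan_contacts(estimates, budget):
    """estimates: list of (voter, message, channel, expected_shift, cost).
    Greedily pick contacts by expected shift per unit cost until the
    contact budget runs out."""
    plan = []
    for voter, message, channel, shift, cost in sorted(
            estimates, key=lambda e: e[3] / e[4], reverse=True):
        if cost <= budget:
            plan.append((voter, message, channel))
            budget -= cost
    return plan

estimates = [
    ("voter_a", "economy story", "facebook", 0.08, 1.0),
    ("voter_a", "economy story", "robo-call", 0.03, 0.5),
    ("voter_b", "healthcare story", "friend", 0.12, 2.0),
    ("voter_c", "economy story", "facebook", 0.01, 1.0),
]
print(plan_contacts(estimates, budget=2.5))
```

Even in this caricature the interesting behavior appears: the highest-impact contact (voter_b via a friend) is skipped because it is too expensive relative to the remaining budget, which is the kind of trade-off such a system resolves at scale.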
These specialized tools will become increasingly mainstream over time. The team and I at Quid are working on improving strategic decision-making tools in areas like this. We saw the first outputs from these tools last year during our monitoring of the Occupy Wall Street political movement. For the first time we could see the complexity of the landscape, with all 30,000 different OWS stories mapped out in front of us, the different clusters representing different ideas coalescing and fragmenting in real time. We could see which ideas were dominating the conversation, the fracturing of the Occupy Oakland and Occupy New York movements, and the rise of the concept of ‘inequality’ as a key meme driving the connection to the political discussion. Monitoring this landscape, we could predict which clusters would start to dominate the conversation, and we could also start to think about how to inject new ideas into the landscape: to exploit the white space between existing ideas in a way that could drive the conversations in different directions, merge two sets of ideas together, or fragment one large group into many smaller ones. This technology has profound implications in the political, military and, of course, PR/marketing worlds.
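A drastically simplified sketch of this kind of story-landscape mapping: group headlines into clusters by word overlap. Quid's real pipeline is far more sophisticated; the sample headlines, the Jaccard similarity measure and the threshold here are all illustrative assumptions.

```python
# Simplified story clustering via word-overlap (Jaccard) similarity.
# Headlines and threshold are illustrative; not Quid's actual pipeline.

def jaccard(a, b):
    """Fraction of shared words between two headlines."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    return len(words_a & words_b) / len(words_a | words_b)

def cluster_stories(stories, threshold=0.25):
    """Greedy clustering: attach each story to the first cluster whose
    seed story is similar enough, otherwise start a new cluster."""
    clusters = []
    for story in stories:
        for cluster in clusters:
            if jaccard(story, cluster[0]) >= threshold:
                cluster.append(story)
                break
        else:
            clusters.append([story])
    return clusters

stories = [
    "occupy wall street protesters march on brooklyn bridge",
    "occupy wall street march shuts brooklyn bridge",
    "income inequality debate reaches congress",
    "congress debates income inequality measures",
]
for cluster in cluster_stories(stories):
    print(cluster)
```

Once stories are grouped this way, watching clusters grow, shrink, merge or split over time is what gives you the real-time map of the conversation described above.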
The 2012 election will ultimately go down as the ‘prediction’ election (aka the Nate Silver election); 2016 may well be the first ‘persuasion’ election: an election where models accurately predict the likely outcomes of different strategic decisions for both parties in real time. Where instead of trying to predict which voters will respond favorably to your existing message, you instead find voters who don’t like what you are saying and use targeted algorithms to create custom narratives that will change their voting preferences. But of course one side will only have this technology for so long — eventually both sides will have it and we will move into a world of competition between algorithms — stay tuned.
* Which is not to diminish in any way the technical achievements of Nate’s modeling — think of it like the iPhone, amazing technology that through ubiquity has now been rendered mundane.
** Close elections will be predictable only in the ‘likelihood’ of a result. This does not mean the model will tell you precisely which candidate will win — rather, it will tell you the probability of the candidate winning. Thus even a 93% certain result will still lose 7% of the time.
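The footnote's point can be checked with a quick Monte Carlo sketch: simulate many elections at a fixed 93% win probability and count how often the favorite actually loses.

```python
# Monte Carlo check of the footnote: a 93%-favorite still loses ~7% of runs.
import random

def simulate_elections(win_prob=0.93, trials=100_000, seed=1):
    """Return the fraction of simulated elections the favorite wins."""
    rng = random.Random(seed)
    wins = sum(1 for _ in range(trials) if rng.random() < win_prob)
    return wins / trials

print(simulate_elections())
```

With 100,000 trials the observed win rate lands very close to 0.93, and the remaining ~7% of runs are the upsets: a probabilistic forecast being "wrong" in those cases is not a failure of the model, it is the model working as stated.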