Pollsters are skulking in the wake of the US election, but one agency crowing is Havas Cognitive, which said it had foreseen Hillary Clinton’s defeat as part of data analysis it conducted for ITV's political coverage.
The agency told Campaign on 7 November, before the election, that its system would be appearing in ITV’s coverage, but it neglected to mention any prediction.
So should we be skeptical? Definitely. However, the Havas team has fielded several of our questions and, taking them at their word, it appears they are on to something.
In a nutshell: the calls they made from their data were not 100% right, but what’s most interesting is that they could offer a credible alternative to traditional methods of polling.
The burning question first though – why didn’t they go public with a prediction?
The answer is that ITV didn’t commission the system to make a prediction about the outcome, but to offer insight into what US voters were thinking and feeling about different issues and candidates. The voting intention data was a sideline, and even when the Havas team felt confident enough to want to talk about it on air, it was precluded from doing so by ITV’s arrangement with a polling organisation.
Fair enough. But when Havas say "the data kept predicting a Trump victory", what exactly do they mean?
At this point it’s worth introducing Joe Harrod and John St Leger from Havas Cognitive, which is a new practice built around the group’s 21-year-old partnership with IBM. Havas was an early licensee of IBM’s artificial intelligence capability Watson, which is one of the components the pair used to build the bespoke EagleAi system for ITV.
Harrod, the EMEA lead for Havas Cognitive, explains: "EagleAi was analysing over a billion social posts. We narrowed it down to voters who had expressed a clear voting intent one way or the other. The output was a map of voting intent – which showed a Trump win."
Interestingly, the map did not change significantly over time, according to Harrod. It was based on historic data going back to July (when the candidates were formally nominated). There were daily updates to the data, up to the morning of the election, but they did not make much difference.
"To be quite honest we didn’t want to believe we were more accurate than the pollsters, but it was obvious by the time Florida was called."
"We called Iowa, Wisconsin, Florida and Ohio [correctly]," says St Leger, the lead developer for Havas Cognitive.
The big deal about doing that is that those four states, plus Virginia [which EagleAi wrongly called for Trump], are disproportionately important to the presidential race.
Under the US system, the candidate who gets the most votes in a state is awarded all of its electoral college votes (Maine and Nebraska, which can split theirs, are the exceptions). More populous states have more electoral college votes, and whoever gets at least 270 of the 538 available wins.
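The winner-take-all counting described above can be sketched in a few lines of Python. This is only an illustration: the three states, their electoral vote counts and their popular vote totals are made up, and the function names are hypothetical rather than anything from EagleAi.

```python
# Illustrative sketch of winner-take-all electoral college counting.
# All state names and vote totals below are invented for demonstration.

ELECTORAL_VOTES_TO_WIN = 270  # threshold stated in the article


def state_winner(popular_votes):
    """Return the candidate with the most popular votes in one state."""
    return max(popular_votes, key=popular_votes.get)


def tally_electoral_votes(states):
    """Give each state's full block of electoral votes to its popular-vote winner."""
    totals = {}
    for state in states.values():
        winner = state_winner(state["popular_votes"])
        totals[winner] = totals.get(winner, 0) + state["electoral_votes"]
    return totals


def overall_winner(totals, threshold=ELECTORAL_VOTES_TO_WIN):
    """Return the candidate at or above the threshold, or None if nobody reaches it."""
    for candidate, votes in totals.items():
        if votes >= threshold:
            return candidate
    return None


example_states = {
    "State A": {"electoral_votes": 29, "popular_votes": {"X": 4_600_000, "Y": 4_500_000}},
    "State B": {"electoral_votes": 6, "popular_votes": {"X": 650_000, "Y": 800_000}},
    "State C": {"electoral_votes": 18, "popular_votes": {"X": 2_800_000, "Y": 2_400_000}},
}

example_totals = tally_electoral_votes(example_states)
```

Candidate X wins State A by a narrow margin but still collects all 29 of its votes, which is why a handful of closely fought, populous states matter so much.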
EagleAi’s record was far from perfect, with wrong calls on nine out of 50 states.
Six of those were tight but there were three clangers: EagleAi called New Jersey for Trump (Clinton won by 55% to 41.8%), Utah for Clinton (Trump won by 46.8% to 24.8%) and North Dakota for Clinton (Trump won by 64.1% to 27.8%).
"The system is miles from being infallible! But we built this in less than a month and we weren’t focused on prediction," says Harrod. "With more time it could be substantially improved."
But even with those wrong calls, EagleAi was right about the overall result of a Trump win.
Its tally was wrong, but crucially it was wrong in the opposite direction from the polls forecasting a Clinton victory. Its predictions translated to Trump winning 338 electoral college votes, when in the event he won only 290.
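For readers who want the arithmetic spelled out, the short sketch below uses only figures quoted in this article (338 predicted, 290 won at the time of writing, nine wrong calls out of 50 states, 270 needed to win). The variable names are illustrative.

```python
# Figures quoted in the article; nothing here comes from EagleAi itself.
PREDICTED_TRUMP_VOTES = 338
ACTUAL_TRUMP_VOTES = 290
VOTES_TO_WIN = 270
STATES_TOTAL = 50
STATES_WRONG = 9

# How far the electoral college projection overshot the result.
overshoot = PREDICTED_TRUMP_VOTES - ACTUAL_TRUMP_VOTES

# Share of states called correctly.
state_accuracy = (STATES_TOTAL - STATES_WRONG) / STATES_TOTAL

# The projection and the result fall on the same side of the winning line.
same_winner = (PREDICTED_TRUMP_VOTES >= VOTES_TO_WIN) == (ACTUAL_TRUMP_VOTES >= VOTES_TO_WIN)

print(f"Overshoot: {overshoot} electoral college votes")
print(f"States called correctly: {state_accuracy:.0%}")
print(f"Same overall winner: {same_winner}")
```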
The team is now bullish about the potential for the system, both next time around in the US and in marketing scenarios.
"We would never have been confident enough to call it a prediction or make a pronouncement based on comparison to pollsters," St Leger says. "Next time, we would take the risk."
Within the marketing world EagleAi suits "every conceivable field of business, every campaign, every new product cycle", says Harrod.
"Cognitive enables us to understand what really matters to people, and when they actually need something from a brand. It's going to completely re-shape marketing in the next few years.
"As an example, I only start looking for a new car every time my wife gets pregnant – the rest of the time I couldn't give a monkey's for cars, car ads, car offers. Cognitive systems ensure that advertisers can support me when I need a bigger, uglier and more practical car – and save brands from wasting money on me the other 99% of the time."
And finally, no, they didn’t have a flutter on Trump.
"John is not a gambler but was saying that we should. I was on the fence like a wimp," says Harrod.
What type of system was EagleAi built on?
Joe Harrod: It's a bespoke Havas Cognitive build. We used GNIP for historical social data, and IBM Watson services for intent, key concepts and media sentiment. Voting intent was based on public social feeds and Natural Language Classifiers which we trained and refined.
What refinement did you need to apply to the results that the system put out?
John St Leger: First of all we made exclusions before the data came in, to avoid users who were tweeting excessively or employing bots. Alongside this we ran our intent classifiers to catch people with real voting intent, for instance "I'm voting ____ because", "I just voted for ____", "#imwithher", etc. We trained the Natural Language Classifiers with terms like these.
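A rough sketch of the two steps St Leger describes might look like the following. It is not the Havas system: hand-written regular expressions stand in for the trained Natural Language Classifiers, the cut-off of three posts per account is an invented number, and the sample posts are made up.

```python
import re
from collections import Counter

# Hypothetical cut-off: accounts with more posts than this are treated as
# excessive posters or possible bots and are dropped.
MAX_POSTS_PER_ACCOUNT = 3

# Hand-written patterns standing in for a trained classifier.
INTENT_PATTERNS = [
    re.compile(r"\bi'?m voting (?:for )?(\w+)", re.IGNORECASE),
    re.compile(r"\bi just voted for (\w+)", re.IGNORECASE),
]
HASHTAG_INTENT = {"#imwithher": "clinton"}


def drop_prolific_accounts(posts, max_posts=MAX_POSTS_PER_ACCOUNT):
    """Remove every post from accounts that posted more than max_posts times."""
    counts = Counter(author for author, _ in posts)
    return [(author, text) for author, text in posts if counts[author] <= max_posts]


def extract_intent(text):
    """Return the candidate a post declares for, or None if no clear intent."""
    for pattern in INTENT_PATTERNS:
        match = pattern.search(text)
        if match:
            return match.group(1).lower()
    lowered = text.lower()
    for tag, candidate in HASHTAG_INTENT.items():
        if tag in lowered:
            return candidate
    return None


sample_posts = [
    ("alice", "I'm voting Trump because of jobs"),
    ("bob", "I just voted for Clinton!"),
    ("carol", "Proud today #ImWithHer"),
    ("dave", "Watching the debate tonight"),
] + [("spam_account", "I'm voting Trump")] * 4

kept_posts = drop_prolific_accounts(sample_posts)
intents = [extract_intent(text) for _, text in kept_posts]
```

Here the four posts from `spam_account` are excluded before any intent is read, and the post from `dave` yields no intent because it never states a vote.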
Some people would say you can feed a lot of information into a machine, but not everything that affects how people vote can be measured. Are you saying you got enough information to make an accurate prediction?
Harrod: The short answer is yes. If you're looking at people who are explicitly saying "I'm gonna vote this way", then you can predict very clearly which way they are going to vote and that is still a massive and diverse sample from which to draw motivations and other insights.
Did you avoid your input being influenced or 'contaminated' by media coverage of what the polls were saying? If so, how?
St Leger: By screening prolific profiles we effectively didn't count large organisations and party boosters. However there's definitely a margin for error here. AI is all about probability and we didn't trust any statement of intent below 0.6 probability.
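The 0.6 cut-off can be illustrated with a small sketch. The threshold is the one St Leger quotes; everything else is an assumption for illustration, including the record layout, the idea that each post carries a state label, the simple majority count per state, and the invented sample data.

```python
from collections import Counter, defaultdict

MIN_CONFIDENCE = 0.6  # threshold quoted by St Leger


def trusted(predictions, min_confidence=MIN_CONFIDENCE):
    """Keep only intent predictions at or above the confidence threshold."""
    return [p for p in predictions if p["confidence"] >= min_confidence]


def call_states(predictions):
    """Call each state for the candidate with the most trusted intent statements."""
    by_state = defaultdict(Counter)
    for p in predictions:
        by_state[p["state"]][p["candidate"]] += 1
    return {state: counts.most_common(1)[0][0] for state, counts in by_state.items()}


# Made-up classifier output: one record per post that expressed voting intent.
sample_predictions = [
    {"state": "State A", "candidate": "trump", "confidence": 0.91},
    {"state": "State A", "candidate": "clinton", "confidence": 0.55},
    {"state": "State A", "candidate": "clinton", "confidence": 0.72},
    {"state": "State A", "candidate": "trump", "confidence": 0.60},
    {"state": "State B", "candidate": "clinton", "confidence": 0.80},
    {"state": "State B", "candidate": "trump", "confidence": 0.40},
]

kept = trusted(sample_predictions)
state_calls = call_states(kept)
```

Raising the threshold trades sample size for certainty: a stricter cut-off discards more posts but leaves fewer misread statements in the count.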
Are you aware of anyone else having used AI in this election? For example, the Trump or Clinton campaigns, or the bookmakers?
St Leger: This election has been clouded by the idea of bots being used by candidates on both sides - spammers and bots were something we had to look out for and they are a form of simple AI.
Harrod: Bookmakers and pollsters use very clever analytical models - none of them have publicised the fact that they're using AI. We know Sanjiv Rai called it right again.