For anyone watching the EU Referendum results come through on television, most of the focus was on individual declarations by each local authority area. The Populus analytics team, led by James Kanagasooriam, developed a tool to help clients get a clear picture of the overall national result earlier than media projections. By around 2am, our model was predicting that Leave was likely to win.
For the most part, the results of the EU Referendum were presented as a series of declarations by individual local authority areas. Using the Election Night model built by the Populus data analytics team, our clients, including a leading hedge fund, were able to plug in each declaration as it was made and see its impact on the likely result. Here’s how we built it.
Our starting point was to estimate how Eurosceptic or Europhile every local area was. There were several proxies for this based on actual behaviour rather than the answers people gave pollsters. For instance, we looked at how well UKIP performed in each area in the 2014 European Elections and, where they stood, how well they did in the General Election the following year.
Both were useful indicators, but by themselves they weren’t enough to produce reliable estimates of the likely level of support for Leave (and therefore Remain) in each local authority. Turnout at the 2014 European Elections had been extremely low, about 25 points lower than at the 2015 General Election. And at that General Election, some of the local variation in the UKIP vote was driven by seat-by-seat factors such as tactical voting, how marginal the Parliamentary seat was and even the known views of the local Conservative candidate.
So, as well as looking at how UKIP had done area by area, we looked at the detailed demographics of every local area in the country: its ethnic and religious mix, its levels of health and education, the make-up of its households and how its residents expressed their national identity. Using all of this information, from election results to census returns, we created an algorithm to identify the principal component, or overarching factor, that most efficiently predicted how the Leave (and therefore the Remain) vote was likely to vary from one local authority to the next.
Variables used to calculate the component
% Christian (Religion)
% White (Ethnicity)
% Fair Health (Health)
% Not Professional (Employment)
% Level 1 Education / (Level 4 Education + Level 1 Education)
% Co-habiting with children (Marital Status)
% English only (National identity) 
% 2014 UKIP vote share in European elections
The graph shows the statistical relationship between each of these variables and the overarching factor (principal component) predicting the variation in the Leave vote by local authority area. The one below shows the factor scores for all local authority areas in England, Wales and Scotland (Northern Ireland reported by parliamentary constituency).
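For readers who want to see the mechanics, the idea of a first principal component, with a loading for each variable and a factor score for each area, can be sketched in a few lines of Python. This is an illustration only and not the Populus code: the function name `first_component_scores` is hypothetical, the numbers are made up, only three of the variables are included, and standardising each variable before extracting the component is our assumption rather than something stated above.

```python
import numpy as np

def first_component_scores(X):
    """Return (loadings, scores, explained) for the first principal component.

    X has one row per local area and one column per variable.
    Hypothetical helper for illustration only.
    """
    X = np.asarray(X, dtype=float)
    # Assumption: standardise each variable so no single one dominates.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal directions, ordered by variance explained.
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[0]
    # The sign of a component is arbitrary; orient it so that the last
    # column (UKIP vote share in this toy example) loads positively.
    if loadings[-1] < 0:
        loadings = -loadings
    scores = Z @ loadings                    # one factor score per area
    explained = S[0] ** 2 / np.sum(S ** 2)   # share of total variance
    return loadings, scores, explained

# Made-up data: 5 areas x 3 variables
# (% White, % English-only identity, % UKIP vote share 2014).
X = np.array([
    [95.0, 70.0, 35.0],
    [90.0, 65.0, 30.0],
    [60.0, 40.0, 15.0],
    [55.0, 35.0, 12.0],
    [80.0, 55.0, 25.0],
])
loadings, scores, explained = first_component_scores(X)
```

In this toy example, areas with higher values on all three variables receive higher factor scores, which is the sense in which a single component can stand in for how Eurosceptic an area is likely to be.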
As the actual results came in one by one, we were able to compare each with the predicted result for that area. At the start of the night, each prediction was the vote share implied by the council’s factor score on the assumption that the overall result would be 50:50. When Leave or Remain exceeded its predicted vote share in any given area, the overall outcome estimate moved accordingly, and with it the predictions for the councils that had yet to declare. In this way we had a dynamic picture in which real results data replaced and refined the modelled data as the evening progressed.
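The updating logic can be illustrated with a deliberately simple sketch. It assumes a uniform additive swing: the average gap between actual and predicted Leave share in the areas declared so far, weighted by electorate, is applied to every area still to declare. That rule, the function name `project_national_leave`, the area labels and all the numbers are our assumptions for illustration; the article does not say how the model translated declared results into revised predictions.

```python
def project_national_leave(baseline, electorate, declared):
    """Project the national Leave share from the declarations so far.

    baseline:   area -> predicted Leave share (0 to 1) if the national
                result were 50:50
    electorate: area -> weight, e.g. expected number of votes
    declared:   area -> actual Leave share for areas that have declared
    Hypothetical helper; uniform swing is an assumption.
    """
    if declared:
        weight = sum(electorate[a] for a in declared)
        # Weighted average over-performance of Leave against the baseline.
        swing = sum(
            electorate[a] * (declared[a] - baseline[a]) for a in declared
        ) / weight
    else:
        swing = 0.0

    projected = {}
    for area, share in baseline.items():
        if area in declared:
            projected[area] = declared[area]  # real result replaces model
        else:
            # Shift the modelled share by the swing, kept within 0 and 1.
            projected[area] = min(1.0, max(0.0, share + swing))

    total = sum(electorate.values())
    national = sum(electorate[a] * projected[a] for a in baseline) / total
    return national, swing, projected

# Made-up example with three equally sized areas.
baseline = {"A": 0.60, "B": 0.40, "C": 0.50}
electorate = {"A": 100, "B": 100, "C": 100}

before, _, _ = project_national_leave(baseline, electorate, {})
after, swing, projected = project_national_leave(
    baseline, electorate, {"A": 0.64}
)
```

Before any declaration the projection sits at the 50:50 baseline. Once area A declares four points above its predicted Leave share, the projections for B and C rise by the same amount and the national estimate moves with them.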
Download the Appendix of Component Scores