London Office of Data Analytics Pilot: two weeks of showing and telling to focus the data science and sharpen the overall approach

It’s been six weeks since our kickoff workshop for the London Office of Data Analytics (LODA) pilot programme, a joint venture between the GLA and Nesta, with involvement from nearly half of the London boroughs. The broader context shows a real sense of growing intent and purpose around data sharing for impact. In the same week as our latest LODA meeting – a show and tell session that we report on here – London Councils announced a deal for CIPFA (a public sector accountancy agency) and BAE Systems to launch a counter-fraud hub driven by data analytics.


But back to our own exercise. The goals are straightforward: to find actionable insights that save money on public services and, in the process, to show that joining up data from multiple boroughs can lead to solutions benefitting Londoners that wouldn’t be possible otherwise. The consensus view is that unlicensed HMOs (houses in multiple occupation) are an issue that is both important to the boroughs and well suited to a data-driven approach that will lead to those much desired practical, identifiable outcomes. The task of working out a detailed approach to the problem has now begun.


This has involved Nesta, the GLA and the ASI data science team (who will be performing the analysis for the LODA pilot) meeting with Boroughs, who have been sharing with us how they currently find unlicensed HMOs. As expected, this is not a simple problem. Methods vary, as does the data available. The interpretation placed on top of a base level of licensing also differs from borough to borough, and consequently so do the types of HMOs that are licensed in each borough. What is also clear is that the Boroughs we spoke to recognise that more can be done to increase the identification of HMOs, and that this will drive a series of policy and business process related outcomes.


So where do we take this? Unsurprisingly, excellent work is already being done across the city on this topic. This gives us a nice problem. Recognising the value in scaling up existing activities, and indeed some existing collaborative working, a quick win is to establish ways in which the better sharing of data and information can lead to the better targeting of unlicensed HMOs and rogue landlords operating across borough boundaries. Beyond this, three approaches have been identified which are shaping the next stages of the data science exercise. These are:

  • Building preliminary predictive model(s) for HMOs needing a licence under the mandatory government scheme, with data that is currently being used by one of the Boroughs for making predictions about problematic dwellings. The aim here is feature selection: using machine learning to understand the predictive power of each type of data and allowing us to hit the mute button on data in the model which are of very low predictive value.
  • Working with all of the participating Boroughs to share and join up the data that is found to be most diagnostic of HMOs needing a licence, with the caveat that a dataset is only useful for prediction if all or most of the boroughs hold it.
  • Building on existing successes of the Boroughs in mapping the holdings of rogue landlords, with the intention of incorporating this into the predictive model for HMOs in the future.
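
To make the first two steps above a little more concrete, here is a minimal sketch of the idea. It is not the ASI team's model: it swaps in a simple information-gain score as a stand-in for "predictive power", and every feature name, borough name, record and threshold in it is hypothetical. It keeps a feature only if it both tells us something about the licensable-HMO label and is held by most of the boroughs.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on a categorical feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def select_features(records, labels, holdings, min_gain=0.05, min_coverage=0.75):
    """Keep features that are both informative and held by most boroughs.

    records:  list of dicts mapping feature name -> value (one dict per property)
    labels:   1 if the property turned out to be a licensable HMO, else 0
    holdings: dict mapping borough name -> set of feature names that borough holds
    The two thresholds are arbitrary illustrative values.
    """
    selected = {}
    for feature in records[0]:
        gain = information_gain([r[feature] for r in records], labels)
        coverage = sum(feature in held for held in holdings.values()) / len(holdings)
        if gain >= min_gain and coverage >= min_coverage:
            selected[feature] = (gain, coverage)
    return selected


# Hypothetical toy data: eight inspected properties and three made-up features.
tenancy = [1, 1, 1, 1, 0, 0, 0, 0]
noise = [1, 1, 0, 0, 0, 0, 0, 1]
garden = [1, 0, 1, 0, 1, 0, 1, 0]
records = [
    {"multiple_tenancy_records": t, "noise_complaints": n, "has_garden": g}
    for t, n, g in zip(tenancy, noise, garden)
]
labels = [1, 1, 1, 0, 0, 0, 0, 1]

# Hypothetical record of which borough holds which dataset.
holdings = {
    "Borough A": {"multiple_tenancy_records", "noise_complaints", "has_garden"},
    "Borough B": {"multiple_tenancy_records", "noise_complaints", "has_garden"},
    "Borough C": {"multiple_tenancy_records", "has_garden"},
    "Borough D": {"multiple_tenancy_records", "has_garden"},
}

selected = select_features(records, labels, holdings)
for name, (gain, coverage) in selected.items():
    print(f"{name}: gain={gain:.3f}, coverage={coverage:.0%}")
```

In this toy data the noise complaints feature is the most informative one, but only half of the made-up boroughs hold it, so it is muted along with the uninformative garden feature. That is the caveat in the second step: a dataset only helps a shared model if all or most of the boroughs can supply it.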

This is just a starting point, but after further iterations we hope to offer a new model of targeting inspections based on sharper identification of likely HMOs. This can also be fed into longer term goals such as establishing the extent to which HMO data can be used to deliver other interventions in allied areas of policy like public health and anti-social behaviour.


In the first of a series of show and tell forums held at City Hall this week, the ASI data science team shared their initial understanding of the problem with representatives of interested Boroughs and outlined the steps above that they plan to take to find solutions to these challenges. This skill sharing has been mentioned previously as a key aspect of this pilot and is just as critical to a project like this as the number crunching and analytical work that will take place down the line.


What stood out is that this is an ongoing, evolving group effort, in which a key watchword will be “adaptability”. As the data science team learns from those providing frontline services and tailors the analysis based on feedback about what is going to be the most useful and actionable for the most boroughs, so the learning goes the other way as well. Feedback on what data sources are the most valuable also needs to be taken into account by service providers in the future when they decide what information to record. We are now cementing this agile approach to be used in the pilot over the coming weeks so that this quick back and forth (aka ‘sprints’ in the software development world) will bear fruit, and ensure that the work done is genuinely useful for all involved. Watch this space for more updates as the pilot programme continues.