Month: November 2016

Data-driven politics

Both the EU referendum in the UK and the presidential election in the USA have generated a lot of debate over what influenced the results. They were close campaigns. There are many things that could have led to a different outcome. I’ve been particularly interested in the debate over the role played by technology, the web and data.

I think the debate is missing how politics risks becoming driven by data rather than informed by it.

Technology-driven progress, globalisation, fake news, social media, malicious and mischievous actors

Technology is a major strand in the debate about globalisation, nation states, jobs and inequality. The web and data are at their best when they are world-wide, open and know no boundaries but it is essential that we use technology-driven progress to build a better society for everyone.

Technology and the web play a big part in the increased consumption of news online and on social media rather than through more traditional media channels and in particular the changing economics of media and the rise of fake news.

Technology, the web and data are also present in the investigations into the potential role played by organisations and people that may be malicious, for example foreign governments, or simply mischievous. In the UK a report claimed that a third of the tweets on the EU referendum in a one-week period were created by bots.

Whilst debate about hacking and bots continues in other countries, such as Germany, this story seems to be at risk of slipping off the radar in the UK and USA. A more informed debate about their effects and purpose would seem useful.

But there’s a fourth element that I’m barely seeing debated at all. Data-driven politics.

Data-driven politics

Politics has always gathered and used data to help it make decisions. This data comes from door knocking, censuses, opinion polls, focus groups and election results. In our current age of data abundance there are ever more and cheaper ways for anyone to gather and use data. Some of the uses by political parties seem to be at risk of copying the worst excesses of online marketing.

In the UK Labour leadership contest in 2016 organisations such as Momentum and Saving Labour talked of capturing email addresses and the reach of their social media channels. Neither group has been open about who is in control of this data, whether it is secure from hacking or how it is used.

Following the UK’s referendum on the EU one of the Leave organisations, Vote Leave, talked of its superior use of data and how it was used for targeted advertising. The BBC reported that:

“Their dream was of a system that could put information from Twitter, canvassing, polls, websites, apps, into one giant IT programme that would then churn out extremely sophisticated models that would reveal the areas most likely to vote Leave, down to the street.”

Other campaigns and political parties debunked their claims on Twitter and proudly said their data tools collected more information. No one questioned whether either was appropriate or healthy for democracy.

The New Statesman reported on the plans of a UKIP funder to start a new political party saying he claimed that Leave.EU’s email database was “a goldmine to anyone doing digital campaigning”. No one asked if it was either legal or right to transfer this “goldmine” to a new political party.

In America the Trump campaign was talking about its heavy use of data before the campaign finished. One insider on the data team said:

“There’s really not that much of a difference between politics and regular marketing.”

I hope I’m not alone in thinking there should be a difference between politics and marketing.

The Trump team used Facebook to target particular adverts to discourage black Americans from voting. Following Trump’s victory there are reports of one of the major data analytics firms being employed on an ongoing basis by both the government and an ongoing Trump campaign organisation.

This increase in the use of data to both listen to and influence people in political debates raises a number of issues.

There are biases in data and in how we use data

Data has biases. This might occur because there are gaps in how we collect data: for example ~10–20% of the UK and US population are not online because of issues such as cost, disability, location or motivation. Data also includes the biases in society such as those affecting gender and race. Bias can also occur through the people who decide how to analyse data and code the algorithms. People write code and people are biased.

If our political parties increasingly use the web and data to get them over the electoral winning line then they are likely to focus their efforts on winning over groups that are well represented in the data and predictable by the algorithms. Other people may be ignored.
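A rough worked example shows how a gap in data collection can skew what a campaign thinks people want. The numbers below are hypothetical and chosen only for illustration: I assume 15% of people are offline and that support for some policy differs between the online and offline groups.

```python
from fractions import Fraction


def blended_support(online_share, online_support, offline_support):
    """Support across the whole population, weighting each group by its size."""
    offline_share = 1 - online_share
    return online_share * online_support + offline_share * offline_support


# Hypothetical numbers, chosen only for illustration.
online_share = Fraction(85, 100)     # assumes 15% of people are not online
online_support = Fraction(50, 100)   # assumed support among people who are online
offline_support = Fraction(70, 100)  # assumed support among people who are offline

true_support = blended_support(online_share, online_support, offline_support)
online_only_estimate = online_support  # what web data alone would suggest
gap = true_support - online_only_estimate

print(f"true support: {float(true_support):.0%}")
print(f"online-only estimate: {float(online_only_estimate):.0%}")
print(f"missed by: {float(gap):.0%}")
```

Under these assumptions a party relying only on web data would underestimate support by three percentage points, and would have no signal at all about why the offline group thinks differently.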

National slogans, targeted adverts

The recent campaigns hint at a trend towards very broad brush national slogans (Make America Great Again!, Take Back Control) coupled with targeted campaigns aimed at particular interest groups.

Someone working in the car industry living in the Northeast of England might see an advert telling them that a political party is supporting car factories in Sunderland but see nothing else about that party’s policies or beliefs. The political party can see how that person responds to the advert — whether they comment, share, like, or retweet it — and use that data to tailor their next advert.

Some of these campaigns will come through official channels but targeted campaigns will come through social media adverts, local (sub-brand in marketing speak) or unofficial channels. Coupled with the ongoing loss of funding for and trust in national journalism this will make it ever more difficult to have a coherent national debate where a society makes an informed choice about its future. Instead political parties will tell different groups of people what they think they want to hear based on data.

We risk becoming more fragmented and the importance of values and principles in politics could become ever weaker as politics becomes more data driven.

Now some of this type of political advertising occurs already but technology, the web and data allow it to happen at a larger scale and at a cheaper cost. I can only imagine the voices in political campaigns saying that this is a race and that the process must become faster and more automated through “smart” algorithms. As we have already seen in other sectors these algorithms risk embodying and multiplying the biases in the data.

Use of data in political campaigns will influence how politicians govern when in office

Finally, there is an ongoing debate about the use of data by governments and the private sector. This debate concerns the rights and responsibilities that people and organisations have when data is collected and used. There are calls for greater control by people and more scrutiny by regulators.

This debate needs to include political parties.

If our political parties believe that the only way to get elected is through the use of data and algorithms then they will use them. If that use is not questioned and people are not held to account then that use could be normalised. Politicians might carry those normalised beliefs into office and it risks affecting how they govern and how they legislate.

Data-driven politics

Politics can be improved by new technology, the web and data.

The web offers ways for more people to be engaged in politics and it gives them more tools to influence politics. The web can help with a transfer of power from the centre to communities and people. Data can provide better evidence for policies and make it possible for us to trial new policies before they are implemented on a large scale and at a big cost. Better use of data can help improve public services and the economy.

These things can be dazzling. But we need to recognise the risks. Not just that some technology innovation is pointless but also that some uses of technology are actively harmful. That they can harm individuals and communities and that copied wholesale into politics they can damage democracy.

Rather than being driven by data we need to encourage politics to be informed by data, to be open about how it uses data and for political parties to use data and technology to help people engage with politics and make better decisions based on both evidence and their values and principles. It’s up to all of us, particularly those of us with knowledge of technology and data, to help make sure that this happens.

Automated cars and data

Everyone’s talking about automated cars and how they will make it cheaper and easier for us to get from place to place. As well as helping us travel they will change our cities by freeing up space, save lives by reducing the number of driving accidents and lead to the loss of millions of driving jobs with the associated impact on people and communities.

If you’re reading this I bet you’ve heard this talk. If you live in one of the test areas in America, the UK, China, etcetera you may even have seen trials. There are skeptics, but I think that people will be able to gradually build ever safer and more automated cars. Once they do many people will choose to use them. Change is coming.

More Autonomous Cars Coming to Public Roads in 2016 Copyright © 2016 ENGINEERING.com

Making it easier and cheaper to move around, changing cities, saving lives, removing a type of job are complex things. There are many more secondary effects. Our policymakers need to consider the risks and benefits to help us get to a better society that includes automated cars and benefits everyone.

But I’m not seeing enough discussion of one important aspect of automated cars: data, and how security, privacy and openness can increase its impact.

Automated cars collect a lot of data

As well as transporting people and parcels automated cars will collect vast amounts of data. A human driver needs to look around to see street signs, the weather or cyclists. Similarly automated cars will need to collect data to make driving decisions.

http://dataconomy.com/how-data-science-is-driving-the-driverless-car/

Automated cars collect a lot of data. A PhD student recently calculated that a modern car already generates 25GB of data an hour. In 2013 it was reported that Google’s automated car generates 750MB of data a second. Earlier this year Comma.ai, a company that was working on automated cars, released 80GB of data generated during 7 1/4 hours of driving.
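Those three figures use different time units, so a quick conversion helps compare them. This is a back-of-the-envelope sketch only: it assumes the figures are gigabytes and megabytes, uses decimal units (1 GB = 1000 MB), and treats each reported number as a steady average rate.

```python
# Assumption: decimal units, and each reported figure is a steady average rate.
MB_PER_GB = 1000
SECONDS_PER_HOUR = 3600


def gb_per_hour_from_mb_per_second(mb_per_second: float) -> float:
    """Convert a rate in megabytes per second to gigabytes per hour."""
    return mb_per_second * SECONDS_PER_HOUR / MB_PER_GB


rates_gb_per_hour = {
    "modern car": 25.0,                                           # 25 GB an hour
    "Google automated car": gb_per_hour_from_mb_per_second(750),  # 750 MB a second
    "Comma.ai release": 80 / 7.25,                                # 80 GB over 7 1/4 hours
}

for source, rate in rates_gb_per_hour.items():
    print(f"{source}: {rate:,.1f} GB/hour")
```

On these assumptions the Google figure works out at 2,700 GB an hour and the Comma.ai release at roughly 11 GB an hour. The published Comma.ai dataset is probably a subset of what the car sensed, so the numbers are not a like-for-like comparison.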

This data includes such things as the car location, maps and video footage of the surrounding area, information about nearby traffic, accidents, weather information, the route of the car and information about any passengers or parcels that are in the car.

That’s a lot of data. How do we get the most value from it?

Security and privacy

The security of this data clearly needs to be considered. We need to protect the data collected by the car and the data that the car needs to be able to get to do its job. Car hacking is a real risk, and an automated car is likely to be more dependent on access to data than a car driven by a human. Data is already an under-recognised piece of critical national infrastructure; automated cars will only increase the need to strengthen it.

Silly Wired. Nexar, like any camera, isn’t just collecting your data it’s also collecting data about other people.

Privacy will also be an important consideration. If automated cars mishandle personal data about the people travelling in them or the people and things seen by their video cameras then some people will be damaged while other people may lose trust and choose not to use the cars.

Some of these issues will be explored by smartphone apps, like Nexar, that use the smartphone’s camera and microphone to collect data about car drivers, passengers, pedestrians and other cars.

But automated cars will collect far more data than a smartphone camera.

Automated cars will use data collected by other cars and people

Automated car manufacturers and policymakers should be thinking about security by design, privacy by design and how openness can help build the trust that will be needed to get the most impact from automated cars. Openness can help in other ways too.

The data collected by cars is needed for them to do their job but automated cars will also use data provided by other things and people.

Automated cars won’t be like a starting character in the Civilization games. They’ll be able to see the full map. Civilization made by Firaxis Games, image from VentureBeat

An automated car will not wake up in a factory, blearily blink its headlights and then discover the world like a video game player constantly surprised by new things. The car will have a reasonably accurate map of the world, will get weather data (what sensible car would choose to drive into a hailstorm that might damage its paintwork?) and be able to share data with other cars.

Just as we hear of traffic jams from other people via radio alerts or smartphone apps like Waze, the people designing and building automated cars have planned for them to be able to share news about traffic congestion or improvements to their basic maps. Those improvements are vital because map data, just like any other data, is not always 100% accurate. Things change. An automated Google car driving down a street might discover that a road is blocked off; by sharing this with other Google cars it can make Google’s service more efficient.

This all sounds like good use of data, but it’s not good enough. We can and should do better.

Data should be as open as possible while respecting privacy

Werner Herzog’s first automated car looked a lot like a boat.

Sebastian Thrun of Stanford says in Werner Herzog’s new documentary: “Whenever a self-driving car makes a mistake, automatically all the other cars know about it, including future unborn cars.” But surely the only way that all self-driving cars will ‘automatically’ know of all other mistakes is if the data that describes those mistakes is available beyond just the automated cars of one manufacturer?

At the Open Data Institute we think that we get the most value from data when it is as open as possible while respecting privacy.

The OpenStreetMap team’s 2016 April Fool’s spoof was a plan to launch their own automated car. They said: “our self-driving car breaks new ground by automatically correcting OpenStreetMap data based on your driving behaviour”. The story was a spoof but this bit (regardless of whether it’s OpenStreetMap or another mapping organisation/community relevant to a particular country or city) is one of the ways that cars can share data with each other and with other people.

Mapping is a shared problem. All cars, automated or not, will benefit from better maps. As will pedestrians, cyclists, local authorities planning new infrastructure investments, etcetera. Collaboratively maintaining open mapping data between all of these people can reduce costs and improve quality. Facebook are happy to collaboratively maintain open mapping data as they recognise the value in this approach. Automated car manufacturers, mapping organisations and policymakers should be too.

Reducing accidents is another shared problem. The machine learning algorithms that will drive automated cars will learn faster and more accurately from more data. Sharing detailed data about the conditions in place when an accident occurred will save lives.

People will ask an automated car to drop them off at an address. That address may not be in the current list of addresses — perhaps it’s a new flat? — so the person may teach the automated car where it is. The address could then be sent to an open address register as a potential improvement to the data. The next automated car will know about it but addresses are vital for many other things from pizza delivery to an ambulance. We should be maintaining addresses as efficiently and openly as possible. Collaborative maintenance helps with that and openness means that anyone can use it.

There will be many other types of data collected by the car that when opened up in this way will improve transport services, save lives and make things better in other sectors.

Live weather conditions (something that the lovely folk at TransportAPI are working on). Air quality. Congestion data. Aggregated movement of people around a city. Etcetera.

The impact of opening up this data will be felt not just in better automated car services but in other services and sectors that use the same datasets. Automated car manufacturers are in the transport business, not the mapping or air quality business. Publishing the data openly will help them tackle shared problems and increase the impact of the data. Everyone benefits from better and more open data.

Automated car data should be secure, private and open by design

The Open Data Institute’s data spectrum. The most important thing about data is who can access and use it. Mapping an automated car’s data against this data spectrum would be very interesting.
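As a first, very rough attempt at that mapping, the sketch below places some of the data types mentioned in this post on the spectrum's three broad bands (closed, shared, open). The placements are my own illustrative assumptions, not an ODI classification, and real decisions would depend on privacy and security analysis.

```python
from enum import Enum


class Access(Enum):
    """The three broad bands of the data spectrum."""
    CLOSED = "closed"
    SHARED = "shared"
    OPEN = "open"


# Hypothetical placements, for illustration only.
car_data = {
    "passenger details": Access.CLOSED,
    "video footage of the surrounding area": Access.SHARED,
    "accident conditions": Access.SHARED,
    "map corrections": Access.OPEN,
    "new addresses": Access.OPEN,
    "live weather conditions": Access.OPEN,
    "air quality": Access.OPEN,
    "congestion": Access.OPEN,
}


def group_by_access(data):
    """Group data type names by where they sit on the spectrum."""
    grouped = {level: [] for level in Access}
    for name, level in data.items():
        grouped[level].append(name)
    return grouped


for level, names in group_by_access(car_data).items():
    print(f"{level.value}: {', '.join(names)}")
```

Even a toy table like this makes the design question concrete: each type of data needs an explicit decision about who can access and use it.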

The transport sector has long been a leader in open data. The countries and organisations that have taken the lead in opening up this data have benefitted both from better services for people and through the creation of innovative new services like Google Maps and companies like Citymapper, TransportAPI and ITO World that create jobs and help get the data used.

As that seemingly inevitable next wave of change occurs with the rollout of automated cars that will improve transport, free up space, save lives by reducing accidents and impact jobs, let’s make sure we don’t forget about the data infrastructure that is necessary for those cars to do their job and that can create so much value for the rest of our society.

Making that data infrastructure secure, private and open by design will benefit everyone.

If you want to chat about the thoughts in this blog then tweet or mail me.






Words for the launch of the APPG on data analytics

These are the approximate words I said at the launch of the new All-Party Parliamentary Group (APPG) on data analytics on 31 October. An APPG brings together representatives from different political parties from both the House of Commons and House of Lords to pursue a particular topic or interest. Daniel Zeichner MP’s speech from the launch is also online. Other speakers were from TfL, Experian, CompareTheMarket and the Institute for Environmental Analytics. In person I wandered off topic a bit based on audience reactions but I promise that there were no cat jokes.

Hi, thank you to everyone who’s come along and for inviting us to speak. I work at the Open Data Institute, or ODI as it’s more commonly known. The ODI’s mission is to connect, equip and inspire people around the world to innovate with data.

It is based in London but the network is global. We have nodes and members on six continents and in every nation of the UK. We do research, train people, advise them, introduce them to people with similar interests, give them simple tools to help them publish and use data, incubate startups and encourage thinking on fundamental issues such as data infrastructure and how to use personal data in a way that creates trust. We do this with large businesses, startups, charities and governments. We are a global voice for the better use of data to deliver social, environmental and economic impact.

The ODI is a not-for-profit and was founded five years ago by Tim Berners-Lee and Nigel Shadbolt. Both of them are at the yearly ODI summit which takes place at the British Film Institute tomorrow.

Bringing people together to solve common problems

The ODI team at the 2015 summit. Don’t let anyone convince you that diversity in tech is impossible, it’s not. Image by Paul Clarke, CC-BY-SA.

The summit is kind of unique, as is the ODI. It brings together large corporates with charities and startups; people interested in global development and democracy with people interested in the latest smart cities and transport trends; people from local government, national government and reps from global institutions. The attendees and speakers come from around the world. They all believe that openness and data can benefit them and everyone else too. (you can watch a stream of many of the summit sessions)

Which brings me to this all-party parliamentary group on data analytics. I’m a big fan of democracy and I’m also a big fan of things that bring together people from different backgrounds such as elected representatives and peers from across the political spectrum to find common points of interest, or problems, where people can work together to get things done and make things better. It’s the type of approach we use to help bring together large sectors like banking and agriculture; another one will be announced tomorrow. I won’t spoil the surprise. (it was sports)

An age of data abundance

We are in an age of data abundance with billions more people and devices coming online. It’s ever cheaper to collect, use and publish data. A web of data is evolving that sits alongside and behind the web of documents which changed our lives when Tim Berners-Lee invented the web 20-odd years ago. Our experience from the last 5 years is that data will create the most value when it is as open as possible while respecting privacy: an open future. But the future is uncertain.

We need to work together to shape an open future because whilst the current wave of technology change has brought many benefits it also carries many risks. Privacy risks, monopoly risks, democratic risks. We need to overcome those risks and project a positive message to get to a good future.

Tim famously said “this is for everyone” when tweeting about the world wide web from the launch of the London Olympics in 2012. The type of open thinking that Tim showed when he gave away the web is going to be necessary if we are going to realise the brilliant potential of this new web of data to benefit everyone.

And that open thinking is what we hope to see from this all-party parliamentary group. As well as the rest of us we need government and legislators to play an active part in making this happen. Government can lead by example.

Data for everyone

We can benefit everyone if we build data infrastructure (vital reference datasets like maps, lists of local authorities and addresses, and tools, processes, policy, legislation, organisations) which is reliable, adaptable, trustworthy, and as open as possible. Open in the sense of culture as well as open data.

We need to provide data skills for citizens, business and policymakers, with policymakers using data both for evidence and as a tool to achieve their policy ends.

And we need to encourage open innovation. A bridge between academic research, public, private and third sectors, and a thriving startup ecosystem where new ideas and approaches can grow. Innovation that solves problems.

We describe this as the open future. A future where we’ve understood and tackled those risks, made data as open as possible and created benefits for citizens, businesses and government. Data for everyone.

There were questions

After we talked the audience asked questions covering a whole range of topics from data in manufacturing and engineering; trust in use of data; public sector reform; EU proposals for copyright and how that impacted on organisations holding data; and whether people should be paid when their data is used. A wide range, as you’d expect from something that connects together and underpins sectors across the economy.

The last two questions I found particularly interesting. Both of them seemed to come from applying models from the real world to something, data, which has different qualities. Data is non-rivalrous, it benefits from network effects, etcetera. That’s why the economics of data are different from other things and still being researched. The questions also seemed to come from an implicit assumption that we could use the concept of ownership in the physical sense of the word. We need to be careful in how we use the language of ownership to address questions about data. Physical world metaphors don’t readily fit the data world. And even our understandings and expectations of ownership in the physical world aren’t as simple as they seem. This blog from Ellen Broad is a good read and what I channeled in my response. I hope the APPG thinks about those questions and the concept of ‘data ownership’ deeply. Its members will be part of shaping the legislative environment that will help us get to that open future.
