Author: peterkwells

The depth of critical thinking

2017-07-18 / peterkwells / 0 Comments

A picture of a Charles Rennie Mackintosh chair by Chris 73 / Wikimedia Commons, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=857489

I made a bad joke at work recently. This isn’t necessarily unusual. The reaction to this bad joke made me think a bit more than normal though.

While reviewing some research on business models I observed that most of the models were predicated on the need to increase trust between businesses and their customers.

I wondered out loud if trust was in danger of becoming the next big over-used word and idly mused that we should get ahead of the game, joking that we should think about post-trust business models.

Unfortunately I was both believed and misheard. I was believed because sometimes I sound convincing (well, I am a middle-aged white man with a beard and a convincing poker face…). I was misheard because someone thought I said post-truth business models and, without me realising it, started researching that topic.

“Post-truth” was last year’s word of the year. It even has its own Wikipedia page. The Oxford dictionary describes it as:

an adjective defined as ‘relating to or denoting circumstances in which objective facts are less influential in shaping public opinion than appeals to emotion and personal belief’.

We often talk about post-truth at the Open Data Institute. We work with data after all. People ask our opinions on it. Some people tell us that better data and more facts are the answer to the challenge of “post-truth politics”. They ask us to imagine a world where someone reading a newspaper story can click on a fact to find out who produced it. And then click on the name of the fact producer to find out who funds them. And then click on the funder of the fact producer to understand their motives. This will soon cut down on those pesky emotions and bring facts back to their position of influence.

Unfortunately, there are problems with that vision.

Why and how will people click on a fact and what will they do next? We need to make it interesting for people to want to know more, to want to dive down beneath the story into the world beneath it. We need to make sure that the world beneath the story is present and linked together. We need to give people the critical thinking skills to navigate that world.

But even that risks not being enough. If you don’t believe me ask any philosophy student. One of their early courses will be on epistemology, the study of knowledge. They might be asked whether they can prove that the chair that they are sitting on is actually a chair. The students will quickly learn that for centuries, if not millennia, philosophers have been playing around with this and similar propositions.

A brain in a vat, by Alexander Wivel, knol article, CC BY 3.0, https://commons.wikimedia.org/w/index.php?curid=8719730

The student will be asked to prove that they can actually sense the world and experience the chair rather than it being a trick being played on them by a Cartesian demon, someone controlling their brain in a vat, or (heaven forbid) someone about to be tortured by Roko’s basilisk for failing to bring about the AI singularity.

The students will soon realise that the concept of a chair can mean different things to different people and get taught that many languages and cultures don’t differentiate between blue and green. They will put up countless facts about chairs and a good philosophy lecturer will knock them all down. Minds get blown in epistemology courses.

At the end of a bewildering course the philosophy lecturer might ask their students to vote on whether they have managed to prove that their chair is a chair. Some hands will go up for no, some for yes, others might waver a bit. When my own epistemology course got to that point the lecturer held a vote and then started laughing. “Does it matter?”, he said, “is it a comfortable chair and does it stop your bum from hitting the ground? Yes? Then it’s a flipping chair.”

You see the world is already complex enough and humans can decide to make it even more complex by diving into all the facts to try to empirically prove everything. Some of us love to do that and there are times when it is both fun and important to lose ourselves in a sea of facts and data to see what we learn. There are great things out there waiting to be discovered.

But in our daily lives we often need to dive just deep enough. To not submerge ourselves in the full sea but instead to simply go to a reasonable level and form an idea that we can test. We can then hold that conclusion up to scrutiny. Perhaps by sharing it with a range of other people so that we can learn from their responses or by doing a simple experiment (did bum hit ground? No? Probably chair).

This can need some fearlessness, as we have to be open to being wrong, but forming and testing ideas can often be a quicker path to a decent truth than all of the facts and data in the world. It might also help stop some myths and falsehoods lasting for longer than they need to.

Oh, and the person researching post-truth business models? They came up to me a few hours later to share what they’d learnt. I shamefully admitted my bad joke, profusely apologised for their wasted time and praised them for testing their ideas sooner rather than later…

An example I use when talking about data and services

2017-07-11 / peterkwells / 0 Comments

In my job at the Open Data Institute I sometimes talk with people, from businesses and governments, about how better use of data can help them design and deliver better services. I’ve been using a public sector example recently that I’ve not written down. Here it is.

Ways to get bus timetable data to people who need it

The example I use is bus timetables. People need to know the times and routes of buses so they can make a journey and get to their destination. When I use the example I talk through four of the patterns that can be seen in many cities and towns around the world for services that get bus timetable data to people who need it.

1. Mass market private sector services: many cities and towns now have bus timetables available as open data. Private sector services like Google Maps, Apple Maps and CityMapper pick up this data and build it into a service which they aim at the mass market of smartphone users. The services work in many cities and might have other features such as information about restaurants and pubs. They get their open bus timetable data either directly or through a data aggregator, like TransportAPI or ITOWorld, who collate data from multiple cities / transport providers. That takes away some of the effort from using open data and makes it easier for more people to build services.

2. Targeted private/public sector services: smart cities and towns recognise that the mass market services don’t always meet all needs, particularly accessibility. If you look closely you can often find small bits of public services meeting the needs of some users, or a transport authority running a challenge to help focus the private sector market on meeting particular user needs. Left to its own devices the private sector might only target the profitable and easy-to-serve mass market. A challenge can help change that, whether to build more accessible services or to experiment with new technologies like AI or voice interfaces. Targeted services often use the same data aggregators as the mass market services. It’s the same data, just presented for a different set of user needs.

A bus stop outside Piccadilly Station in Manchester

3. LocalBusTimes: a local website and/or smartphone app where people can look up the timetables for a journey they want to make. It might be for a whole town or a single bus company. It probably started by only providing bus timetable data, but nowadays I think more of them recommend a route. The local authority or bus company typically runs the LocalBusTimes service themselves.

4. Physical services: not everyone has or uses a smartphone when they need bus timetable data. There are many reasons for this. To give just a few: there might be no coverage, they might not be able to afford a smartphone, they might have run out of credit/data, they might not want a smartphone, their city might not have made bus timetable data available or they might simply have run out of battery. That’s why bus stations have information desks, why bus stops have timetables printed and stuck to them and why people ask other people “when’s the next bus?” on the street. Someone has used the bus timetable data as part of the design for the bus stop or as part of designing an operational process to help a human answer another human’s questions.

Some of the reactions I get to my example

No one, yet…, has told me that my example is stupid or dull. Feel free to be first to do that.

When I talk through this example with people the usual reaction is that while lots of people knew about the transport sector and data few people had thought of all the patterns or wondered about how they could be applied to their work in another sector.

Most people had used the mass market services but very few people had thought of using the market, in this case through open data and challenges, to help them meet their own goals. Those who had thought of it worried that they risked losing control to the market. They hadn’t realised that they could still discover if user needs were being met, for example through user research, and could use a variety of ways to shape the market to target unmet needs. Challenges are just one of the ways to do that. Governments can legislate. Both businesses and governments can use procurement, strike deals, make different types of data more open (either fully open or in a more controlled way through APIs) or use lots of other forms of soft power to shape the market around them.

I also find that few people had thought of the physical services pattern as part of the overall service. I find that sad. It also shows that I’m in a bit of a bubble and exposed to only some views. The tech world is overly focussed on services that end in smartphones and websites. I expect/hope that’s a passing phase.

Why I’m writing this down now

I’m writing this down now because I’ve been using the example for a while. It’s good to publish it to get my thinking straight, to show some of the reactions I get and to learn from new reactions. As I often say, data is becoming infrastructure that will be as open as possible. Businesses and governments need to adapt to that future. They have different goals, and needs for democratic accountability, but can learn from and collaborate with each other. I’m expecting to do some more work on public sector service delivery models over the next few months. It’s good to share, even shoddy, thinking early. It’ll help make that work better.

Open your effing data

2017-07-02 / peterkwells / 0 Comments

Warning: this post contains content that will be offensive to some people.

The post is a version of a talk I gave at the ODIFridays series of lectures at the HQ of the Open Data Institute in London. The slides and a video of the talk are at the end of the post. Like most of my talks I adlibbed a bit. The post has links to most of the material I adlibbed from, others are at the end of the slides. It includes some thoughts on swearwords, Roger Mellie, democracy, censorship, Blackpool FC, artificial intelligence, context and an apology to my mum.

One of the UK’s regulators, Ofcom, commissioned research on offensive language last year. The research got lots of headlines. It was a nice opportunity for papers and websites to make cheap gags about swear words.

A report from the Metro on the publication of the report.

But it also gave me an opportunity to open up some swear word data and to use that example to talk with people and think about things like democracy, censorship, context and artificial intelligence. I made some cheap gags about swear words too.

Data needs context

Ofcom published the research in an openly licensed 126-page document and a 15-page quick reference guide.

from the report that Ipsos Mori did for Ofcom

The newspapers extracted the data from the PDF to write their stories. I extracted the data too. (btw some work that our friends at ODI Leeds and Adobe are doing might make my cut and pasting easier in the future…)

Unfortunately at first I missed the all important context for the data. I discovered the mistake by checking my data with the helpful team at Ofcom.

Take a look at the data or, if you want to use it in a project or service, there’s a CSV in GitHub.

After some discussion within the ODI and with Ofcom’s research team we ended up with this. The same data as the PDF but in a format that is both human and machine readable.

Now, a big part of our job at the Open Data Institute is “getting data to people who need it”. Normally I start with problems but this time I had started with data. My bad. Now to find out who needed it and how they would use it.

Some of the things people use this swear word data for

As I put the data out on Twitter there was a background mantra of “arse…balls….knob…bastard…” from around the office. One person then wrote a little script that people could use to get their computers to say the list of words. Soon I could hear both human and machine voices swearing away. The swearing mantra was charming, if a little unsettling, but I had my serious face on. Why do people swear?

Well a bit of research showed an academic saying:

The main purpose of swearing is to express emotions, especially anger and frustration.

Seems fair. I suspect that a lot of people get frustrated at not being able to get data they need to do something. That explained the background mantra from the Open Data Institute office, but what about other uses of the data?

Roger Mellie, copyright Viz. Note that the swear word data might allow people to block his language, but not his gestures.

The content of the report told us about some other users. It would help TV broadcasters and presenters understand how people would react to things that they said on air and so help the presenters decide what they could say.

For example the word “bollocks” was seen as somewhat vulgar if it referred to testicles but less problematic if it was being used to call something ‘nonsense’.

This might mean that people did or did not say words in certain contexts. It might lead to some content only being accessible if a PIN was entered to unlock it.

This data was created because of democracy

Democratic processes can need data to be created. Image Nick Youngson, CC-BY-SA-3.0 via http://thebluediamondgallery.com/d/democracy.html

But the biggest user of the report is Ofcom themselves. Ofcom commissioned the research because through our democratic processes we have decided that there are limits to free speech on TV & radio and made it Ofcom’s job to regulate those limits. They needed the data to help with this job so Ofcom commissioned Ipsos MORI to produce the data by performing user research through focus groups, interviews and follow-ups based on a long list of potentially offensive words and phrases.

We have given Ofcom the power to fine organisations and people that breach their codes. By publishing the report openly, they were helping broadcasters understand how they might use those powers and therefore discouraging breaches. This probably makes the system cheaper and more effective.

Broadcasters are likely to have their own guidance to help them meet the expectations of their target audiences. They could merge Ofcom’s list with their own list to help them meet both society’s needs and their own users’ needs.

Similar data is maintained in contexts outside of TV and radio

In Britain Mary Whitehouse was a famous campaigner from the 1960s to the 1980s against things that she found offensive. I can imagine Mary being keen on data-driven censorship. Image fair use via Wikipedia.

The data includes the word ginger saying it is ‘mild language, generally of little concern’, but the word ginger can also be used to describe a very tasty type of biscuit. A filter that used the swear word data to block offensive words might ban ginger nuts. That would be bad. This is a common problem with simple data-driven solutions. They ignore context.
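To make the ginger nuts problem concrete, here is a minimal sketch of a context-blind filter. The one-entry word list and the `flag_words` function are hypothetical and only for illustration; they are not the published CSV or any real filtering service.

```python
import re

# Hypothetical extract of the data: word -> rating text.
# Only "ginger" and its rating are quoted from the post; the dictionary
# structure is an assumption made for this example.
WORD_RATINGS = {
    "ginger": "mild language, generally of little concern",
}


def flag_words(text, ratings=WORD_RATINGS):
    """Return every listed word found in the text, ignoring context."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [token for token in tokens if token in ratings]


# The filter cannot tell an insult from a biscuit.
print(flag_words("Stop calling him ginger"))         # ['ginger']
print(flag_words("A packet of ginger nuts please"))  # ['ginger']
```

Both sentences are flagged in exactly the same way, because a plain list lookup has no idea what the surrounding words mean.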

I couldn’t find a list of offensive biscuit names but there are other sets that are similar to the swear word data used in contexts other than TV and radio.

The UK has a list of suppressed car registration plates

It is the job of part of the UK government, the DVLA, to maintain a list of combinations of letters and numbers that you cannot put on a car. Unfortunately, and curiously, the list is not published openly, but sometimes it is made available after freedom of information requests.

An extract from the suppressed car registration plate list via Whatdotheyknow

The list of suppressed car registration plates helps prevent confusion over typographically similar symbols, like 0 (zero) and O (oh). It blocks language that is likely to be considered offensive, for example “*B** UMS” and “*R**APE**”.

The list also explicitly contains the names of terrorist groups such as the UVF, UDA and UFF. Another terrorist organisation, the IRA, are already banned, like any other organisation beginning with I, because of the potential for confusion between 1 (one) and I (aye).
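The look-alike problem can be sketched in a few lines. The DVLA’s methodology is not published, so the character mapping, the function names and the example plates below are all assumptions for illustration, not a description of how the real list is built.

```python
# Hypothetical mapping of look-alike characters. The DVLA's actual rules
# are not published, so this is an assumption for illustration only.
LOOKALIKES = str.maketrans({"0": "O", "1": "I"})


def canonical(plate):
    """Upper-case, drop spaces and fold look-alike characters together."""
    return plate.upper().replace(" ", "").translate(LOOKALIKES)


def easily_confused(plate_a, plate_b):
    """True when two different plates look the same once folded."""
    return plate_a != plate_b and canonical(plate_a) == canonical(plate_b)


print(easily_confused("IRA 1", "1RA I"))  # True
print(easily_confused("ABC 2", "ABD 2"))  # False
```

Folding look-alikes into one canonical form is one simple way to spot plates that a person could misread; a real scheme would need to cover more characters and the actual plate formats.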

More controversially the acronym for the far-right British National Party, BNP, is also on the list. The BNP are allowed to stand in the UK’s democratic election process. How was that decision made? Unfortunately just as the list isn’t publicly available neither is the methodology.

Context affects what words are offensive

The UK’s democratic processes produce other lists of offensive words.

The speaker in the UK’s parliament can request that politicians withdraw words when debating with their opponents, so-called unparliamentary language. The way in which words are deemed to be unparliamentary or not is unclear. In 2015 the opposition leader Ed Miliband was allowed to call the then Prime Minister David Cameron “dodgy”, yet in 2016 an opposition backbencher Dennis Skinner was asked to leave a debate because he called David Cameron “dodgy Dave”. The word “dodgy” isn’t on Ofcom’s list, so it’s offensive to call an MP “dodgy” in a parliamentary debate but not to call them it on television.

The list of unparliamentary language is currently unpublished. To help UK politicians make better decisions about being unparliamentary or not I compiled some examples into a list. Parliaments in other countries, and other UK nations, have similar lists. They show the importance of geographic context.

The Australian parliamentary records show offense was taken against the term “suck-holing”, a word that in 1977 was decided to be offensive in the Australian parliament but that will be meaningless to most British people and has never been used in the British parliament. I wonder if a British MP would get away with using it.

The word “Oyston” is offensive to me and my community of fans of Blackpool football club. The offensiveness is not only because of this cringeworthy picture but because of how the Oyston family treats fans.

Another example of offensive language in a particular context is the word “Oyston”.

The Oyston family own the football club that I support, Blackpool FC. Because of their actions against fans, being called an Oyston fan on one of the websites used by Blackpool fans would be offensive. How would anyone outside of the community of Blackpool fans discover this?

There are related examples that may help us understand how we could do this.

Collaborative maintenance of data

Hatebase maintains a list of hate speech from around the world. The data is maintained by automated processes and manual interaction to cater for how hate speech changes over time and in different places. Hate speech can be used to encourage violence against people and communities. The collaborative maintenance process allows people to debate which words are hate speech or not.

“popular” types of hate speech from Hatebase.

An interesting experiment would be to see if the Hatebase dataset could have helped predict violent events through rises of hate speech in parliaments, newspapers and social media. Do get in touch with them if you have money to fund that research.

Other people could learn from the example of Hatebase. If British politicians wanted, and could get to grips with GitHub, then they could collaboratively maintain my initial list of unparliamentary language and create something that would help them understand the boundaries of offensiveness.

Offensiveness is affected by time, place and communities

By this point in my own research I was clear that the context of offensiveness is affected by time, place and communities.

When I checked I found that swearing philosophers were, of course, already aware of this. As often happens I was a technologist rediscovering ground that others had already covered. But technology can also affect how and which words become offensive.

People create new offensive words

Oyston is an example of a word that became offensive to a small group of people before becoming offensive to a larger group. Blackpool fans have effectively used social media and the press — oh, and talks & blogposts like this ;) — as part of a campaign to get the Oyston family out of our football club. An effect of this has been to spread the understanding of the offensiveness of the Oystons from the seaside to wider parts of the footballing community. A more famous example is the case of Rick Santorum who found his surname defined as an offensive word in a campaign led by Dan Savage.

This is a challenge to any list of swear words and a risk for people who use them. People create new offensive words for their own purposes. They game systems.

A t-shirt with the universally unique identifier for beef curtains.

Would people game the swear word data I created from Ofcom’s list? Yes, of course they would.

An example quickly came to mind. When I published the Ofcom offensive word list as open data then in line with good practice I gave every entry a universally unique identifier (UUID). UUIDs make it easier for machines to use the data.
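As a rough sketch of what giving every entry an identifier could look like, the snippet below pairs each word with a UUID and writes the result as CSV. The column names (`id`, `word`) and the choice of random version 4 UUIDs are assumptions for illustration; the post does not say which UUID version or schema the published file uses.

```python
import csv
import io
import uuid


def add_identifiers(words):
    """Pair each word with a freshly generated random (version 4) UUID."""
    return [{"id": str(uuid.uuid4()), "word": word} for word in words]


def to_csv(records):
    """Serialise the records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["id", "word"])
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Words taken from the office mantra quoted earlier in the post.
records = add_identifiers(["arse", "balls", "knob"])
print(to_csv(records))
```

In practice the identifiers would be generated once and then stored with the data, since the point of an identifier is that it stays the same every time the dataset is republished.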

If this data was to get widely used then how long would it be before people started to circumvent the system by being interviewed on telly wearing t-shirts with the UUID of a swear word? Perhaps over time the UUIDs, or parts of them, would become offensive? “That fella’s a right 81cb”, they’d say. Maybe the UUIDs would need to be added to the list as they became offensive?

People adapt and change. That is one of the best things about people and one of the biggest challenges we face when maintaining and using data. We need to build in mechanisms to change datasets over time as needs and uses change.

Swear words-as-a-service is hard

It is clear that swear word data was easy to build and also clear that it would be more difficult to maintain and make it useful in multiple contexts.

I knew that many companies were already maintaining similar lists as, like many other people, I had seen, laughed at and evaded filters on websites that had turned the British town of Scunthorpe into the apparently inoffensive “S***horpe” due to simplistic and bad data-driven algorithms. I do wonder how useful those filters and services are.
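The Scunthorpe effect is easy to reproduce. The sketch below uses a hypothetical one-word blocklist (a milder word than the one hiding in Scunthorpe) and two made-up functions: one stars out the letters wherever they appear, the other only matches whole words.

```python
import re

BLOCKLIST = ["arse"]  # hypothetical one-word list for illustration


def censor_substrings(text, blocklist=BLOCKLIST):
    """Naive filter: star out a listed word wherever its letters appear."""
    for word in blocklist:
        text = re.sub(re.escape(word), "*" * len(word), text, flags=re.IGNORECASE)
    return text


def censor_whole_words(text, blocklist=BLOCKLIST):
    """Slightly better: only match the listed word between word boundaries."""
    for word in blocklist:
        pattern = r"\b" + re.escape(word) + r"\b"
        text = re.sub(pattern, "*" * len(word), text, flags=re.IGNORECASE)
    return text


print(censor_substrings("The data was sparse"))   # The data was sp****
print(censor_whole_words("The data was sparse"))  # The data was sparse
```

Matching on word boundaries avoids this particular failure, but the filter still knows nothing about who is speaking, where, or why.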

Many of the website filters I had seen are simple and flawed because of their lack of context and their inability to adapt to people’s changing behaviour. Thinking ahead, I wondered if people would start to apply machine learning / artificial intelligence (ML/AI) to create services that could automatically learn new swear words. Perhaps this could be used on a massive scale to reduce the damage caused by offensive language on the web?

I knew that I wouldn’t be the first person to think of this idea. While 2016 had been the year when every problem could be fixed with a blockchain, 2017 is the year of ML/AI.

A quick search of patent libraries showed that in 2015 Google had registered a patent to classify offensive words using machine learning. Unfortunately it looks rubbish. The training mechanism worked on a large set of text samples, but it failed to recognise the context in which the text was being used. The resulting service might be slightly better than current filters but would still be data-driven rather than informed by data.

Maybe, like Hatebase, it would help if users were to train the machines that provided the service. After all Google, like most other large internet companies, use thousands of people, including you, to help train their services. I started to consider what I had learnt about offensive language and think of the tasks that Google would need to give to swear word raters to train their machine:

Task: go to a football ground in Gdansk, Poland. Play this video to people near you. Observe their attitude to you, and each other, over the following seven days and then categorise the offensiveness of the video. Repeat this exercise every 3 months.

Hmm… I quickly realised that this might be a Quixotic mission and that AI/ML might provide a better service but still only a partial one. There would be no perfect service. People decide what is offensive, not machines. If the service only considered some contexts then the people who controlled the machines and trained them on those contexts would be the ones who decided where it was useful. Swear word data isn’t like the location of bus stops or the list of transactions in a bank account. The context is even more important.

This is one of the challenges of the web and providing data and services for it. The web is pervasive. It interacts with the physical world in many places. It appears in multiple contexts. I use the web to watch broadcast news, like that regulated by Ofcom. I use it to keep up to date on politics, where the unparliamentary rules are useful. I talk about football, and the Oystons, on message boards. I keep up to date on current affairs, and feel helpless at the levels of hate speech deployed at people in the UK and abroad. I chat to friends, both publicly on sites like Twitter and Facebook and also privately in messaging applications.

Datasets and services that reduce offensive content on the web will need to cater for all of these different contexts, and more. Even if they do, some people will still work around them. Data and technology may be able to help the problem but it will only ever be part of a solution to something that is fundamentally a more human problem. Our need to express our emotions in language.

Sorry mum

It was clear from my investigations that we could usefully create data about swear words, i.e. words that are offensive. That the need for this data came from people who swear, people who didn’t want to swear and societies & communities trying to decide the boundaries between what was offensive or not. That it would be useful if the research and rules for deciding on what was offensive were open. And that if people could collaborate to decide on what was offensive that the data would be more useful because it would cater for more contexts. But it was also clear that while technology creates new possibilities to reduce offensiveness that people will still adapt to achieve the goal they want. So it goes.

The other thing that was clear from the talk was my own and my audience’s squeamishness with some of the words. In my case it was certainly because of one of my most important contexts: my upbringing and my family. I’d like to end this post the same way I ended the talk, by apologising to my mum. Sorry mum.

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — -

The questions from the audience showed the importance of context

At the end of the talk at the ODI the audience raised several points about offensive language that had not been covered in the talk, such as the use of racial and religious slurs. I was already covering a wide topic. Racial and religious offensiveness cover even more ground. I couldn’t cover everything.

Image from The Wanderers, based on a book by Richard Price. The film includes a fantastic scene in a 1960s New York school where people of different religions and ethnicity try, and fail, to remember all of the offensive names they have for each other.

I did find it interesting that the audience in the room hadn’t heard of some of the words in the list, particularly choc ice, blood claat and bum claat. In my experience (white, middle class, mostly Northern England and South London) these words are used against black people or in black communities, and in the case of the latter two more specifically within Jamaican communities.

That people hadn’t heard of these words says something about the context of the audience. A context where those words may not have been seen as offensive. Perhaps next time I talk on this topic I should try and sneak in some offensive language from different contexts to see what happens.

Watch the original talk or read the slides

If you want you can watch a recording of the talk (which includes some swear-a-long fun):

You can also see the presentation on SlideShare or Google Slides, whichever you prefer.


Will bike sharing benefit from learning some data lessons from other parts of transport?

2017-06-12 / peterkwells / 0 Comments

This morning’s news that bike sharing firm Mobike was launching in the UK caught my eye.

The story was full of excitement about the convenience and how cycling can help people become more active and improve air quality by reducing the number of car journeys. But the story also featured challenges: piles of bicycles on pavements and congested cycling lanes in cities not expecting an increase in traffic. Transport authorities seemed to be caught between the desire to seize the opportunities and head off the complaints.

But one thing that was missing from the story was how familiar the challenges are and how cities are already tackling them in other areas. At the Open Data Institute, where I work, we like to talk about design patterns for policies that use data to create impact. Some of the patterns needed to make bike sharing better are already in use elsewhere. Bike sharing companies and cities can learn some lessons from cars, buses and other cycling apps to tackle the challenges a bit faster and grab the opportunities a bit sooner.

What is bike sharing

Mobike is one of a number of firms offering bike sharing services. The service is simple. You download a smartphone app, request a bike, go to the location shown on the app, get the bike, cycle to where you want, leave the bike somewhere convenient and pay your fee.

The bike sharing operator will need to process orders and payments, maintain a fleet of bikes and predict demand so that they can move unused bikes to where they are likely to be needed.

The local transport authority has a different task. They need to maintain transport infrastructure to suit the different modes of transport (walking, cycling, cars, buses) that meet the needs of different groups of users (able-bodied people, people with disabilities, tourists, residents, business travellers) at different times of the day. It’s great that cities are welcoming trials of another option in this already complex system.

Some of bike sharing’s challenges can be helped by better use of data

Some of the challenges posed by bike sharing are already being helped by data. Better use of data can tackle them more easily.

Neither cyclists nor bike sharing companies want congested cycling lanes. It will make it hard for people to get where they want and risks increasing accidents. That will reduce the number of people who cycle, and reduce the profits that bike sharing companies might make. Giving transport authorities access to data about where people cycle and where accidents occur will help them meet demand and create safer roads. Giving cyclists data about congested cycling routes will help them make better decisions about where to cycle and when.

The bike sharing companies don’t want piles of unused bikes on pavements. They make money when the bikes are used. Bike sharing companies won’t have data on how congested a pavement is because of other traffic: for example bicycles belonging to a competing bike sharing company or because of pedestrians trying to get to lunch. But that congestion can damage their reputation. Giving bike sharing companies access to this data will help them make better decisions about when to move bikes. Giving transport authorities access to this data will help them understand the impact of bike sharing on other types of transport.

Data isn’t a magic bullet. You can give better information to cyclists, bike sharing companies and transport authorities but there is no guarantee that they will use it or that they can even use it quickly. But it can help. We’ve seen it already. The transport sector still has lots to do to improve data but it is a sector which is ahead of most.

Learning the lessons

Many transport authorities already publish open data about congestion, accidents and road closures. Google, Uber and Strava are starting to publish aggregated open data about usage of their platforms for car and bicycle transport through the Mobility, Movement and Metro programmes. When this data is openly available, everyone can improve the service that is provided to car drivers, taxi passengers and cyclists. Openness is essential. It means that cyclists and taxi drivers can use a whole range of services to decide on a route while transport authorities can easily combine the data to give advice or decide where to build new capacity.

Pedestrians can report congested pavements using services like MySociety’s FixMyStreet. The reports are published openly so could be used by the bike sharing companies and transport authorities. If the bike sharing companies all publish aggregated data about where their bikes are left then the decision making can be further improved.
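As a rough illustration of what "aggregated data about where their bikes are left" could mean in practice, here is a minimal Python sketch. It is not how any bike sharing company actually publishes data: the function name, the grid cell size and the minimum count are all assumptions made for this example. The idea is simply to count parked bikes per grid cell and to drop cells with very few bikes, so that individual journeys are harder to pick out.

```python
from collections import Counter
from math import floor


def aggregate_parked_bikes(locations, cell_size=0.01, min_count=5):
    """Count parked bikes per grid cell and drop sparsely populated cells.

    locations: iterable of (latitude, longitude) pairs in decimal degrees.
    cell_size: width of a grid cell in degrees (an assumed value).
    min_count: cells with fewer bikes than this are left out (an assumed value).
    """
    counts = Counter()
    for lat, lon in locations:
        # Snap each coordinate to the index of the grid cell that contains it.
        cell = (floor(lat / cell_size), floor(lon / cell_size))
        counts[cell] += 1
    # Leave out cells with few bikes so single journeys are harder to identify.
    return {cell: n for cell, n in counts.items() if n >= min_count}
```

A transport authority or a competing service could combine counts like these with pavement congestion reports without ever seeing where an individual bike, or rider, ended up.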

The challenge of bad data business models

Ah, I hear some readers say, but surely there’s a problem? If the bike sharing companies openly publish data about where their cycles are and the routes that people take then won’t that mean that other companies will use that data to compete with them?

Well yes, obviously. But good competitors will know already. It is fairly cheap to get a few people, or a camera or another form of sensor, hanging around major destinations to take pictures of bicycles that can be counted by machines.

Ah, I hear other readers say, but surely the bike sharing companies will be planning to sell the data as part of the data monetisation strategy that everyone is recommending nowadays?

Well yes, they may think they can sell it. But data monetisation is not a very clever strategy for these companies. That isn’t only because their users might prefer the data to be used to benefit their community but also because their users are carrying the smartphones that they used to get the bike. Google, Apple, and the telecoms operators have similar trip data. It has negligible value.

In a world where data is abundant, data monetisation will work when you can add value to data. For example, it will work for data aggregators, like TransportAPI and ITOWorld, but only occasionally for data publishers. Instead bike sharing companies should open up the data to improve the service.

Data is not oil but it is infrastructure

In the 21st century when it is so cheap to get and use data, business models based on the scarcity of data are generally going to fail. That is one of the many reasons why “data is oil” is an utterly utterly terrible analogy. The smart bike sharing companies will open up aggregate data about usage and the locations of bicycles. They will compete on the quality of services. That competition and focus on services can benefit their users and the wider transport network.

But what happens if bike sharing companies aren’t smart? If they choose to impair the service they give to their users because of a lack of understanding of data and bad business models?

Well, that’s the final data lesson for this post. Data is a new form of infrastructure. Governments are realising that this infrastructure needs to be as open as possible, while respecting privacy, so that businesses can be built and services improved.

The UK government and local authorities tried, and failed, to persuade bus companies to open up data so are now legislating to force it to happen. That legislation will improve services for bus passengers by making it easier for services like Google Maps, Apple Maps and CityMapper to help people decide on their journey.

If the bike sharing companies don’t decide to be smart then I suspect the genuinely “smart cities” will make the decision for them. Bike sharing companies will be welcomed, but only the companies that decide to provide better services by opening up their data. Smart companies will learn the lessons and get ahead of that particular game.

Hacker Noon is how hackers start their afternoons. We’re a part of the @AMI family. We are now accepting submissions and happy to discuss advertising & sponsorship opportunities.

If you enjoyed this story, we recommend reading our latest tech stories and trending tech stories. Until next time, don’t take the realities of the world for granted!

Most Blackpool fans will boycott Wembley, you should know why

2017-05-21 / peterkwells / 0 Comments

Next week Blackpool and Exeter will play a game of football at Wembley to decide which team gets promoted to the third division. Most Blackpool fans, including myself, will boycott the game. They will boycott because of the actions of the owners, the Oyston family, who have threatened and taken legal action against many of the club’s fans.

Football is a sport that entertains billions of people around the world. It helps bring people and communities together. Blackpool FC doesn’t. All of my family boycott the club. It is tainted by the Oystons and their actions.

A big game like this would normally be an opportunity for families and the town to unite, whether in victory or defeat. Instead this game will leave the town confused and frustrated, thinking of what could have been. Blackpool’s owners don’t get the damage that they have done to football, the town and the fans.

If you are thinking of going to Wembley then, unless you are an Exeter fan, please don’t. If you happen to be watching or reporting on the game then it’s important that you understand the reasons for the boycott and that you tell others about it.

That way you can help support all of the Blackpool fans who are trying to heal the damage of the last few years and create a club that all of the fans can support.

This isn’t easy, it hurts

It’s hard not to go and watch your team.

I remember Brett Ormerod’s goal at the Millennium Stadium when we got promoted in 2001. I watched it with two very quiet friends who were in town to support Preston North End in their playoff final. They were even quieter when Bolton took Preston apart in their game two days later.

In 2007 I was at Wembley with my future wife and a group of friends to watch Keigan Parker’s stunning goal help Blackpool beat Yeovil and get promoted to the Championship. We met up with a colleague from Yeovil afterwards to share memories and talk about the next season.

In 2010, when Brett Ormerod scored the winning goal to take Blackpool up to the Premiership, seven of my family were in attendance along with over 36,000 other Blackpool fans.

This game will create no such memories or reunions for me or thousands of Blackpool fans. I boycott. That hurts.

I boycott because of what the Oystons have done

The Oystons have wasted the opportunity provided by a £90m windfall from Blackpool’s recent season in football’s top division. Much of that windfall has been loaned from the club to other companies. The terrible waste of that money is damaging the club but that is not the main reason I boycott.

One 67-year-old Blackpool fan had to pay £20,000 for a private Facebook post seen by 34 (yes, you read that right. Thirty. Four) friends. Fans raised the money to pay the fee.

My boycott is because the Oyston family have abused fans, taunted them and taken legal action against them. An unknown number of legal actions are ongoing. These legal actions carry a large human cost.

The Blackpool Supporters Trust have reported on the human cost saying:

Some individuals have lost their jobs, businesses are in jeopardy, relationships with partners have broken down and health has suffered.

That damage cannot be undone. The Oystons have to go before many fans go back.

Thousands of Blackpool fans are trying to make things better

The fans are working to get the Oystons out of the club and turn Blackpool FC into something that we can be proud of. A club that puts football first and that all fans can support.

While the Oystons remain there is an ethical boycott in place. We call it NAPM: Not A Penny More. The boycott works. The official attendance figures have dropped dramatically and overstate the actual attendance. In some games last year the actual attendance was three thousand lower than the official attendance. The Oystons listen to money. The drop in income will hurt them.

Teams make their way out pic.twitter.com/PxUxR6h7P6

— Matt Scrafton (@matt_scrafton) May 14, 2017

The missing fans are still there and still passionate. Six thousand people joined the most recent protest march with Blackpool fans joined by other football fans from around the country. The Blackpool Supporters Trust have offered to buy the club but the owners have refused to enter negotiations.

While this happens the local council and its leader have stayed curiously silent and the footballing authorities have sat on their hands, rather than trying to save the club and help the fans. There is an ongoing court battle over ownership of the club but many fans’ only real leverage is to choose to boycott. Our boycotts and protests can help motivate the Oystons to leave and others to act.

You can help Blackpool football club

Some fans will recreate part of the Wembley experience by watching on a giant screen that they have hired. Others will join friends down the pub or stay at home. A few will simply ignore the game altogether: the Oystons’ actions have led to them falling out of love with the club and the game.

I know that the short-term pain of missing games is morally right: I cannot give money to a club that sues its fans. I also know that it will help get the Oystons out of the club. The declining revenues, empty seats and protests at Blackpool tell a tale. The tale of a football club whose owners are not wanted and not welcome, who are damaging the game and the town. Eventually they will run out of money or the authorities will intervene. The football league are starting to realise that their rules need to change so that they can help address the problems at Blackpool and elsewhere.

In the meantime the best way to help Blackpool football club is to encourage people to boycott. I hope this post helps persuade some people who might be wavering and helps both journalists and opposition fans who haven’t heard of our protests understand why Blackpool fans boycott and why that matters.

We boycott because of the actions of the Oystons. We boycott to help save the club.

Roman roads and data infrastructure

2017-04-10 / peterkwells / 0 Comments

I occasionally walk around, wave my arms and proclaim:

data is infrastructure, just like roads

I alternately blame and praise the brilliant Jeni Tennison for this strange affliction. I praise Jeni for coming up with the wonderful analogy of roads for data, I blame her for infecting me with the bug of excitedly talking about it to anybody and everybody so that I can learn from what they think.

A clip from one of Jeni’s talks on data infrastructure.

I recently proclaimed that data was like roads to a friend who has a degree in classics and spent a career teaching in primary schools. She is very well-read.

My friend asked me if I thought that as a society we were well advanced in building our data infrastructure.

No, I said, it’s only been a few decades since the invention of the internet / web which led to the current massive growth in data, I suspect it will take a decade or two before we learn how to do data things well.

I think you’re right, she replied, after all the data infrastructure that you describe sounds a lot like the Roman roads and it took us a couple of millennia to start getting roads right.

Really? I said. Roman roads? That sounds interesting….

Roman roads were for the economy as well as the military

A Roman army in an Asterix comic. They will have tried, and failed, to conquer. Copyright René Goscinny and Albert Uderzo

Our usual vision of a Roman road is either a muddy field being dug up by a team of archaeologists or an army of Roman soldiers marching to try and conquer a new land. But Roman roads were used by other people too. They were an important component of the Roman economy.

People transported goods along them for trade and materials for building new houses. Books have been written about the impact of roads on Roman Egypt and Italy: the Romans had sophisticated pricing models, integrated their roads with other modes of transport and evolved governance arrangements to manage the development of their roads.

But Roman roads were not only for armies and traders. They were also used to transport messages, taxes and people. Along the cursus publicus, or public way, there were mansios, or waystations.

Data is not really roads

Before I go further into this tale I should be clear that I don’t really think data is exactly like roads. It’s an analogy. All analogies are imperfect. But I do think data is becoming a new, strange and vital form of infrastructure for a 21st century society. It’s very important that we debate and learn how to get the best out of it.

A first edition of the first UK Highway Code by Mikey Ashworth, CC-BY-2.0

The analogy of roads helps break people out of the usual mindset when thinking about data. The frequent comparison with oil is particularly misplaced.

The analogy of roads is much more relevant. It brings out the importance of maintenance; the need for big, open roads between large towns and the value of smaller roads for villages; the dangers of toll roads and expensive or complicated licensing; and rulebooks for how to use the roads.

It’s a pretty decent analogy, as analogies go, but my friend had started talking about Roman roads.

Roman roads helped co-opt other economies

Mansios were set up along the roads. They were maintained by the Roman government and used by officials and armies. Officials from the government and their animals could sleep, get washed and get fed. Many other people could use the mansios too but they would have to pay for the privilege.

The money people paid would go to the upkeep of the mansios and to the running of the cursus publicus. The cursus publicus was a transportation system, both for people and for messages. Officials and their information would travel for free. Everyone else would have to pay. It was a massive toll road network set up across a range of nations with preferential access for one group of people.

Other people would pay because the Roman roads were so much better than the roads they could build themselves. There was no real competition: if you wanted to go from A to B you had to go Roman. As a result many of the mansios gradually grew into towns.

A Roman coin showing Marcus Aurelius. Copyright: CC-BY-SA 3.0 by Rasiel at English Wikipedia

The impact wasn’t just to preferentially improve the economy of one group of people, the Romans, and their towns but also to help impose Roman culture and standards by making people use their language and their currency. It is a myth that the width of our railways comes from Roman roads — that was due to a different bit of infrastructure, the railways that were invented in the North of England — but many European town names and locations still reflect their Roman origins.

After telling me the tale of Roman roads my friend turned to me and said: isn’t that what you just described? Aren’t Google, Microsoft, Amazon and those big government agencies a modern cursus publicus?

Oh, I said, yes they are.

What have the Romans ever done for us

As I noted earlier “data is roads” is just an analogy and IANARH (I am not a Roman historian) but the similarity of the Roman system to our current data infrastructure was both striking and reassuring.

The Roman road system was striking in its similarities, even down to people bemoaning what the road builders have done while using their roads, recognising that what they’ve done is actually very good and realising that in many cases it couldn’t have happened without them.

It was also reassuring. History is full of repeated patterns and perhaps the current stage of evolution of our data infrastructure is a necessary stage in a pattern that repeats when new infrastructure emerges.

We learnt that roads needed to be run as a system

Roman roads might have started off as a form of military and economic conquest but we gradually learnt more about the need for roads to be run as a system for the good of everyone in society. This took a while, as did our understanding of government’s role in making that happen. The case for this involvement evolved as we understood the decisions that needed to be made.

A thousand years after the fall of the Roman empire the UK decided that governments should take a stronger role in roads with its first Highways Act. 300 years later the Rebecca Riots against toll roads contributed to the gradual removal of charges and the transfer of responsibility to central and local government for maintaining most roads. Private roads, for example the path to your house or the bit of road to a local factory, were not transferred but governments make sure that we have a duty of care to visitors and workers.

The Rebecca Riots, courtesy Wikipedia and the Illustrated London News

The UK still builds some toll roads but, generally, they are on a lease. For example the M6 toll road near Birmingham will be a toll road for 53 years until the initial investment is paid back. Meanwhile countries worked together on the Vienna Convention on Road Signs and Signals, agreed in 1968 and in force since 1978, to standardise rules of the road. These are common standards that help with safety and make it easier for people in one country to drive to a location in another, whether it’s for pleasure or business. And at this point we come full circle back to my road and data analogies, which tells me that it’s time to stop…

But one final thought. Many of the major roads in European countries are still based on the old Roman ones. I wonder if in 2000 years our data infrastructure will still show signs of its 21st century origins and the decisions of the people who are building it now?

Cat data is complex, and that’s ok

2017-03-12 / peterkwells / 0 Comments

Last year I openly published data about some of the cats that work for the UK government. I ended up giving a talk about it. When publishing the data and giving the talk I skipped over the potential data protection and privacy issues.

Some of those potential issues came up again recently when our family cat, Bugsy, was being transferred to our new home. I was nervous about the cat arriving safe and on time. A friend asked:

can’t you publish some data showing the cat on his journey?

Such a short and simple question. This is my long and complex answer. Most of my friends are patient people.

This post might sound like it is going to be whimsical (ok, there will be some cat whimsy…) but there is a serious point. Publishing and thinking about cat data helped me think and talk about other data things with more people.

Thinking and talking about data protection, ownership and control for cat data will have the same effect. It is pretty important that more people know how complex they are.

This cat data deserves data protection

Different countries have their own data protection and privacy laws. Personal data can be hard to define but at the Open Data Institute we encourage people to look at relevant legislation and start by simply saying:

Data from which a person can be identified is personal data.

If data can be combined with other information to identify a person, that data will still be personal data.

If there is personal data in a dataset then we should consider relevant data protection legislation and the universal human right of privacy.

At this point I expect that lots of people reading this post will be thinking that a cat is not a person so neither the personal data definition nor human rights apply.

This is true but, like other animals, cats do have rights. Some people argue that pets are becoming people, in a legal sense, and that animals deserve democratic representation. Perhaps cats do not have data protection rights today but if that might change in the future then perhaps I need to worry about it today.

A cat called Paddington chasing its own tail. Picture by Bill Abbot, CC-BY-SA.

Whilst this would be a fascinating topic to explore, unfortunately, to paraphrase a recent article by Luciano Floridi on the rights of robots and artificial intelligence, I’m in danger of chasing my own tail when I should be focussing on the current opportunities and challenges with data that affect people. People like me. Our cat wasn’t moving home in a few years’ time, he was moving now; and I was nervous.

There is a simple reason why I need to think about data protection if I was to publish this cat data. Whether cats realise it or not, their data can refer to people. My cat lives in the same house as me. If you knew the destination of its journey then you would know where I live. If you knew the date when it was being transferred to a new home then you might be able to guess that my old or new home is empty. Etcetera.
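To make that concrete, here is a small Python sketch of how a single record about a cat’s journey could be combined with other information to say something about people. Everything in it is made up for illustration: the register of pet owners, the field names, the address and the date are all hypothetical, and no real service is being described.

```python
def people_revealed_by(journey, pet_register):
    """Return what one pet journey record reveals about the pet's owners.

    journey: dict with hypothetical keys "pet_name", "destination" and "date".
    pet_register: hypothetical mapping from a pet's name to a list of owners.
    """
    owners = pet_register.get(journey["pet_name"], [])
    # Each owner can now be linked to an address and a moving date.
    return [
        {"person": owner, "home_address": journey["destination"], "moved_on": journey["date"]}
        for owner in owners
    ]


# Entirely made-up example data.
journey = {"pet_name": "Bugsy", "destination": "1 Example Street", "date": "2017-03-01"}
pet_register = {"Bugsy": ["Owner A", "Owner B"]}
revealed = people_revealed_by(journey, pet_register)
```

The journey record never mentions a person, but once it is joined with the register each owner is tied to a home address and a date on which one of their homes was probably empty.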

So if I was to publish data about Bugsy’s journey I would need to think about the impact on privacy using a methodology like the one provided by the UK’s Information Commissioner’s Office (ICO) before I published the data.

Ownership of cat data is complex

I occasionally hear people saying that defining a legal right to personal data ownership will make this process easy. My privacy, my data, my choice. I doubt my cat cares about human laws but, according to the law, I own him. So I might legally own data about my cat and would have the legal right to choose to publish it. Unfortunately data ownership is not that simple and nor is cat data.

How is my cat’s identity defined? Some cats have microchips, and Edinburgh University have even given a library card to a cat so it can prove its identity and demonstrate its entitlement to borrow books, but our cat just has a phone number on its collar. Is that sufficient?

Defining legal ownership of cats in data seems simple.

Meanwhile Bugsy is a family cat. He is owned by me and my wife. It might look like that joint ownership can easily be defined in data, but the world is more complex than my simple model. How are my identity and that of my wife defined? How would we verify our identities to say that we are allowed to track our cat on his journey? Identity management is hard.

And once we get past those issues I might find that my wife disagrees on how the cat’s data can be used. We both own and live at the same house that the cat is being transferred to. The data refers to both of us. My wife might think my nervousness is utterly ridiculous and not worth risking our privacy for. There have been several legal disputes over the ownership of pets. I don’t think it would calm my cat moving nerves if I was to take my wife to court over ownership of cat data.

Meanwhile we’re still missing something quite important. The cat isn’t travelling alone on his journey. He is being transported by an employee of a company. What about that company’s potential rights to own the data produced by their service? What about the cat transporter’s privacy?

Controlling cat data

At this point, when answering that simple question from a friend about publishing data about Bugsy’s journey to make me feel less nervous, I started to talk more about consent.

Data protection isn’t just for the online world. We also need to think about the offline world and the billions of people who don’t use computers.

Giving people choice and ongoing control over how you use their data is becoming more important. It’s one of Tim Berners-Lee’s three challenges for the web. Some trading blocs, like the EU, and individual nations, like the UK, have decided that it is necessary to put in place new legislation that strengthens people’s rights over data. Consent is not always necessary but the ICO recently published some draft guidance on consent under that new legislation which I could use to help publish cat data.

My wife knows quite a bit about data so could give informed consent which I could record. I could also ask the cat transporter and their employer if they were willing to consent. To be clear I would want to give the cat transporter the choice of saying no. A world where people who transport cats have less privacy than other people does not sound a sensible world.

Unfortunately given the impending journey I did not have time to think about or research the cat transporter’s needs and skills. The ICO’s guidance says that I can assume that “adults have the capacity to consent unless you have reason to believe the contrary”, and I knew how to be open about how I planned to use the data, but without more research I would not know how to design something so that the cat transporter could choose whether to consent, or not. I might mistakenly assume that an online only service was good enough, despite a large proportion of the UK population having no access to the internet or insufficient skills to use it. The cat transporter could be one of those people.

And all I would have achieved by this point was possibly gaining consent. I would not have given the cat transporter control over the data about their journey. With that control they could reuse the data for another purpose, such as reclaiming their petrol costs or seeing what cat data tells us about people moving house around the country. My wife, the cat transporter, their employer and I all had rights to the cat data and should all be able to have some control over its use.

Sometimes you need to keep things simple

At this point my wife and friend both firmly interrupted me and told me I was not being utterly ridiculous but being completely and utterly ridiculous. I was trying to design a perfect solution that would work for many cats and purposes, rather than keeping things simple and starting with a solution for a particular problem. My nervousness about our cat.

My wife rang the cat transportation company and asked them to text us a couple of times during the journey. They agreed, of course. Sensible wife.

Data is complex, and that’s ok

Now you might read all of this and ask:

if we have to think through all of this complexity every time we’re thinking of publishing data, how will we ever build anything?

I don't think the cat is happy I've come home. pic.twitter.com/w11ZwGPv0i

— Peter Wells (@peterkwells) February 24, 2017

The team at the Open Data Institute, where I work, do the hard work to try and make data as simple and easy as possible so more organisations can get data to people who need it.

That requires us to work on lots of things including how to publish data; how people will search for it; the skills they need; how to use it in organisations, large and small, or whole sectors; and how to get data to benefit everyone. Lots of other people do similar things.

But sometimes I wonder if we and other people can make it sound too easy.

So when we’re encouraging more people to do wonderful things with data, as well as the brilliant possibilities we also talk about the challenges, using both real examples and whimsical ones like the ones I faced with my cat data. Whimsical tales sometimes help convey simple messages.

We can build a better future with data but we need to solve problems and be realistic about the complexity if we are to build one that works for people. Data is complex, and that’s ok.

Write to your MP about reforming governance of the Football Association

2017-02-05 / peterkwells / 0 Comments

On 9 February 2017 UK politicians are debating governance of the FA (Football Association), the governing body for football in England, Jersey, Isle of Man and Guernsey. The debate follows the FA’s failure to implement the UK government’s best practice for sports governance. English football currently has big problems and this is a chance to make a difference.

You can send a form letter to your MP about the debate using the VoteFootball site but it is more likely to be effective if you send a personal letter using the WriteToThem site produced by the lovely people at MySociety. It is likely to only take 5 or 10 minutes. Write about what you know and feel. Be concise. Give links to more detail and evidence. Be polite. Ask for a specific action.

If a relative or friend can’t use the WriteToThem site then they can call their MP. You can help your relative/friend by looking up their MP’s contact information on the Parliament site and passing it on.

Below is what I sent to my new MP.

— — — — — –

Dear XXXX

On 9 February the House of Commons will be debating the following motion:

That this House has no confidence in the ability of the Football Association (FA) to comply fully with its duties as a governing body, as the current governance structures of the FA make it impossible for the organisation to reform itself; and calls on the Government to bring forward legislative proposals to reform the governance of the FA.

The motion has been brought by a group of Labour and Conservative MPs (Andrew Bingham, Christian Matheson and Damian Collins).

Can I ask you to attend the debate and support the motion?

There have been many governance failures of the FA, and other English governing bodies. I am particularly concerned about the lack of representation for fans and the lack of action against the owners of football clubs who act against the interests of the game, the fans and the communities in which the clubs are rooted.

There are numerous current examples of fans protesting against and boycotting their clubs because of the actions of their owners. For example, Charlton Athletic, Coventry FC, Blackburn Rovers, Leeds United and my own Blackpool FC.

In the case of Blackpool FC despite a £90m “windfall” of Premiership money the club has fallen 3 football divisions in 5 years and could not even put out a full squad at the start of the 2014/15 season. Much of the money has been loaned to Segesta, a company owned by the Oyston family who have a controlling interest in the football club. (1)

The club has taken legal action against fans and abused them. The trust reports that legal action has meant that:

some of the people caught up in this situation ha[ve] been seriously impacted — two cases of cancer, a stroke victim, depression, loss of a baby and an attempted suicide all in the last twelve months. (2)

The fans’ response has been a long-standing and effective boycott (3) coupled with the growth of the democratically run Blackpool Supporters Trust (4). The boycott is primarily due to how the club treats its fans, not its performance on the pitch.

Neither the FA, the local footballing authorities nor even the local council (5) has taken action.

Reformed governance of the FA which provides transparency, accountability and gives power to fans will help alleviate the situation at Blackpool, and other clubs, and can reduce the chance of similar cases happening again.

Please support this motion to help make that happen.

Yours sincerely,
Peter Wells

(address)

(1) http://www.dailymail.co.uk/sport/football/article-3030302/How-Blackpool-laughing-stock-sorry-story-Oyston-mess.html

(2) http://blackpoolsupporterstrust.com/Site/LatestNews.aspx?NewId=46

(3) http://blackpoolsupporterstrust.com

(4) https://medium.com/@peterkwells/football-attendance-figures-are-inaccurate-and-don-t-tell-the-whole-story-b4e3f4859648#.xotfjo5wc

(5) https://medium.com/@peterkwells/the-curious-silence-of-blackpool-council-and-its-leader-c1b9be675fde#.34wdwv6i4

Make data great again

2016-12-29 / peterkwells / 0 Comments

Data is becoming increasingly important to our societies. We live in an age of data abundance and, without many of us realising, data has become a new type of infrastructure and a critical one at that. The age of data abundance has led to brilliant new services and can help our societies tackle challenges such as climate change and population growth, but it also creates risks to privacy and concentrations of power.

Societies need to be able to debate what this age of data abundance means for them. People need to make decisions about the relationship between individuals, communities, societies and data. We need to pick a future vision for our relationship with data and then take steps towards it. Many governments and societies are having this debate now.

In my job I put forward the Open Data Institute’s position on those decisions while also trying to encourage a more public debate. I want a debate because I, and the lovely people I work with, want the decision to be made by societies around the world.

To make this debate as broad and informed as possible, I need what I say to be understandable by as many people as possible. I try to use plain language and frequently test new language and concepts to see if they are understandable. Sometimes I test things through tweets or blogs, like this one; at other times by talking with people from differing backgrounds and perspectives.

By testing, listening and learning I have made some of the language more accessible but I’ve also realised that something was more important than I first thought: politics. Both my politics and that of others.

Let me try and explain.

Choices about data

Sometimes people say they want to help people make better choices about data. I did that a few times in this blog about an open future for data.

I was talking about the ideas in that blog with a left-wing British politician who stopped me mid-sentence and asked if I was a Blairite nowadays. No, I replied. “Then why are you using the language of Blair’s choice agenda?”, they asked.

Image copyright the BBC. Taken from a blog stating that the comedy show Yes (Prime) Minister was the most cunning political propaganda ever conceived.

Further testing of the language caused another person to recoil and suggest that if I kept talking about choices I might be accused of being a secret Thatcherite pushing the theory of public choice. Hmm….

I’d used the word ‘choice’ because I thought it was plain language but it was clear that the decision risked putting in place a political barrier for some of the other ideas in the blog. This is a problem.

Data is political

When thinking about and debating technology and data with other technologists it can be easy to fall into a trap of thinking that every decision can be based on empirical evidence, that there is a single right answer and that we can make that right answer a reality by designing and building the right technology. This is nonsense.

In our debates about data we need to decide issues of access, ownership, regulation and the relationship between citizens and the state. These are political decisions.

Whilst we might have individual opinions about data, we need a state and legal system to help put decisions into practice. States will allow technologists to innovate and try things out but there comes a time when existing legislation will be more strongly applied or new legislation will be put in place as society’s needs change. This happened, and continues to happen, with road traffic; it will happen with data.

By broadening the debate we are helping that decision to be made democratically. Democracy might have seemed under strain in some countries in 2016 but as Churchill said:

Indeed it has been said that democracy is the worst form of Government except for all those other forms that have been tried from time to time

To put it more simply, politics and democracy are important and data, as with most things, is political.

Words already carry political meaning

The “white heat of technology” makes me think of Harold Wilson and the 1960s UK Labour party. Because of my political history I have positive feelings about the phrase despite the speech being followed by the scrapping of several high-profile technology projects. Image copyright PA.

Words are a tool political people use to reach our hearts. Sometimes those words are a catchy slogan. At other times it’s a frame: a guiding metaphor or image for a political argument.

Political slogans and language are designed to appeal to a group of people, build on existing beliefs and make them choose a particular path.

Some words carry a particular meaning in the present because they have been used in a political context in the past. Marx said it more poetically:

The tradition of all dead generations weighs like a nightmare on the brains of the living.

The word “choice” resonated amongst some people involved in British politics that I spoke to because of those traditions and their political history. It will have brought back nightmares for some and heavenly dreams for others.

Data is not about left or right wing politics

In economic terms each of these cakes is rivalrous: only one person can eat them. Cake is not like data: multiple people can use data at the same time. Picture of cake by Hani AlYousif, CC BY-NC-ND 2.0

Our societies and political systems are used to making political decisions about many types of resources, for example oil or water, but data has different qualities to the physical resources that are embedded in our political systems, debates and legislation.

To give two regularly used examples: data is non-rivalrous (unlike a piece of cake, many people can use data at the same time) and data benefits from network effects (it becomes more valuable as more people use and maintain it).

These differences are one of the reasons the team at the Open Data Institute talk about data as analogous to roads:

Data is infrastructure. Just like roads. Roads help us navigate to a location. Data helps us make a decision.

The “data is roads” analogy breaks people out of the traditional mindset. It helps open their minds to thinking differently.

I think that, as with the web, these different qualities mean that a closed-open axis is a more useful way of thinking than the traditional left and right-wing political axis.

But it will be harder to get people to think about the decisions along that closed-open axis if our words and ideas cause them to think of old left and right wing political battles.

Take back control of data

Data differs from other resources in many other ways. One difference that is becoming increasingly evident and important is that data is sometimes about identifiable people, sometimes it isn’t and sometimes it’s a bit complicated.

Much of the current debate about data is dominated by personal data: the stuff which is about identifiable people. Many people believe that there is an asymmetry of power and privacy as data about us is controlled by governments and corporations.

Tav Kotka, the Estonian Chief Information Officer, at MyData 2016 in Helsinki. Watch the full video.

Tav Kotka, the Chief Information Officer of Estonia, recently gave a talk in which he broached the idea of adding a fifth freedom to the EU’s existing four freedoms for free movement of goods, workers, services and capital. The talk was mostly about personal data and the concept of personal data stores that could allow individuals to control how data about them is used.

Whilst I agree that more personal control over personal data is important, the talk brought up memories of Margaret Thatcher and my teenage political nightmares. The talk did not mention society’s need to access and use that data. Taking back control of data by giving control to individuals misses out the challenges of digital inclusion and the role of other important parts of society like families, communities and nations. Different levels of control, rights and responsibilities are likely to need to be given to these different groups. To give just one example, vital medical research and national statistics need to use large amounts of personal data. This can’t be neglected or left solely to the decisions of individuals.

But, as I realised, this time I was the one allowing my political history to do the interpretation for me and I was the one who wasn’t listening to the underlying argument. Tav Kotka was using language that built on his political history while talking in English to a Finnish audience. Even though I work for a global organisation my initial reaction was from a UK perspective. My bad.

The political debate about data is happening now

The EU is currently discussing complex concepts such as data control and data ownership through the free flow of data initiative. Major geopolitical organisations, like the EU, can have a large impact on countries outside their membership: the UK government, for example, has committed to following current EU data protection regulation after it exits the EU. That EU debate involves politicians from multiple countries, each with their own rich histories and perspectives. There are many other debates in countries around the world.

If you want to help build a great future for data then as well as building new services you may want to get involved in either this or other multinational, national and local debates.

But if you do, remember to think about politics: both other people’s politics and your own. That way you will be best placed to help people think about the decisions not in terms of traditional left and right-wing politics but instead in terms more suited to the different challenges and possibilities of data.

Words from leasehold and commonhold reform APPG

2016-12-15 / peterkwells / 0 Comments

Approximate words spoken at the meeting of the UK Parliament’s All Party Parliamentary Group (APPG) on residential leasehold and commonhold. The meeting was chaired by Jim Fitzpatrick MP and Sir Peter Bottomley MP. There were 60–70 people in the room: MPs, Peers, conveyancing firms, big homebuilding companies and people suffering under bad leasehold terms.

Yes it’s 900 years away but why should anyone produce or sign a contract that commits them to spend this? (source: Telegraph)

I spoke after Patrick Collinson from the Guardian, who has written extensively about leaseholds in England and Wales and the issues some leaseholds cause for people; Bob Bessell of Retirement Security; and Phillip Rainey QC, a specialist in property litigation and expert in leaseholds.

Phillip discussed various policy options to tackle the challenges. The options included banning ground rents, limiting how much they could increase in value and many other subtle tweaks.

Hearing tales from Patrick Collison of @guardian of people suffering after buying house under leasehold https://t.co/cswL4xRvo7

— Peter Wells (@peterkwells) December 14, 2016

I then had 5 minutes.

Hello, thank you for inviting me. I’m from the Open Data Institute (ODI). You may not have heard of us. (murmurs of agreement)

We were founded 4 years ago by Sir Tim Berners-Lee, the inventor of the web, and Sir Nigel Shadbolt. Our CEO is Jeni Tennison, she apologises for not being here. So do I as I’ve ended up creating an all-male panel. That’s bad.

We are global. We connect, enable and inspire people to innovate with data. Or “to get stuff done that makes things better by being more open” as I sometimes say.

I am not a housing or leasehold specialist; my job is to get data to people who need it. Leasehold Knowledge Partnership are part of our current UK startup programme. They’ve been helping us understand the problems in leasing and we’ve been helping them understand whether more data can help.

Freeholds sold without leaseholders knowing ("who owns my house?"), others trapped with ever-escalating ground rents

— Peter Wells (@peterkwells) December 14, 2016

At the ODI we think of data as a new form of infrastructure. It has become essential infrastructure without us realising it.

Like most physical infrastructure – for example roads – data creates most value when it is as open as possible while respecting privacy.

Patrick C suggests big house developers need to change practices, eg buyback freeholds, change terms, only recommend independent lawyers

— Peter Wells (@peterkwells) December 14, 2016

When data is open and available for anyone to use it is easier for people to use it to make decisions and solve problems.

Take leaseholds. Let’s imagine if more information was open while respecting the privacy of homeowners.

People expect easy access to data in the web age. Many homebuyers use sites like RightMove and Zoopla as they look for a home. Opening up leasehold data would enable those services to help people make an informed decision. For example they could compare terms with other properties, leasehold or not, in the area and see what’s reasonable. Some of the cases Patrick mentioned happened because people lacked information when buying a home.

Phillip Rainey QC asks whether leaseholds are a means to an end (buying a flat) or are we inventing an asset class?

— Peter Wells (@peterkwells) December 14, 2016

Conveyancers and estate agents would have access to more data too. They could get things done faster and give better advice to homebuyers.
Researchers would be able to model the market, help people understand how it is working and suggest improvements.
Legislators would be able to get better information about problems, where legislation is needed or where soft power could be used to influence things.
With better access to data government could test a policy idea, like the ones Phillip suggested, in a region before deciding whether to roll it out nationally.

"If we can define the moon in legislation we can define ground rents and how they can be used"

— Peter Wells (@peterkwells) December 14, 2016

Much of this data is available but it is locked away. In government offices, in the offices of house building firms, in law firms or in contracts held by leaseholders and freeholders.

Some of our big public registries and institutions – things like the Land Registry, Ordnance Survey, the Met Office – were created to make this type of information available to people who need it, but it feels like they haven’t adapted to changing times and 21st century needs.

Getting this data open can take time and cost money. Not that much: technology can be cheaper than some people might tell you. But getting the data open and using it to change markets, like leasehold, can also affect business models. That’s usually more significant.

Phillip Rainey QC "Are we at risk of ossifying the housing market with new property on 999-year leases"

— Peter Wells (@peterkwells) December 14, 2016

We need to support those organisations to change their business models; move to a future where we have data infrastructure that is as open as possible while respecting privacy; and help meet society’s 21st century needs. That might mean they also need to help open up data held outside government.

In closing I’d ask both the members of the APPG and all of the leasehold experts in the room to think about the power of the web, what people expect in the modern age and how the tools and techniques of the web and data can help build a better housing market. One that can reduce the number of cases like those that Patrick Collinson has written about over the last few months.

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — -

After the various speeches questions were asked by people in the room. The questions were from a more diverse group of people than the all-male panel (grr!).

I was asked whether there was enough data available for someone in Ellesmere Port to get a reasonable view on whether their leasehold flat will be worthless in 10 years’ time. I’m checking that today.

Someone else raised the issue of freehold management companies surprising people with unnecessary administration fees — for example £250 for a simple bit of paperwork that is necessary if the homeowner wants to sell their home. That’s an issue my wife and I are well aware of having just sold our leasehold flat in London. We plan to blog on how data helped and where some data was missing.

Someone else asked whether we knew if the problem with leaseholds was bigger than in the 1970s. The answer from the panel was a bit vague but Phillip Rainey raised an important point. He said that the problem was getting worse because lawyers were producing new tighter leasehold clauses that benefitted the freeholder. He said that lawyers used the web to share these new clauses so they were all getting better in a way that made the situation worse for leaseholders.

You see technology can be used for good and bad and — as a very wise person once said — knowledge is power.

To help level out power imbalances we need to share the knowledge and the skills to use it with everyone.

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — -

After these questions the event was closed by Peter Bottomley who discussed next week’s leasehold reform debate in Parliament and how he intends to name names.

{Update 22 December: the Hansard transcript of the debate is now up}
