In recent years the UK government has got into the habit of announcing that it has employed cats. Downing Street, the Foreign Office and the Treasury all have cats whilst the Cabinet Office are about to appoint one. An unusual habit for a government but, I suppose, life should be full of strangeness.
One afternoon I was feeling simultaneously bored and whimsical, a risky combination, so I spent 10 minutes building a UK gov cat register — a list of these cats — which I published on the web.
This week I created a dashboard for the cat register. That should have been relatively simple too but it took a little longer. Some of my skills are a bit rusty.
A list of cats that work for the UK government might seem like a silly joke – it was 🙂 – but it also gave me a chance to use, and give feedback on, some new tools developed by the Open Data Institute (ODI)’s Labs team.
Here’s what I did. It might help others publish some open data or build a dashboard. If you read it all you’ll also learn who Schrödinger’s gov cats are…
How I built the cat register
I started off by pulling together some of the available data: names; the department the cats worked in; the dates when they started (or ended) their work; and social media accounts. Yes, UK government cats have social media accounts: both official and unofficial. The data was gathered into a spreadsheet application and saved as a CSV file.
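As a rough illustration of what such a CSV file might look like, and how it can be read back, here is a minimal Python sketch. The column names (name, department, start_date, end_date, status) and the rows are hypothetical placeholders, not the register's actual headers or data.

```python
import csv
import io

# Hypothetical sample in the spirit of the register; the headers and
# rows are illustrative placeholders, not the real data.
SAMPLE_CSV = """name,department,start_date,end_date,status
Example Cat A,Example Department 1,2016-01-01,,Active
Example Cat B,Example Department 2,2010-01-01,2014-01-01,Inactive
"""


def load_register(text):
    """Parse register CSV text into a list of dicts, one per cat."""
    return list(csv.DictReader(io.StringIO(text)))


cats = load_register(SAMPLE_CSV)
for cat in cats:
    print(cat["name"], "-", cat["department"], "-", cat["status"])
```

Keeping one row per cat and one column per fact makes the file easy for other people to edit and for simple scripts to read.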
I will shamefully admit that I did not think too much about the needs of potential users of the data. After all, this was a whimsical experiment which users would be able to help maintain if they wanted to be whimsical too. I also concluded that privacy would not be an issue as animals do not have rights under the General Data Protection Regulation. In less whimsical circumstances I would recommend completing a privacy assessment before publishing a dataset.
I used the ODI Labs’ Octopub tool to publish the CSV file. Octopub automatically creates an open data certificate and uses GitHub to store and publish the data, with all of the functionality that GitHub provides.
After that step the data was accessible on the web, openly licensed to make it clear that people can use it, and open for collaboration so that people could help improve it. Do use the cat data, read how to submit some extra data, or raise an issue if you want to.
This bit was easy. A dashboard was a little harder.
A minimum viable cat dashboard
To help with metrics and dashboards the Labs team have created Bothan: it brings you information in the form of a free platform for storing and publishing metrics as JSON or simple visualisations. This capability is built on top of another web tool, Heroku, that allows new applications to be quickly deployed to the web.
Setting up a Bothan instance and reconfiguring an existing dashboard was relatively easy but automating the process of getting data, like the total number of cats, from the register into Bothan proved harder.
The team recommended Zapier, a web tool designed to help automate workflows. It’s less open than the other tools — I couldn’t easily share my config and the pricing plan seemed to scale fast — but it looked like it would do the job and help get even more cats on the web. The team have even integrated Bothan with Zapier to make it easy. Unfortunately I had to get to grips with the Python scripting language and my last foray into similar stuff was a while ago. Luckily there was help both on the web and in the office.
After getting the tech working I shared a couple of early drafts on Twitter, got some feedback (at which point I learnt that Google had given me the wrong answer for the total number of cats in the UK; if only searching for data was as easy as searching for documents) and improved it to a point where I was happy to call it a minimum viable dashboard.
There is one bit of configuration and code looking for changes to the cat register and calculating new metrics for those values; whilst another bit is looking for changes to some official UK government data about cats. Everything runs automatically.
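The "calculating new metrics" step can be sketched in a few lines of Python. This is not the code that runs in Zapier. It is a minimal sketch that assumes each cat is a dict with a status field, that the metric names (total-cats, active-cats) are made up for illustration, and that a metrics store accepts a JSON body with time and value fields. That payload shape is an assumption, so check the Bothan documentation before relying on it.

```python
import json
from datetime import datetime, timezone


def register_metrics(cats):
    """Calculate simple counts from a list of cat rows (dicts)."""
    total = len(cats)
    active = sum(1 for c in cats if c.get("status") == "Active")
    # Metric names are hypothetical, chosen for this example only.
    return {"total-cats": total, "active-cats": active}


def build_metric(name, value, when=None):
    """Return (metric name, JSON body) for one reading.

    The {"time": ..., "value": ...} shape is an assumption about what
    a metrics store might accept, not a documented API.
    """
    when = when or datetime.now(timezone.utc)
    body = {"time": when.isoformat(), "value": value}
    return name, json.dumps(body)


# Placeholder rows standing in for the register.
sample = [{"status": "Active"}, {"status": "Inactive"}, {"status": "Active"}]
for name, value in register_metrics(sample).items():
    metric, body = build_metric(name, value)
    print(metric, body)
```

Separating the calculation from the payload building means the counting logic can be tested without touching the web.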
You will find a bit more detail and the code for the dashboard on GitHub. Feel free to suggest new features.
You might have noticed that the dashboard has an entry for “Schrödinger’s cats”. The reason is quite simple: just like the cat in Schrödinger’s famous experiment, I could find no data that confirms whether some cats are alive or dead. I could make an educated assumption (after all, one cat started duty in 1964…) but I thought it was worth leaving the status unclear. I simply left them marked “Inactive” and imagined the life of a retired UK government cat.
Anyone who uses the data can make their own assumption about those cats. Leaving it unclear might also incentivise someone to help find the missing data and, perhaps, discover that an elderly cat from the swinging 60s is still patrolling the corridors and clubs of Whitehall.
That incentivisation is interesting. A good register should, like any data infrastructure, be providing a foundation on which people can build services and find insights but a good dashboard should be incentivising behaviour in line with a particular goal or strategy. My goal was to get even more cats on the web. The register and dashboard were a way of getting other people to help me. Submit more cats.
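One way a “Schrödinger’s cats” figure could be derived is simply to count the rows marked “Inactive”. The sketch below assumes that rule and a status field on each row. Both are assumptions for illustration, and the rows are placeholders rather than real register entries.

```python
# Hypothetical rows; names and statuses are placeholders.
SAMPLE = [
    {"name": "Example Cat A", "status": "Active"},
    {"name": "Example Cat B", "status": "Inactive"},
    {"name": "Example Cat C", "status": " inactive "},
]


def schrodingers_cats(cats):
    """Return cats marked inactive, i.e. whose fate is unconfirmed.

    Comparison ignores case and surrounding whitespace so that small
    inconsistencies in hand-edited data do not change the count.
    """
    return [
        c for c in cats
        if c.get("status", "").strip().lower() == "inactive"
    ]


unknown = schrodingers_cats(SAMPLE)
print(len(unknown), "Schrödinger's cats")
```

Anyone who prefers a different assumption, such as treating a very early start date as evidence, can swap in their own rule without changing the data.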
Publish your own data or build your own dashboard
But enough of cats, for now. My whimsy also helped me explore a little bit of data publishing. Octopub, Bothan, Zapier and Python all turned out to be fairly easy to use so, if you fancy giving open data a go, why don’t you publish your own dataset or create your own dashboard?
You could start with a whimsical project (penguin register anyone?) or perhaps something more useful like this list of data science courses in Europe prepared as part of the ODI learning team’s work for the European Data Science Academy.
If the documentation for each of those tools doesn’t help you with a problem then there are plenty of people around to ask and, once you’ve learnt the answer, you can always suggest ways to improve the documentation and help the next person.
— — -
Update 21 April: since writing this blogpost I have done a bit more work on cat data, privacy and complexity.