Some Challenges of ‘Keeping Personal Data Personal’

Posted on 10 March 2016 by Andrew McHugh

Dominic Price, Horizon Digital Economy Research Fellow at the University of Nottingham, describes some of the difficulties faced by his team in their attempts to develop systems that empower users to control their own data.


One of the key themes in Horizon, since it started 5+ years ago, has been ‘keeping personal data personal’. What we’ve tended to mean by this is that an individual should retain all the rights to the digital data that they produce (social media content, data from smart meters in the home, data from activity loggers, and so on) and that the individual should be the ultimate gatekeeper of access to that data. This simple idea is a reversal of the way that most current service providers implement their systems. The usual method is that user data is uploaded to the service provider’s servers, and the service provider then maintains and controls access to that data.

In order for us to experiment with the concept of data being owned and controlled by the user, we have over the years built (with greater and lesser degrees of success) systems that attempt to interface with service providers’ systems and extract users’ data. Most commonly, the way to achieve this is through a service provider’s application programming interface (API), and the most common way of authenticating with these APIs is through the OAuth protocol. I’m not going to go into the pros and cons of OAuth. Speaking as an application developer, it’s pretty easy to use and well supported. The problem that we have is that OAuth doesn’t quite fit the needs of our use cases.

OAuth describes itself as follows: “OAuth is a simple way to publish and interact with protected data. It’s also a safer and more secure way for people to give you access. We’ve kept it simple to save you time.” (http://oauth.net/). What this means is that OAuth is intended as a three-party protocol in which the User can give a 3rd Party access to their data held by a Service Provider. It’s a bit of a simplification, but in general this is achieved by API requests being signed by both the user and the 3rd party, to guarantee that the request has been authorised by both parties. The issue for us is that we want to remove the 3rd party from that process: we want to give users a way to access their own data in a completely private way. If we act as the 3rd party in the OAuth process, it gives us some level of access to a user’s data, access that we don’t want.
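To make the "signed by both parties" idea concrete, here is a minimal sketch. It assumes OAuth 1.0a with HMAC-SHA1 signatures; the post does not say which OAuth version is meant, and OAuth 2.0 bearer tokens are not signed this way. The function names are hypothetical, and a real request would also carry parameters such as `oauth_nonce` and `oauth_timestamp`, which the caller is assumed to include in `params`.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def percent_encode(value: str) -> str:
    # OAuth 1.0a uses RFC 3986 encoding: only unreserved characters stay literal.
    return quote(value, safe="-._~")


def signature_base_string(method: str, url: str, params: dict) -> str:
    # Parameters are encoded, sorted, and joined into one normalized string,
    # which is then combined with the HTTP method and the URL.
    encoded = sorted((percent_encode(k), percent_encode(v)) for k, v in params.items())
    normalized = "&".join(f"{k}={v}" for k, v in encoded)
    return "&".join([method.upper(), percent_encode(url), percent_encode(normalized)])


def sign_request(method: str, url: str, params: dict,
                 consumer_secret: str, token_secret: str) -> str:
    # The HMAC key joins the 3rd party's secret (consumer_secret) with the
    # secret tied to the user's authorisation (token_secret), so a valid
    # signature needs both of them.
    key = f"{percent_encode(consumer_secret)}&{percent_encode(token_secret)}"
    base = signature_base_string(method, url, params)
    digest = hmac.new(key.encode("ascii"), base.encode("ascii"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")
```

Because the key includes the 3rd party's secret, whoever holds that secret is always part of the exchange. That is the role we would rather not play.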

There are ways around this, but they are less than perfect. One way is to distribute the 3rd party access tokens with an application, which means that the application will just work. The downside is that those access tokens are then open to abuse and will likely be cancelled by the service provider. Another way is to get the user to register their own OAuth application with a service provider, though this is a confusing task for the average user (some, like the Facebook app registration, are confusing enough for developers!). Some providers, like GitHub and Google, do provide ways of generating API keys quickly and easily, but these of course only work with those services.
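As a rough illustration of the last option, the sketch below builds a request that authenticates with a key the user generated themselves, so no 3rd party credentials are involved. It assumes GitHub's `Authorization: token ...` header scheme; other providers use different schemes. The function name and the token value are hypothetical, and the request is only constructed here, not sent.

```python
import urllib.request


def build_own_data_request(url: str, personal_token: str) -> urllib.request.Request:
    # The token was created by the user in the provider's settings page,
    # so only the user (and the software they run locally) ever holds it.
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"token {personal_token}")
    request.add_header("Accept", "application/json")
    return request
```

Passing the result to `urllib.request.urlopen`, for example with the URL `https://api.github.com/user`, would fetch the user's own profile data directly from the provider.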

It’s an annoying problem, and one that I see being asked about quite a lot in various developer forums, because it can only really be changed by the service providers modifying their systems.