Wednesday, 4 April 2018

The Delusion of Personal Data

The focus on the billions wiped off Facebook's share price recently, following the revelations around the Cambridge Analytica "data breach", has had the unfortunate effect of reinforcing the popular belief that personal data is a valuable asset. Given that the share price has simply fallen back to where it was 10 months ago, and that Facebook's user growth has been slowing of late (and has even declined in the US), an alternative interpretation is that the market is adjusting by better pricing-in the risk of future user desertion. As its mature user population is ageing and its user growth is now biased towards less valuable regions of the globe, I would have expected this adjustment to happen anyway. In other words, what matters for Facebook's future earnings potential is still its user demographics and the extent of engagement (as that is what matters to advertisers) rather than the quantity or quality of the data that it has managed to collect. While tech-boosters and politicians have been swept up in the promise of Big Data and AI, the data-centric model has yet to really impinge on a stock market that has largely stuck with the user-centric model of the 1990s when it comes to valuing the platform capitalists.

In the early days of the commercial Web, when monetisation was still a conundrum, micropayments were seen by many as the future. You would be charged a fraction of a penny for each page you viewed, the assumption being that you would be discerning in your consumption (the predictions were made long before increased bandwidth and smartphones enabled a tidal wave of promiscuous browsing). Back in 1998, Jakob Nielsen justified micropayments by reference to the equivalent cost of labour. His argument was that the charge would be negligible compared to the value of the user's time, even if the latter was heavily discounted for non-working hours. In the event, micropayments never took off as a way of compensating content-providers, largely because the low (and falling) cost of production meant that there would always be a surfeit of amateurs producing free stuff (the "fame vs fortune" problem noted by Clay Shirky and others). Niche providers, from porn to business intelligence, have had limited success with subscription paywalls, but this clearly was never going to be a generally applicable model. What was significant about Nielsen's thinking was the suggestive association with time, which helped shift the focus away from "King Content" to the contribution of the user and thus the production of personal data.

As it became clear that what was actually driving the Internet economy was user data and its leverage for advertising, the conceptualisation of personal data as an asset came to the fore. This produced two quite different interpretations. The dominant one is propertarian, or democratic capitalist. In this scheme, each individual would retain ownership of their data and receive a dividend from advertisers for its use. The sums involved would be laughably small at the individual level (though some have seen it as another nudge towards a basic income), but would ensure that the value of the Internet was not monopolised by companies such as Facebook and Alphabet, while the behaviour of abusive advertisers could be kept in check by the discipline of a property-owning democracy. Jaron Lanier is perhaps the leading proponent of this approach, which he outlined in his book, Who Owns the Future? As a number of critics have pointed out, the democratic element of this is weak (essentially it means forgoing payment in order to boycott certain advertisers), while the operation of the necessary infrastructure (a micropayments system orders of magnitude more complex than Nielsen's model) would place more power in the hands of unelected technologists and would be a dream come true for the surveillance state.

The second conceptual approach is a social democratic one in which personal data is aggregated and held in trust, with commercial users charged for access to it and pro-social organisations given discounted or even free access if they can show public benefit. This would be simpler to implement in terms of the financial transactions, though it would entail a significant bureaucracy for licensing and wouldn't easily accommodate novel or unconventional uses of the data. Evgeny Morozov is one of its leading advocates in the media. This approach prefers to see society's data as a collective, natural resource (which chimes with the "data is the new oil" trope) hence the status quo is described negatively as "extractivism", triggering echoes of anticolonialism as much as nationalisation in the proposed solution. If Lanier is guilty of reimagining the autonomous homesteader of Locke, Morozov is guilty of reimagining the sovereign of Hobbes, albeit in the context of a "truly decentralised, emancipatory politics, whereby the institutions of the state (from the national to the municipal level) will be deployed to recognise, create, and foster the creation of social rights to data". The bureaucratic risks of this are as obvious as the surveillance concerns, not least in the inevitable tension that will arise between centralisation and decentralisation.

Where Morozov does have a point is in his criticisms of Lanier and other propertarians for their attempts to subdivide a social value that is meaningless when disaggregated: "one cannot simply take the total revenue of these companies and divide it by the number of individual users to figure out what each of us is due". He also recognises that the production of personal data already depends on publicly-funded infrastructure, not just in the historic sense of the role of schools and hospitals in social reproduction, but in the more immediate and growing role of the state in providing the facilities for the individual creation and capture of data: "a lot of the data that we generate, when we walk down a tax-funded city street equipped with tax-funded smart street lights, is perhaps better conceptualised as data to which we might have social and collective use rights as citizens, but not necessarily individual ownership rights as producers or consumers". For all that, Morozov suffers from the same fundamental belief as Lanier, namely that personal data is a valuable resource (essentially surplus labour) that is being alienated from its producers. But this is a form of vulgar Marxism, not to mention a banal use of the term "exploitation", that confuses what we are obliged to do by necessity with what we choose to do for entertainment or self-actualisation.

The idea that personal data is an asset, and therefore the chief question concerns its ownership, is little advance on the older idea that we users are commodities: "If you're not paying for it, you're the product". The problem is that both of these ideas are suffused with liberal ideology, imagining personal data as an extension of the person and therefore of the highest value. In fact, at the level of the individual, personal data is commercially worthless. I should stress that by "personal data" here I mean attributes and preferences (who I claim to be, what I claim to like). Transactional data (what I actually bought) and network data (who I actually know) are obviously valuable, to both private and state actors, but they are data relationships that I cannot exclusively own, so they are not properly "personal". Personal data is also worthless for many public benefit applications because the individual tells us little about the public at large. It is only in aggregate that data has value, essentially because the randomness of personality is lost in the crowd. However, even this observation can lead us astray if we then assume that value is cumulative: the more data we have, the more valuable the dataset becomes. In reality, value derives from the processability of the data. The promise of the singularity is not that ever more data will provide the raw material for a superior AI but that such an intelligence may eventually transcend the need for more data.

Processability means that the capital value of data is bound up in the structures and logic that translate it into a usable resource: data + intellectual property = capital. When we conceptualise data as an asset, either at the individual or collective level, we elide questions about the creation and ownership of that IP. Platform capitalists are happy to cede a degree of user control over data because it is the IP that really matters. Despite the heroic inventor myth, Mark Zuckerberg is responsible for only a tiny fraction of Facebook's codebase, but he gets the lion's share of its profits. This is because he owns the largest share of the company's IP, which means that he owns the surplus labour arising from the cognitive production of thousands of Facebook (and WhatsApp) software engineers. He also has no intention of ceding control over the IP, any more than he intends to allow minority shareholders a meaningful say: "One of the things that I feel really lucky that we have is this company structure where, at the end of the day, it’s a controlled company. We are not at the whims of short-term shareholders. We can really design these products and decisions with what is going to be in the best interest of the community over time." Substitute "company" for "community" in that last sentence and this is straightforward paternalistic capitalism.

Contrary to the hopes of the liberal press, Facebook and the other platform capitalists are not too worried about attempts by the state to regulate the use of personal data. Most regulatory initiatives to date have taken the route of consumer protection, i.e. obliging data-handlers to observe certain proprieties, which has proven to be weak in practice (consider the difficulty of the UK Information Commissioner's Office in gaining a warrant to investigate Cambridge Analytica). More recently there has been a shift towards framing personal data in terms of property rights, particularly notable in the approach adopted by the EU in the General Data Protection Regulation, which becomes enforceable this May. This will probably be more effective in regulating technology companies, but it won't pose a threat to the model of platform capitalism. Morozov's idea would be more of a threat, simply because it enables unavoidable taxation, though the vagueness of his ideas about how this would work in practice, not to mention the modest nature of contemporary European social democracy, suggest that a capital-friendly compromise is likely. Until we address the intellectual property rights of the platform capitalists, we're looking in the wrong place.


  1. Herbie Mounts the Pavement, 5 April 2018 at 18:07

    The advertisers along with the users have gone to the monopoly, because that’s where everyone else is.

    Facebook won that early evolutionary battle, with a mixture of luck, logos and algorithms, oh and free at the point of use (I predict charging will never happen unless they really do want to sink without trace; after all we pay our ISP for the privilege of viewing pages). It could have been another platform but it turned out to be facebook. The point is that this was always a natural monopoly where one platform would come to dominate over everything else. Some call it the herd mentality but from another view it is just sensible to have one point of access. Why upload pictures to 20 sites when you can do it to one! Why have to remember 20 logins when you can have one etc etc

    No one, I claim, gives a flying fuck that facebook use their data and sell it for money; they know this already. Everyone freely sharing their information with each other beyond any boundaries is a great vision on first sight. The problem comes when this is done in a society that is hierarchical and deeply exploitative, and where wealth determines life outcomes. There is something perverse about this sort of freedom in a world with little of it. I am reminded of the Palestinian response to the sickening ice bucket challenge: they came up with the wonderful rubble bucket challenge, where they threw rubble from their destroyed houses over their heads, pointing out that water was too valuable to be wasted in such a frivolous way. So facebook comes with all sorts of grotesque after-effects. Imagine someone posting a picture of their new expensive diamond ring while the person that mines that diamond, if they can even get internet access, posts a picture of his sick child who has no access to proper health care.

    The true spirit of freedom and democracy is of course where there is no incentive to advertise, i.e. to present a one-sided view of whatever product is being flogged. In fact true freedom and democracy would mean people have already determined consumption before desire is dangled in front of their faces. In other words, facebook exists simply to provide users with a place to post, no questions asked, and with nothing done with the data other than maybe for the planning department and the administrative section to use for whatever reasons. From a Marxist point of view the state withers away (in this case advertisers and government oversight) and is replaced by public bodies using the data for the administration of things. In this world of true freedom and democracy there is no fake news because there is no incentive to provide fake news, and actually there are no political parties or positions because those positions are determined by a random process and there is no incentive for political theatre.

    But let the facebookers beware, if you go on facebook expect abuse because most of the time you deserve it! I would have this on the banner of facebook!

  2. David, what you call 'network data' (who you met, what you bought) is personal data. So saying personal data isn't valuable, then saying network data is, seems like an empty point.

    1. I'm actually drawing a clear distinction between the two. As I said above: "I should stress that by 'personal data' here I mean attributes and preferences (who I claim to be, what I claim to like). Transactional data (what I actually bought) and network data (who I actually know) are obviously valuable, to both private and state actors, but they are data relationships that I cannot exclusively own, so they are not properly 'personal'".

      The post is about the dubious idea that there is a class of data that is your exclusive property and which can therefore be treated as a valuable asset. This idea is a continuation of the abstraction of bourgeois identity (or "reputation") as a form of property, which originated in the Early Modern period ("he that filches from me my good name" etc) and which would come to be formalised in the libel laws.

      The common definition of personal data in contemporary law (e.g. the Data Protection Act) assumes that any reference to a named person (or one whose name can be inferred) is personal, but the "personal data" we see in media discourse on the Internet clearly refers to a subset of this, namely an intrinsic, non-alienable identity that is at risk of exploitation.

      The paradox is that the data that actually has value, in that it can be monetised, is the data that describes your social relationships, not the data that describes you. For example, what you have bought or who you know. The former can be used to infer purchasing preferences, the latter to infer sympathies. Both are valuable to advertisers. The point is that you cannot exclusively own this valuable data (you are only a foreign key on a transaction record, to put it in technical terms), so you cannot treat it as an asset. The actual asset is your intrinsic identity, but that is near worthless in isolation.

      What I'm getting at here is that liberals are barking up the wrong tree by couching the debate on data exploitation in terms of the rights of the individual, and thus perpetuating the bourgeois propertarian paradigm. The data is actually social. To that extent, I sympathise more with Morozov than Lanier, though I recognise that he is also playing the same liberal chords by valorising the individual, though presumably as a pragmatic tactic to get a hearing in the liberal press.

      My cynical view is that they are barking up the wrong tree not just because they abhor a socialist interpretation of the data but because they would be uncomfortable moving from the familiar territory of personal property to questions about intellectual property, which is really where the action is as far as platform capitalism is concerned.