Tuesday 8 September 2020

Emerging models of data governance in the age of datafication

by Marina Micheli, Marisa Ponti, Max Craglia and Anna Berti Suman 

Big Data & Society 7(2), https://doi.org/10.1177/2053951720948087. First published: Sept 1, 2020.

The article synthesises and critically examines a ‘moving target’: the various practices that are being advanced for the governance of personal data. In recent years, following scandals like Cambridge Analytica and new data protection regulations like the GDPR, attention has been mounting on how data collected by big tech corporations and business entities might be accessed, controlled, and used by other societal actors. Scholars, practitioners and policy makers have been exploring the opportunities for agency available to data subjects, as well as the alternative data regimes that could allow public bodies to use such data for their public interest mission. Yet the current circumstances, which are the result of a tradition of ‘corporate self-regulation’ in the digital domain and an overall laissez-faire approach (albeit increasingly divergent by geopolitical context), are marked by the hegemonic position of a few technology corporations that have de facto established ‘quasi-data monopolies’. This is reflected in the asymmetry of power between data corporations, which hold most of the decision-making power over data access and use, and the other stakeholders.

The article increases knowledge about the data governance practices currently being developed by various societal actors beyond ‘big tech’. It does so by describing four data governance models, emphasizing the power of social actors to control how data is accessed and used to produce different kinds of value. A relevant outcome of the article lies in the heuristic tools it proposes, which could be useful to better understand and further examine the emerging models of data governance, looking in particular at the relations between stakeholders and the power (un)balances between them.

The idea for this study originates from a workshop that we organised in the context of the project Digitranscope at the Centre of Advanced Studies of the Joint Research Centre of the European Commission. Seventeen invited experts - from academia, public sector, policymaking, research and consultancy firms - took part in the event, in October 2018, to discuss the policy implications of the governance of (and with) data. While preparing the workshop, we realised that the various labels circulating in the policy arena to tackle data governance - such as data sovereignty, data commons, data trusts, etc. - tended to be used equivocally to refer to different concepts (technical solutions, legal frameworks, economic partnerships, etc.), with their meaning shifting slightly according to the context. Furthermore, during the workshop, participants highlighted the widespread lack of knowledge and practical understanding of possible alternatives to the ‘data extraction’ approach of big online platforms, as well as the need to find ways to use data collected by private companies for the public interest, and the urgency of considering data subjects as key stakeholders in the governance of data. With all these insights in mind, we decided to engage in the research that led to this article.

The key contributions of this publication, in our view, are conceptual and empirical.

  • We developed a ‘social-science informed’ definition of data governance that draws from science and technology studies and critical data studies (hence, also from some key publications of this journal). We understood data governance as the power relations between all the actors affected by, or having an effect on, the way data is accessed, controlled, shared and used, the various socio-technical arrangements set in place to generate value from data, and how value is redistributed between actors. Such a definition allows moving beyond concerns of technical feasibility, efficiency discourses and ‘solutionist’ thinking. Instead, it points to the actual goals for which data is managed, emphasizing who benefits from it, the power (un)balances among stakeholders, the kind of value produced, and the mechanisms (including underlying principles and systems of thought) that sustain these approaches.

  • We conducted a review of relevant resources from the scientific and grey literature on the practices of data governance that led to the identification of four emerging models: data sharing pools, data cooperatives, public data trusts and personal data sovereignty. As this is a rapidly evolving field, we did not aim to offer an exhaustive picture of all possible models, hence these four should not be understood as comprehensive. They also have to be contextualised in our conceptual approach, in the time span in which the research was conducted and in the European focus taken by the article. Yet, they provide a basis to understand how the emerging data governance models are (re)thinking and redressing power asymmetries between big data platforms and other actors. In particular, they show how both civic society and public bodies are key actors for democratising data governance and redistributing value produced through data.

A social science-informed conceptualisation of data governance allows seeing ‘through the infrastructure’ and encourages asking certain questions, such as: what principles guide data sharing and use? What is done with data, and who can access it and participate in its governance? What value is produced and how is it redistributed? Questions of this kind are particularly relevant now, given that the policy debate around data governance is very active (especially in Europe). The future of the data governance models examined in this article (and of any model that allows more actors to control data and use it for purposes beyond the generation of profit for big tech corporations) depends on the policy actions and the legal frameworks that will be developed to sustain them.

Vignette of the data governance models examined in the article.

Keywords: Data governance, Big Data, digital platforms, data infrastructure, data politics, data policy