Thursday, 9 May 2019

Are we outsourcing the curation of history to Facebook?

Carl J Öhman, David Watson

Online death has recently become a hot topic in both academia and the popular press. However, much of the current debate around the phenomenon has focused on its significance for individual users. For example, most discussions in the media pertain to planning one’s own digital estate and/or how best to cope with the digital remains of a loved one. (This has been reaffirmed by the innumerable interview questions we have received on these topics since the recent publication of our article.) In view of this, we wanted to bring in a more societal perspective and ask what people dying on the internet means for us on a collective level. We also wanted to highlight the fact that death on social media is not just a Western, high-tech phenomenon. The so-called “digital afterlife” is often associated with futuristic scenarios and AI, but the reality is that people all around the world pass away every day, leaving behind enormous volumes of data. Many of them use no more sophisticated technologies than smartphones and social media apps.

Against this background, we collected data from the UN and Facebook’s audience insights feature, from which we built a model that projects the future accumulation of profiles belonging to deceased Facebook users. Our analysis suggests that a minimum of 1.4 billion users will pass away between now and 2100 if Facebook ceases to attract new users as of 2018. If the network continues expanding at current rates, however, this number will exceed 4.9 billion. In both cases, a majority of the profiles will belong to non-Western users. In the former scenario, we find that the dead may outnumber the living on the network as soon as 2070.
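To give a sense of how such a projection works mechanically, here is a toy cohort-based sketch in Python. This is an illustration only: the cohort sizes and mortality curve below are invented for demonstration, and the paper’s actual model (built on UN life tables and Facebook audience insights) is far more detailed.

```python
# Toy cohort projection of deceased-profile accumulation. Illustrative only:
# the cohort sizes and mortality rates below are invented, NOT the paper's data.

def project_deaths(cohorts, mortality, years):
    """Advance age cohorts one year at a time, accumulating deaths.

    cohorts:   dict mapping age -> number of living users at that age
    mortality: dict mapping age -> annual probability of dying at that age
    years:     projection horizon in years
    Returns the total number of deaths over the horizon.
    """
    cohorts = dict(cohorts)               # work on a copy
    total_deaths = 0.0
    for _ in range(years):
        next_year = {}
        for age, alive in cohorts.items():
            q = mortality.get(age, 1.0)   # ages past the table: assume death
            deaths = alive * q
            total_deaths += deaths
            next_year[age + 1] = next_year.get(age + 1, 0.0) + alive - deaths
        cohorts = next_year
    return total_deaths

# Hypothetical inputs: a young-skewed user base and a mortality curve that
# rises roughly exponentially with age (a Gompertz-like assumption).
cohorts = {25: 1_000_000, 45: 500_000, 65: 200_000}
mortality = {age: min(0.001 * 1.09 ** (age - 25), 1.0) for age in range(25, 120)}

print(f"Projected deaths over 80 years: {project_deaths(cohorts, mortality, 80):,.0f}")
```

The paper’s two scenarios, no new users after 2018 versus continued growth, amount to running a projection like this with different assumptions about new cohorts joining the network each year.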

In discussing our findings, we draw on the emerging scholarship on digital preservation and stress the challenges arising from curating the profiles of the deceased. We argue that an exclusively commercial approach to data preservation poses important ethical and political risks that demand urgent consideration. We want to be careful to state that our paper is not a critique of Facebook’s current policies on this matter. In fact, we would argue they’re actually doing a pretty good job, all things considered. We doubt that user death was high on Mark Zuckerberg’s list of priorities when he created the network, yet Facebook has devoted considerable resources to handling these sensitive matters in recent years. Hence, we would like to direct attention not so much to Facebook itself, but to the question of how we as a society, as a civilization, should go about dealing with the fact that Facebook will host billions of records of deceased users. Eventually these profiles will lose their commercial value – then what? Can we expect Facebook to keep hosting the data? Will it simply be deleted? Sold off? We need to build the proper institutions and infrastructure to tackle these questions now, because in only a few decades, these challenges will already be at our doorstep.

In particular, we wish to draw attention to the political aspect of our work. In George Orwell’s 1984, the past is controlled exclusively by the Party. They own all historical records and are not above modifying them to serve their own interests. The Party can do this because they have a monopoly on historical data. Although extreme, this scenario illustrates the risks involved in concentrating power over the past among a limited set of actors. And to some extent, that is exactly what we do today … only in our case, it is not a state or a party that controls that data, but a small number of tech empires. In pre-digital society, data about significant historical events and persons was generally distributed across numerous institutions (national archives, museums, etc.). Now, as political and social movements are largely mediated by online platforms, the narrative is increasingly owned by just a handful of firms. Today’s Martin Luther Kings, Winston Churchills, and Napoleons all probably use social media. Their lives and deeds are recorded in timeline posts and tweets. As researchers, we don’t want to be alarmist about this, but we do argue that there is good reason to be cautious about how we proceed. What kind of digital society do we want to build?

Wednesday, 1 May 2019

Data Politics: The Birth of Sensory Power

by Engin Isin and Evelyn Ruppert

Didier Bigo, Engin Isin, and Evelyn Ruppert recently published an edited collection, Data Politics: Worlds, Subjects, Rights (2019, Routledge). Building on a commentary first published in Big Data & Society, the book explores how data has acquired the capacity to reconfigure relations between states, subjects, and citizens. Fourteen chapters consider how data and politics are now inseparable, as data is shaping not only social relations, preferences and life chances but our very democracies. Concerned with the things (infrastructures of servers, devices and cables) and language (code, programming, and algorithms) that make up cyberspace, the book argues that understanding the conditions of possibility of data is necessary in order to intervene in and shape data politics.

We concluded our chapter entitled ‘Data’s empire: postcolonial data politics’ with the suggestion that Michel Foucault’s trilogy of regimes of power ‒ sovereign, disciplinary, and regulatory ‒ is now joined by a fourth regime in the history of the present. We note that Foucault understood these regimes of power not as supplanting but as augmenting each other. That is why he designated rather broad and shifting historical periods to identify their origins or birth: sovereign power roughly in the 16th to 18th centuries, disciplinary power in the 17th to 18th centuries, and regulatory power (or biopower) in the 19th century.

The birth of regulatory power is of greatest interest to us as it relates to the development of knowledge about the species-body through the statistical sciences. Ian Hacking more precisely identified the period between the 1820s and 1840s as the moment when the idea of population was invented and the statistical sciences were born as a regime of knowledge-power that regulated the relationship between the species-body (population) and the individual body. Foucault broadly called this ‘biopolitics’, which inspired an important body of thought and work. His influence on the specific development of the history of statistics has been crucial and we have learned much from a pioneering body of subsequent scholarship.

Our starting point for the volume and the chapter is the need to place recent developments in data politics in relation to Foucault’s trilogy of regimes of knowledge-power. Gilles Deleuze already gestured toward this in his much-discussed ‘Postscript on the Societies of Control’ (1990), but it remained a suggestive, if early, proposition.

We argue that developing that proposition requires understanding the emergence of new data gathering, mining and analytic technologies. From web platforms, mobile phones, sensors, drones, satellites and wearables to devices that make up the Internet of Things, digital technologies and the data they generate are connected to the emergence of new regimes of knowledge-power, especially during the last forty years. We provide a preliminary version of this proposition and conclude that perhaps the period between the 1980s and 2020s constitutes the birth of a new knowledge-power regime. We state that although we are confident about our claim, we are as yet unable to name this regime.

With the work we have done since writing the chapter, we are now tempted to name the new knowledge-power regime and to mark its emergence as the birth of sensory power. The reasons for this are given in the chapter. We know this is an ambitious claim that will require further work on our part. But we hope it will also inspire readers to respond both to the chapter and to our subsequent proposition that sensory power is a fourth regime in the history of the present.

Thursday, 28 February 2019

Weaving seams with data: Conceptualizing City APIs as elements of infrastructures

by Christoph Raetzsch, Gabriel Pereira, Lasse S Vestergaard, and Martin Brynskov

Listen to the authors of this new article discussing how application programming interfaces (APIs) are weaving new seams of data into the urban fabric, and why they are important as elements of infrastructures.

Video Abstract

Text Abstract: This article addresses the role of application programming interfaces (APIs) for integrating data sources in the context of smart cities and communities. On top of the built infrastructures in cities, application programming interfaces make it possible to weave new kinds of seams from static and dynamic data sources into the urban fabric. Contributing to debates about “urban informatics” and the governance of urban information infrastructures, this article provides a technically informed and critically grounded approach to evaluating APIs as crucial but often overlooked elements within these infrastructures. The conceptualization of what we term City APIs is informed by three perspectives: In the first part, we review established criticisms of proprietary social media APIs and their crucial function in current web architectures. In the second part, we discuss how the design process of APIs defines conventions of data exchange that also reflect negotiations between API producers and API consumers about affordances and mental models of the underlying computer systems involved. In the third part, we present recent urban data innovation initiatives, especially CitySDK and OrganiCity, to underline the centrality of API design and governance for new kinds of civic and commercial services developed within and for cities. By bridging the fields of criticism, design, and implementation, we argue that City APIs as elements of infrastructures reveal how urban renewal processes become crucial sites of socio-political contestation between data science, technological development, urban management, and civic participation.

Sunday, 17 February 2019

Jumping to exclusions? Why “data commons” need to pay more attention to exclusion (and why paying people for their data is a bad idea)

Barbara Prainsack

In January, a large German university proudly announced that Facebook had chosen it for a 6.5m Euro grant to establish a research centre on the ethics of artificial intelligence. The public response to this announcement took the university by surprise: instead of applauding its ability to attract big chunks of industry money, many were outraged at its willingness to take money from a company known to spy and lie. If Facebook was so keen to support research on the ethics of artificial intelligence, people said, it should pay its taxes so that governments could fund more research focusing on these aspects.

Resistance against the growing power of large technology companies has left the ivory towers of critical scholarship and reached public fora. The acronym GAFA, initially a composite of the names of some of the largest tech giants, Google, Amazon, Facebook, and Apple, has become shorthand for multinational technology companies that, as quasi-monopolists, cause a range of societal harms: They stifle innovation by buying up their competition, they evade and avoid tax, and (some more than others) they threaten the privacy of their users. They have also, as Frank Pasquale argued, ceased to be market participants and become de facto market regulators. As I argued elsewhere, they have become an iLeviathan, the rulers of a new commonwealth where people trade freedom for utility. Unlike with Hobbes’ Leviathan, the utility that people obtain is no longer the protection of their life and their property, but the possibility to purchase or exchange services and goods faster and more conveniently, to communicate with others across the globe in real time, and in some instances, to be able to obtain services and goods at all.

As I argue in a new paper in Big Data & Society, responses to this power asymmetry can largely be grouped into two camps: On the one side are those who seek to increase individual-level control of data use. They believe that individuals should own their data, at least in a moral, but possibly also in a legal, sense. Some go as far as proposing that, as an expression of the individual ownership of their data, individuals should be paid by corporations that use their data. For them, individual-level monetisation is the epitome of respecting individual data ownership.

On the other side are those who believe that enhancing individual-level control is insufficient to counteract power asymmetries, and that it can also create perverse effects: For example, paying individuals for their data would create even larger temptations for those who cannot pay for services or goods with money to pay with their privacy instead. From this perspective, individual-level monetisation of data would exacerbate the new social division between data givers and data takers. Instead, what is needed, they argue, is greater collective control and ownership of data.

In this second camp, which in my paper I call the “Collective Control” group (and among which I count my own work), one solution that is being suggested is the creation of digital data commons. Drawing upon the work of scholars such as Elinor Ostrom and David Bollier, some scholars believe that data commons – understood as resources that are jointly owned and governed by people – are an important way to move digital data out of the enclosures of for-profit corporations and into the hands of citizens (in my paper, I discuss what this may look like in practice). A data commons, some of them argue, is a place where nobody is excluded from benefiting from the data that all of us had a share in creating.

But is this so? As I argue in this article, in much of the literature on physical commons – such as the grazing grounds and fisheries that Elinor Ostrom and other commons scholars analysed – the possibility to exclude people from commons is considered a necessary condition for commons to be governed effectively. When everybody has access to something and nobody can be excluded, it is likely that those who are already more powerful will be able to make the best use of the resource, often at the cost of those less privileged. For these reasons, Ostrom and others conceived of commons not as governed by open access regimes – meaning that nobody holds property rights – but as ruled by a common property regime. Such a common property regime would allow the owners of the resource to decide how the resource can be used, and who can be excluded. In other words, to avoid inequitable use of commons, those governing the commons must be able to set the rules, and must be able to exclude.

The issue of which actors are or can be excluded from commons, and how, has so far received very little systematic attention in the growing scholarship on digital data commons. In my article, I propose a systematic way to consider which types of exclusion – from contributing data to the commons, from using or benefitting from the data commons, and from partaking in its governance – are harmful, and how forms and practices of exclusion that cause undue harm can be avoided. In this manner, I argue, it is possible for us to distinguish between data commons that will help to counteract existing power imbalances and to increase data justice on the one hand, and those that use the commons rhetoric to serve particularistic and corporate interests on the other.

In this context, it is also apparent that either way, individual-level monetisation in the form of paying people for their data is a bad idea. Not only would it lure the cash-poor into selling their privacy, but it also plays into the hands of those who seek to individualise relationships between data givers and data takers in order to avoid a collective response to the increasing power asymmetries in the digital data economy.

Monday, 17 December 2018

Call for Special Theme Proposals for Big Data & Society


The SAGE open access journal Big Data & Society (BD&S) is soliciting proposals for a Special Theme to be published in early 2020. BD&S is a peer-reviewed, interdisciplinary, scholarly journal that publishes research about the emerging field of Big Data practices and how they are reconfiguring academic, social, industry, business and government relations, expertise, methods, concepts and knowledge. BD&S moves beyond usual notions of Big Data and treats it as an emerging field of practices that is not defined by but generative of (sometimes) novel data qualities such as high volume and granularity, and complex analytics such as data linking and mining. It thus attends to digital content generated through online and offline practices in social, commercial, scientific, and government domains. This includes, for instance, content generated on the Internet through social media and search engines, but also that which is generated in closed networks (commercial or government transactions) and open networks such as digital archives, open government and crowd-sourced data. Critically, rather than settling on a definition, the Journal makes this an object of interdisciplinary inquiries and debates explored through studies of a variety of topics and themes.

Special Themes can consist of a combination of Original Research Articles (8000 words; maximum 6), Commentaries (3000 words; maximum 4) and an Editorial (3000 words). Article Processing Charges will be waived for all Special Theme content. All submissions will go through the Journal’s standard peer review process.

Past special themes for the journal have included: Knowledge Production, Algorithms in Culture, Data Associations in Global Law and Policy, The Cloud, the Crowd, and the City, Veillance and Transparency, Environmental Data, Spatial Big Data, Critical Data Studies, Social Media & Society, Health Data Ecosystems, Assumptions of Sociality, and Data & Agency. These special themes can be accessed via the journal’s website.

Format of Special Theme Proposals
Researchers interested in proposing a Special Theme should submit an outline with the following information.

- An overview of the proposed theme, how it relates to existing research and the aims and scope of the Journal, and the ways it seeks to expand critical scholarly research on Big Data.

- A list of titles, abstracts, authors and brief author biographies. For each contribution, the type of submission (Original Research Article or Commentary) should be indicated. If the proposal is the result of a workshop or conference, this should also be indicated.

- Short Bios of the Guest Editors including affiliations and previous work in the field of Big Data studies. Links to homepages, Google Scholar profiles or CVs are welcome, although we don’t require CV submissions.

- A proposed timing for submission to Manuscript Central.

Information on the types of submissions published by the Journal and other guidelines is available on the Journal’s website.

Timeline for Proposals
Please submit proposals by Monday, January 14, 2019 to the Managing Editor of the Journal, Prof. Matthew Zook. The Editorial Team of BD&S will review proposals and make a decision by mid- to late January 2019. For further information or to discuss potential themes, please contact Matthew Zook.

Sunday, 16 December 2018

Illustrating Big Data discourses in the healthcare field

Marthe Stevens, Rik Wehrens and Antoinette de Bont

Over the last few years, there has been a growing critical scholarly discourse that reflects on how Big Data shape our knowledge and our understanding. Primarily the fields of Science and Technology Studies and Critical Data Studies have been instrumental in elaborating the neglected and problematic dimensions of Big Data. However, it is unclear how and to what extent such insights become embedded in the healthcare field.

At the same time, we notice that the healthcare field welcomes initiatives that aim to improve healthcare through Big Data. This development is interesting, as the healthcare field is characterized by a strongly institutionalized set of epistemological principles and generally accepted methodologies. The field favors, for example, high-quality evidence from randomized controlled trials and observational studies to guide treatment decisions. Big Data challenge these principles and methodologies as they promise faster and more representative knowledge on the basis of large-scale data analyses.

In our recent article in Big Data & Society, “Conceptualizations of Big Data and their epistemological claims: a discourse analysis”, we studied the various ways in which Big Data is conceptualized in the healthcare field and assessed the consequences of these different conceptualizations. We constructed five ideal-typical discourses that each frame Big Data in specific ways and use different metaphors to describe it. Three of the discourses (the modernist, instrumentalist and pragmatist) frame Big Data in positive terms and disseminate a compelling rhetoric. Metaphors of capturing, illuminating and harnessing data presume that Big Data are benign and lead to valid knowledge. The scientist and critical-interpretive discourses question the objectivity and effectivity claims of Big Data. Their metaphors of selecting and constructing data convey a different political message, framing Big Data as limited.

The modernist discourse: capturing data
Illustration by: Sue Doeksen

During our analysis, it became apparent that the critical-interpretive discourse in particular has not broadly infiltrated the healthcare domain, despite the attention given to the problematic assumptions and epistemological difficulties of Big Data in fields such as Science and Technology Studies and Critical Data Studies. We argue that the healthcare field would benefit from a more prominent critical-interpretive discourse, as the other discourses do not address important reflections on the normativity and situatedness of Big Data, or the social and political processes that create Big Data.

For the article, we worked together with an illustrator to visualize the discourses, as we believed that illustrations could help to deepen our and the reader’s understanding of them. We contacted Sue Doeksen, and she was very willing to help us and think along with us. What followed was an exciting process in which Sue and we inspired each other. She wanted a clear message to present in a simple illustration. We had to make sure that the essence of the discourses was captured in the images.

This paper is part of a broader research project that focuses on the expectations and imaginaries associated with Big Data in healthcare. In the project, we conceptualize Big Data as a collection of practices, and we aim to study what sorts of meaning it receives and is given, and how it changes practices. During the study, we focus specifically on the epistemological claims of Big Data.

About the authors:

Marthe Stevens is a PhD candidate at the department of Healthcare Governance at the Erasmus School of Health Policy and Management (Erasmus University Rotterdam, the Netherlands) and WTMC. She studies the use of Big Data and Artificial Intelligence in hospital settings in the Netherlands and in Europe. Her work focuses on the expectations and imaginaries associated with these new (data-driven) technologies.

Rik Wehrens is an assistant professor at the department of Healthcare Governance at the Erasmus School of Health Policy & Management. His (STS) research work focuses on issues of knowledge translation and ‘epistemological politics’, such as the coordination work between public health researchers and practitioners in negotiating the meaning of ‘practice-based health research’, and ‘valuation work’ in healthcare improvement programs. His current work explores the roles and expectations of Big Data in healthcare through ethnographic and discursive research ‘lenses’. As part of the EU-funded project Big Medilytics, he is involved in an international comparison of formal and informal rules for Big Data in various European countries.

Antoinette de Bont is an endowed professor at the Erasmus School of Health Policy and Management. Her research agenda addresses national and international policy priorities, like the diversification of the healthcare workforce or the use of Big Data to increase efficiency in healthcare. The research question that defines her agenda is: how do interdependencies between people and technology explain innovation in healthcare?

Thursday, 6 December 2018

Holiday break

The Big Data and Society Editorial Team will be on winter break from December 21st until January 7th. Please expect delays in the processing and reviewing of your submission during that time. Many thanks for your understanding.