Wednesday 14 February 2024

Guest Blog: A Social Privacy Approach to Investigative Genetic Genealogy

 by Nina de Groot

de Groot, N. F. (2023). Commercial genetic information and criminal investigations: The case for social privacy. Big Data & Society, 10(2).

In 2022, four college students from the University of Idaho were stabbed to death. Law enforcement found DNA at the crime scene that probably originated from the murder suspect, but the DNA profile did not match with the FBI database of criminal offenders. The FBI then used, according to the prosecution, another method: searching in a public genetic genealogy database for relatives of the unknown suspect. Weeks later, they were, most likely based on this lead, able to arrest a suspect. By building extensive family trees of the unknown suspect and the DNA test consumer who is genetically related to the suspect, one can eventually zero in on the suspect by mapping genetic relationships and examining birth and death certificates or social media profiles. This so-called ‘investigative genetic genealogy’ is increasingly used in both the US and Europe.

Investigative genetic genealogy may have significant potential when it comes to crime-solving. Yet the debate on investigative genetic genealogy tends to be reduced to a dichotomy between protecting privacy on the one hand and serving society by solving horrendous crimes on the other. As the CEO of a commercial genetic genealogy database has said: “You have a right to privacy. You also have the right not to be murdered or raped”. However, reducing it to this dichotomy is problematic.

In my paper ‘Commercial genetic information and criminal investigations: The case for social privacy’, I propose to consider investigative genetic genealogy through the lens of social privacy. Social privacy is a helpful way to look beyond this simplified dichotomous choice. For example, it is not only about the privacy of the individual DNA-test consumer, but about the privacy of potentially thousands of genetic relatives. If one individual DNA test consumer gives consent to law enforcement use of their data, this person does so for many close and distant relatives.

Perhaps more importantly, a social privacy approach allows for the consideration of the broader social-political concerns raised by investigative genetic genealogy. For example, it allows us to explore the complex issues that arise when commercial parties enter the scene of criminal investigations, including dependence on commercial actors. Commercial companies currently have a major say in deciding for which crimes to allow (and not allow) law enforcement use of their database. Additionally, public forensic laboratories often do not have the technology to generate the type of DNA profile needed, which makes law enforcement dependent on private actors in this process. Furthermore, investigative genetic genealogy could have problematic consequences for the relationship between citizens and state as well as for the public nature of criminal investigations.

A social privacy approach can also help shed light on the implications for criminal investigations and the democratic process. Namely, if only a few percent of a population gives consent for law enforcement use of their DNA data, almost every individual within that population can be identified through distant relatives. Therefore, the decisions of only a fraction of the population can have far-reaching implications, reflecting a potentially harmful ‘tyranny of the minority’, which is a concept from digital data ethics. In that respect, insights from the social privacy debate about the interconnectedness of online digital data can be useful for exploring these issues in the context of investigative genetic genealogy.

Friday 5 January 2024

by Zelly Martin, Martin J. Riedl, Samuel C. Woolley

Martin, Z. C., Riedl, M. J., & Woolley, S. C. (2023). How pro- and anti-abortion activists use encrypted messaging apps in post-Roe America. Big Data & Society, 10(2).

How pro- and anti-abortion activists use encrypted messaging apps in post-Roe America

On June 24th, 2022, the United States Supreme Court ended nearly 50 years of federal protection of abortion. Immediately, we noticed that experts were recommending encrypted messaging apps as the solution to abortion-related data leakage for abortion-seekers, activists, and healthcare providers. Not two months later, a teenager and her mother in Nebraska were arrested for the teenager's abortion on the basis of data obtained from their Facebook Messenger conversations. It seemed, then, that encrypted messaging apps might provide security that unencrypted spaces (as Facebook Messenger, at the time, was) could not. 

We thus set out to explore the utility of encrypted messaging apps as privacy-promoting spaces in a post-Roe America. We interviewed pro-abortion, anti-abortion, and encryption activists in U.S. states with varying levels of abortion restriction or protection. We found that while our pro-abortion interviewees often considered encryption as a powerful tool for security, our anti-abortion interviewees largely rejected it on principle alone, believing it to be characteristic of inauthenticity or criminality, or simply found it untrustworthy. Yet activists on both sides of the abortion issue used encrypted messaging apps for reasons other than security, including convenience and coordination.

Ultimately, we argue that although end-to-end encryption is a powerful security tool, it must be used in combination with other security practices to effectively resist patriarchal (and ubiquitous) surveillance by corporations and law enforcement in a post-Roe America.  

Monday 11 December 2023

Guest Blog: Christopher Till on Spotify's datafication of user health

by Christopher Till

Till, C. (2023). Spotify as a technology for integrating health, exercise and wellness practices into financialised capitalism. Big Data & Society, 10(2).

Over the last few years I have noticed how Spotify, and similar music streaming services, have produced content and tailored services with the intention of helping us to improve our health and to exercise more. While looking into their innovations in this area (e.g. matching music tempo to running cadence, automatically generating playlists for particular activities) I began to wonder what business interests were driving this (aside from simply retaining subscribers). To explore this I analysed Spotify’s patent applications (to see what innovations they have in the works and how they plan to datafy and analyse user interactions), financial statements, industry interviews and press releases, amongst other materials. I found that the failure of Spotify’s subscription service to turn a profit (as yet) or provide sufficient growth to please shareholders has led them to engage in increasing attempts to financialize the practices of their users. Health, exercise and wellness practices are presented as a particularly fruitful area of exploitation, which Spotify can datafy in such a way as to make them amenable to targeted advertising and thereby tell a story to investors about the growth and future profitability of the company. So, users’ everyday lives are increasingly seen as commercially useful sites to be mined for insights for targeted advertising, and health, exercise and wellness practices are integrated by Spotify into the financialized networks of digital capitalism.

Wednesday 29 November 2023

Bookcast: Luke Munn interviewed by Andrew Dougall

In this bookcast, Andrew Dougall interviews Luke Munn, Research Fellow in Digital Cultures & Societies at the University of Queensland, about his recent book 'Technical Territories: Data, Subjects, and Spaces in Infrastructural Asia' (2023).

Tuesday 31 October 2023

by Alexander Campolo and Katia Schwerzmann  

Campolo, A., & Schwerzmann, K. (2023). From rules to examples: Machine learning’s type of authority. Big Data & Society, 10(2).

The ethics of examples in machine learning 

Our article investigates a perceived transition from a rules-based programming paradigm in computing to one in which machine learning systems are said to learn from examples. Instead of specifying computational rules in a formal programming language, machine learning systems identify statistical structure in a dataset in order to accomplish tasks. There are many studies that show how data is constructed in various ways. We make a more specific argument: that in machine learning, data must be made exemplary—aggregated, formatted, and processed so that norms can emerge—to enable desired predictive or classificatory objectives.  

We are most interested in the ethical and even political ramifications of this transition. How does being governed by examples, by machine learning's specific type of predictions and classifications, differ from the rule of computational rules? How, concretely, is authority exercised by machine learning techniques? If you would like answers to these questions, please read our article! 

A larger issue is why speak in terms of rules and examples in the first place. We are aware that these themes may strike readers as unfamiliar in light of existing critical research on algorithms and artificial intelligence. Many studies have, with good reason, focused on the discriminatory or unequal effects of machine learning systems. Other, more conceptual work posits some external standard (consciousness, language, intelligence, neoliberalism etc.) and evaluates whether or not machine learning systems measure up to it, or whether they are mere "stochastic parrots," for example. We are sympathetic to both of these avenues, and they will continue to bear fruit. 

Our article, however, begins from a different position: not from "outside" of machine learning, but from within it. We were first struck by repeated references to "learning from examples" made by machine learning researchers themselves. This community even has an informal historical understanding in which an overemphasis on highly-specified formal rules led to failure in previous forms of AI, notably expert systems. Our first task, then, was to discern how examples work in machine learning. We argue that data is made exemplary, that is, capable of eliciting norms, through a set of technical practices that characterize machine learning, including labeling, feature engineering, and scaling. 

Merely characterizing how these terms are used within the machine learning community risks reproducing their views. After beginning our study up close or from within, we then examine these practices "from afar" to identify their epistemological, ethical, and political implications. We theorize examples through historical-conceptual comparison, in contrast to rules, and closely analyze several case studies under the headings of labeling, feature engineering, and scaling. Our article draws from both classics, such as the work of Max Weber, and contemporaries, such as Lorraine Daston's very recent book Rules: A Short History of What We Live By.

This comparative approach situates machine learning within a constellation of concepts from social theory such as rationalization, calculation, and prediction. It connects machine learning to longer-running historical forces while also making its specific type of authority intelligible: how, precisely, do we use it to govern ourselves and others? Comparing rules and examples brought a number of other philosophical oppositions to light: specification and emergence, prompts and commands, the implicit and the explicit, the general and the particular, is and ought... These indicate further lines of research both for ourselves and, hopefully, for our readers.

Thursday 14 September 2023

2023 Colloquium: Data Practices and Digital Social Worlds

Practices to collect, process, and communicate data have reconfigured sociological approaches to knowing and researching social life. Social researchers have analysed these reconfigurations as data-making practices, from the ethics of interacting with AI chatbots to the seductions of scraping social media data. The COVID-19 pandemic accelerated accessing ‘the field’ from afar, with digital social research carried out remotely. How are these data practices continuing to make and transform digital social worlds?

Organised by the journal Big Data & Society together with the Department of Sociology and the Planetary Praxis research group at the University of Cambridge, this colloquium brings together scholars from across disciplines to reflect on and speculate about digitally mediated data collection practices. The four-part colloquium will host dialogues about which data practices contribute to understanding digital social worlds. Participants will discuss their choice of methods, what they elicited and/or obfuscated, the unexpected challenges and unpredictable opportunities that surfaced in the process and what they would have done differently.


All sessions are scheduled for 16:00 to 18:00 (BST/GMT) / 11:00 to 13:00 (NY, EST). The sessions will not be recorded.

Session 1. Data Infrastructures & Labour

October 19th, 2023

Chairs: El No, Natalia Orrego

In this session, we focus on important yet often less visible types of data work involved in the production of technology. We explore various data-making practices, especially performed through/for platforms and infrastructures, and the politics in organising data work across multiple roles, from microworkers to machine-learning researchers.


  • Arturo Arriagada Ilabaca (Universidad Adolfo Ibáñez, Chile)
  • Dawn Nafus (Intel, US) 
  • Paola Tubaro (Centre National de la Recherche Scientifique, France)
  • Jing Zeng (Utrecht University, Netherlands, BD&S CE) 

Session 2. Data & Social Justice

October 26th, 2023

Chairs: Saide Mobayed & Anastassija Kostan

This panel will explore how data practices can either perpetuate or challenge systemic inequalities and how responsible data stewardship can be a powerful tool for promoting social justice. Topics include data feminism, data for algorithm accountability, indigenous data practices, and climate data justice.


  • Alejandro Mayoral-Baños (Indigenous Friends, Canada) 
  • Catherine D’Ignazio (Data + Feminism Lab, MIT, US)
  • Jocelyn Longdon (University of Cambridge, UK)
  • Dan Calacci (The Workers' Algorithm Observatory, Princeton, US)

Session 3. Data Infrastructures & Cities

November 9th, 2023

Chairs: Michael McCanless, Jun Zhang

This panel will focus on how data makes cities legible. With particular attention to the various technologies and data flows that attempt to render urban life calculable, panellists work on mobility and property processes. 


  • Rachel Weber (University of Illinois, Chicago, US)
  • Julien Migozzi (University of Oxford, UK)
  • Erin McElroy (University of Washington, US)
  • Martin Tironi (Pontifical Catholic University of Chile)

Session 4. Data Citizenships & Governmentality 

November 21st, 2023

Chairs: Jun Zhang & Saide Mobayed

In this panel, we will discuss and critically reflect on the epistemologies of citizenship, digital relations, power dynamics, and governmentality in today's data-driven society. We will explore topics concerned with the politics of (big) data and the co-creation of social value enhanced (or not) by digital technologies. 

  • Evelyn Ruppert (Goldsmiths University, UK, BD&S Co-founder)
  • Ana Valdivia (University of Oxford, UK, BD&S CE)
  • Yu-Shan Tseng (Helsinki Institute of Urban and Regional Studies, Finland)
  • Dan Bouk (Colgate University, US)

Monday 5 June 2023

BD&S Journal will be on break from August 1 to September 4, 2023

The editorial team of the journal Big Data & Society will be on break from August 1st to September 4th, 2023.

Please excuse any delays in processing and reviewing your submission, and in related correspondence, during that time. Thank you!