Wednesday 29 November 2023

Bookcast: Luke Munn interviewed by Andrew Dougall

In this bookcast, Andrew Dougall interviews Luke Munn, Research Fellow in Digital Cultures & Societies at the University of Queensland, about his recent book 'Technical Territories: Data, Subjects, and Spaces in Infrastructural Asia' (2023).

Tuesday 31 October 2023

by Alexander Campolo and Katia Schwerzmann  

Campolo, A., & Schwerzmann, K. (2023). From rules to examples: Machine learning’s type of authority. Big Data & Society, 10(2).

The ethics of examples in machine learning 

Our article investigates a perceived transition from a rules-based programming paradigm in computing to one in which machine learning systems are said to learn from examples. Instead of specifying computational rules in a formal programming language, machine learning systems identify statistical structure in a dataset in order to accomplish tasks. There are many studies that show how data is constructed in various ways. We make a more specific argument: that in machine learning, data must be made exemplary—aggregated, formatted, and processed so that norms can emerge—to enable desired predictive or classificatory objectives.  

We are most interested in the ethical and even political ramifications of this transition. How does being governed by examples, by machine learning's specific type of predictions and classifications, differ from the rule of computational rules? How, concretely, is authority exercised by machine learning techniques? If you would like answers to these questions, please read our article! 

A larger issue is why we speak in terms of rules and examples in the first place. We are aware that these themes may strike readers as unfamiliar in light of existing critical research on algorithms and artificial intelligence. Many studies have, with good reason, focused on the discriminatory or unequal effects of machine learning systems. Other, more conceptual work posits some external standard (consciousness, language, intelligence, neoliberalism, etc.) and evaluates whether or not machine learning systems measure up to it, or whether they are mere "stochastic parrots," for example. We are sympathetic to both of these avenues, and they will continue to bear fruit.

Our article, however, begins from a different position: not from "outside" of machine learning, but from within it. We were first struck by repeated references to "learning from examples" made by machine learning researchers themselves. This community even has an informal historical understanding in which an overemphasis on highly-specified formal rules led to failure in previous forms of AI, notably expert systems. Our first task, then, was to discern how examples work in machine learning. We argue that data is made exemplary, that is, capable of eliciting norms, through a set of technical practices that characterize machine learning, including labeling, feature engineering, and scaling. 

Merely characterizing how these terms are used within the machine learning community risks reproducing their views. After beginning our study up close or from within, we then examine these practices "from afar" to identify their epistemological, ethical, and political implications. We theorize examples through historical-conceptual comparison, in contrast to rules, and closely analyze several case studies under the headings of labeling, feature engineering, and scaling. Our article draws from both classics, such as the work of Max Weber, and contemporaries, such as Lorraine Daston's very recent book Rules: A Short History of What We Live By.

This comparative approach situates machine learning within a constellation of concepts from social theory such as rationalization, calculation, and prediction. It connects machine learning to longer-running historical forces while also making its specific type of authority intelligible: how, precisely, do we use it to govern ourselves and others. Comparing rules and examples brought a number of other philosophical oppositions to light—specification and emergence, prompts and commands, the implicit and the explicit, the general and the particular, is and ought... These indicate further lines of research both for ourselves and hopefully our readers.

Thursday 14 September 2023

2023 Colloquium: Data Practices and Digital Social Worlds

Practices to collect, process, and communicate data have reconfigured sociological approaches to knowing and researching social life. Social researchers have analysed these reconfigurations as data-making practices, from the ethics of interacting with AI chatbots to the seductions of scraping social media data. The COVID-19 pandemic accelerated accessing ‘the field’ from afar, with digital social research carried out remotely. How are these data practices continuing to make and transform digital social worlds?

Organised by the journal Big Data & Society together with the Department of Sociology and the Planetary Praxis research group at the University of Cambridge, this colloquium brings together scholars from across disciplines to reflect on and speculate about digitally mediated data collection practices. The four-part colloquium will host dialogues about which data practices contribute to understanding digital social worlds. Participants will discuss their choice of methods, what they elicited and/or obfuscated, the unexpected challenges and unpredictable opportunities that surfaced in the process and what they would have done differently.


All sessions are scheduled for 16:00 to 18:00 (BST/GMT) / 11:00 to 13:00 (NY, EST). The sessions will not be recorded.

Session 1. Data Infrastructures & Labour

October 19th, 2023

Chairs: El No, Natalia Orrego

In this session, we focus on important yet often less visible types of data work involved in the production of technology. We explore various data-making practices, especially performed through/for platforms and infrastructures, and the politics in organising data work across multiple roles, from microworkers to machine-learning researchers.


  • Arturo Arriagada Ilabaca (Universidad Adolfo Ibáñez, Chile)
  • Dawn Nafus (Intel, US) 
  • Paola Tubaro (Centre National de la Recherche Scientifique, France)
  • Jing Zeng (Utrecht University, Netherlands, BD&S CE) 

Session 2. Data & Social Justice

October 26th, 2023

Chairs: Saide Mobayed & Anastassija Kostan

This panel will explore how data practices can either perpetuate or challenge systemic inequalities and how responsible data stewardship can be a powerful tool for promoting social justice. Topics include data feminism, data for algorithm accountability, indigenous data practices, and climate data justice.


  • Alejandro Mayoral-Baños (Indigenous Friends, Canada) 
  • Catherine D’Ignazio (Data + Feminism Lab, MIT, US)
  • Jocelyn Longdon (University of Cambridge, UK)
  • Dan Calacci (The Workers' Algorithm Observatory, Princeton, US)

Session 3. Data Infrastructures & Cities

November 9th, 2023

Chairs: Michael McCanless, Jun Zhang

This panel will focus on how data makes cities legible. With particular attention to the various technologies and data flows that attempt to render urban life calculable, panellists work on mobility and property processes. 


  • Rachel Weber (University of Illinois, Chicago, US)
  • Julien Migozzi (University of Oxford, UK)
  • Erin McElroy (University of Washington, US)
  • Martin Tironi (Pontifical Catholic University of Chile)

Session 4. Data Citizenships & Governmentality 

November 21st, 2023

Chairs: Jun Zhang & Saide Mobayed

In this panel, we will discuss and critically reflect on the epistemologies of citizenship, digital relations, power dynamics, and governmentality in today's data-driven society. We will explore topics concerned with the politics of (big) data and the co-creation of social value enhanced (or not) by digital technologies. 

  • Evelyn Ruppert (Goldsmiths University, UK, BD&S Co-founder)
  • Ana Valdivia (University of Oxford, UK, BD&S CE)
  • Yu-Shan Tseng (Helsinki Institute of Urban and Regional Studies, Finland)
  • Dan Bouk (Colgate University, US)

Monday 5 June 2023

BD&S Journal will be on break from August 1 to September 4, 2023

The editorial team of the journal Big Data & Society will be on break from August 1st to September 4th, 2023.

Please excuse any delays in processing and reviewing your submission, and in related correspondence, during that time. Thank you!

Monday 15 May 2023

2023 Call for Special Theme Proposals for Big Data & Society

The SAGE open access journal Big Data & Society (BD&S) is soliciting proposals for a Special Theme to be published in 2024/25. BD&S is a peer-reviewed, interdisciplinary, scholarly journal that publishes social science research about the emerging field of Big Data practices and how they are reconfiguring relations, expertise, methods, concepts and knowledge across academic, social, cultural, political, and economic realms. BD&S moves beyond usual notions of Big Data to engage with an emerging field of practices that is not defined by but generative of (sometimes) novel data qualities such as extensiveness, granularity, automation, and complex analytics including data linking and mining. The journal attends to digital content generated through online and offline practices, including social media, search engines, Internet of Things devices, and digital infrastructures across closed and open networks, from commercial and government transactions to digital archives, open government and crowd-sourced data. Rather than settling on a definition of Big Data, the Journal makes this an area of interdisciplinary inquiry and debate explored through multiple disciplines and themes.

Special Themes can consist of a combination of Original Research Articles (6 maximum, 10,000 words each), Commentaries (4 maximum, 3,000 words each) and one Editorial Introduction (3,000 words). All Special Theme content will have the Article Processing Charges waived. All submissions will go through the Journal’s standard peer review process.


Past special themes for the journal have included: Knowledge Production; Algorithms in Culture; Data Associations in Global Law and Policy; The Cloud, the Crowd, and the City; Veillance and Transparency; Practicing, Materializing and Contesting Environmental Data; Spatial Big Data; Critical Data Studies; Social Media & Society; Assumptions of Sociality; Data & Agency; Health Data Ecosystems; Algorithmic Normativities; Big Data and Surveillance; The Turn to AI in Governing Communication Online; The Personalization of Insurance; Heritage in a World of Big Data; Studying the COVID-19 Infodemic at Scale; Digital Phenotyping; Machine Anthropology; Data, Power, and Racial Formation; Social Data Governance; The State of Google Critique and Intervention; and Mapping the Micropolitics of Online Oppositional Subcultures.

See to access these special themes.


While open to submissions on any theme related to Big Data, we particularly welcome proposals related to Big Data from the Global South / Global Majority; Indigenous data and data sovereignty; queer and trans data; and Big Data and racialization.

Format of Special Theme Proposals

Researchers interested in proposing a Special Theme should submit an outline with the following information.


  • An overview of the proposed theme, including how it relates to existing research and the aims and scope of the Journal, and the ways it seeks to expand critical scholarly research on Big Data.

  • A list of titles, abstracts, authors and brief biographies. For each, the type of submission (ORA, Commentary) should also be indicated. If the proposal is the result of a workshop or conference that should also be indicated.

  • Short Bios of the Guest Editors including affiliations and previous work in the field of Big Data studies. Links to homepages, Google Scholar profiles or CVs are welcome, although we don’t require CV submissions.

  • A proposed timing for submission to Manuscript Central. This should be in line with the timeline outlined below.


Information on the types of submissions published by the Journal and other guidelines is available at  .


Timeline for Proposals

Please submit proposals by August 15, 2023 to the Editor-in-Chief of the Journal, Prof. Matthew Zook. The Editorial Team of BD&S will review proposals and make a decision by October 2023. Manuscripts would be submitted to the journal (via Manuscript Central) by or before February 2024. For further information or to discuss potential themes, please contact Matthew Zook.


Monday 1 May 2023

Reflections on BD&S during the transition of Editors-In-Chief

In January 2023 the journal Big Data & Society transitioned its Editor-in-Chief role from Evelyn Ruppert (whose role is now Editor-in-Chief Emeritus and Founding Editor) to the former Managing Editor, Matthew Zook. Jennifer Gabrys has shifted from a co-editor to take on the job of Managing Editor as three new co-editors -- Rocco Bellanova, Ana Valdivia and Jing Zeng -- have joined the journal. Details on the full editorial team can be found here.

As part of this transition, both Evelyn Ruppert and Matthew Zook have written short reflections on the first nine years of the journal and thoughts about where it is going next.


Evelyn Ruppert: Looking Back on the First Nine Years of Big Data & Society

Since its launch in 2014, Big Data & Society (BD&S) has become a leading journal for interdisciplinary social science research on big data practices. It has been a privilege and honour to have founded and led the journal through its first ten years. As I step down from the Editor in Chief role, I take this opportunity to reflect on its beginnings and changes over the past decade, as well as consider future developments as the journal enters its second decade.

I started to develop a proposal for an interdisciplinary journal on big data in 2012. It was a daunting task as so little had been published about this emerging object in the social sciences. More attention was paid to developments in related phenomena such as the internet, computing and software, digital media and communications, and digital research methods. However, a few authors in the social sciences initiated critical analyses of big data, sometimes referred to as just a buzzword or the latest bandwagon. Much more was published in the humanities, computing and technology, and business. In this context, identifying potential editors, board members, authors, or reviewers was very difficult, especially for a launch issue. 

Perhaps more daunting was to specify the very object of the journal itself. ‘Big Data’ was vaguely defined and often criticised. It presented a potentially risky and controversial title for a journal. Rather than settling on a definition, we started with the following lead statement: ‘The Journal's key purpose is to provide a space for connecting debates about the emerging field of Big Data practices and how they are reconfiguring academic, social, industry, business and government relations, expertise, methods, concepts and knowledge.’ That is, we let Big Data be an object of debate (and capitalised the term to signal this), recognising it was and is shaped by myriad practices. What is ‘big’ about Big Data, according to BD&S, are the changing practices of data production, computation, analysis, circulation, implementation, proliferation, and involvement, and the consequences of these practices for how societies are represented (epistemologies), realised (ontologies) and governed (politics). Whether algorithms, AI, bots, or digital infrastructures, such practices engage with a variety of data and--contrary to claims of artificial intelligence--all practices are entangled with human agents, knowledge, power and influence. 

It is also worth noting that the journal was launched during a moment of major transformations in journal publishing, which involved a move to digital-only formats, open access and financing through Article Processing Charges (APCs). BD&S was founded on all three changes in publishing, each of which presented challenges and opportunities. Today, none of this is novel. Ten years ago, however, each change constituted an important shift in the field of academic publishing, with APCs especially introducing significant redistributive effects in the dissemination of knowledge. Rather than the subscription model, APCs are now the predominant business model in academic publishing, where access to funding has become critical to publishing. While BD&S has been able to provide some APC waivers, the distributive consequences of this funding model require more critical analysis and possible intervention to ensure equity across career stages, location and discipline.

Finally, I want to express my gratitude to all the people over the past ten years who joined the editorial team, including all the co-editors, editorial assistants, assistant editors and editorial board members, who are too many to mention. I am also grateful to the authors and innumerable reviewers, who ventured into relatively new territory and helped shape what the journal has become. A last word of thanks is to SAGE, for their confidence in my leadership and especially to Robert Rojek for his guidance and support over the years.

I leave the journal in good hands and I am impressed by the breadth and depth of the current Editorial Team. Passing the leadership of the journal on to Matt Zook (Editor-in-Chief) and Jennifer Gabrys (Managing Editor) fulfils an important principle of mine: periodically refreshing and changing roles is essential to enable the Journal to be shaped by different people and ideas. One thing is certain: Big Data practices are changing, advancing and, in some cases, becoming more pernicious. Critical interdisciplinary work is not only essential but also--as the contents of the journal demonstrate--proliferating as researchers address, challenge and transform the relations between Big Data and societies.



Matthew Zook: Thoughts on the Success of BD&S and What Happens Next

I still remember my excitement when Evelyn Ruppert first contacted me about joining Big Data & Society (BD&S) as a co-editor in 2013. It was an energizing prospect made even more so by the chance to be part of a group of interdisciplinary social scientists grappling with the many forms and meanings of Big Data practices. Evelyn assembled a team of scholars I wanted to read and talk with, and an invitation to be part of the editorial team was, in my mind, a front-row seat to the most exciting show in town.

Ten years later, I feel exactly the same. I am regularly astounded by the breadth, quality, and creativity of the articles we publish. They represent world-class scholarship and are agenda-setting in every sense of the word. A key part of this success has been Evelyn Ruppert's development of the initial proposal and selection of the first round of co-editors that shaped the vision and voice of the journal. It is no overstatement that without her, BD&S simply would not exist, and for that, I am forever grateful.

I am also deeply appreciative of the editorial group whose careful and hard work has been instrumental in making BD&S a success. The co-editors - both current and emeriti - have brought a wealth of disciplinary expertise (sociology, politics and law, science and technology studies, geography, journalism, computational social science, data science, planning and policy, media and communications) that they have successfully drawn upon to oversee the review and publication of a broad set of work. Our editorial assistants have done tremendous work behind the scenes to ensure BD&S stays on track, and our assistant editors promote all articles as they appear and oversee new initiatives such as the BD&S Colloquium series begun in 2022. Our editorial board and reviewers have provided vital input on papers that we rely upon to make our editorial decisions, and finally none of this would be possible without authors submitting their work. A heartfelt thank you to everyone. Without all of your hard work, the journal would not be what it is today.

Shifting from reflections on BD&S accomplishments to future plans, I see three principal tasks/challenges set before us. The first concerns expanding topics of inquiry, and we look forward to the exciting new topics, approaches, and theories that our authors bring in their papers. The heart of the journal remains focused on how Big Data interacts with social practices. However, this has continued to evolve in terms of the different deployments of Big Data (e.g., infrastructure, platforms, blockchain), applications (e.g., generative AI, health, identity, nature), and forms of governance (e.g., justice, securitized, privacy) to name but a few of the exciting topics currently under review. Second, there is the ongoing challenge of inclusivity of people and places in the papers we receive and publish. We seek to expand the journal's engagement across geography, practices, and theory, such as (but certainly not limited to) Big Data from the Global South, algorithmic justice, queer/trans data, and indigenous data regimes. Third is the evolving meaning of Open Access publishing. Being an open access journal has been in the DNA of BD&S since it started, and I want to thank Robert Rojek and Sage for their willingness and ongoing support in making this happen. It has ensured that the good work that our authors write, and we review, reaches as large a readership as possible. In particular, Sage's ongoing commitment to Research4Life and providing APC waivers has been essential to our ability to run exciting special themes and ensure that no article that we have accepted editorially has been lost due to APCs.

As I look back over my nine years with BD&S, five years as a co-editor and the last five as Managing Editor, I am amazed at the scope and scale of what the journal has done. Close to 1,200 authors have published 600+ articles both as stand-alone pieces and as part of 25+ special themes curated by guest editors. It is tremendously heartening to be part of such an intellectual community. As I step into the role of Editor-in-Chief, I look to this community to continue to support and define the journal's work as we move forward. I am especially pleased to be working with Jennifer Gabrys (in her new role as Managing Editor); our co-editors Rocco Bellanova, Dhiraj Murthy, Sung-Yueh Perng, Sachil Singh, Ana Valdivia, and Jing Zeng; and the journal's Editorial Assistant, Natalia Orrego.

I am grateful for my time with the journal and am looking forward to the years ahead. It is still a front-row seat to the most exciting show in town.


Wednesday 11 January 2023

Ground Truth Tracings (GTT): On the Epistemic Limits of Machine Learning

by Edward B. Kang (@edwardbkang)

Kang, E. B. (2023). Ground truth tracings (GTT): On the epistemic limits of machine learning. Big Data & Society, 10(1).  

This article is a direct response to the increasing division I have been seeing between what might be called the “technical” and “sociotechnical” communities in artificial intelligence/machine learning (AI/ML). It started as a foray into the industry of machine “listening” with the purpose of examining to what extent practitioners engage with the complexity of voice in developing techniques for listening to and evaluating it. Through my interviews, however, I found that voice, along with many other qualitatively complex phenomena like “employee fit,” “emotion,” and “personality,” gets flattened in the context of machine learning. The piece thus starts with a specific scholarly interest in the interface of voice and machine learning, but ends with a broader commentary on the limitations of machine learning epistemologies as seen through machine listening systems.

Specifically, I develop an intentionally non-mathematical methodological schema called “Ground Truth Tracings” (GTT) to make explicit the ontological translations that reconfigure a qualitative phenomenon like voice into a usable quantitative reference, AKA “ground-truthing.” Given that every machine learning system requires a referential database that serves as its ground truth – i.e., what is assumed to be true by the system – examining these assumptions is key to exploring the strengths, weaknesses, and beliefs embedded in AI/ML technologies. In one example, I bring attention to a voice analysis “employee-fit” prediction system that analyzes a potential candidate’s voice to predict whether the individual will be a good fit for a particular team. By using GTT, I qualitatively show why this system is not feasible as an ML use case and unlikely to be as robust as it is marketed to be.

Finally, I acknowledge that although this framework may serve as a useful tool for investigating claims around ML applicability, it does not immediately engage questions of subjectivity, stakes, and power. I thus further splinter this schema through these axes to develop a perhaps imperfect, but practical heuristic called the “Learnability-Stakes” table to assess and think about the epistemological and ethical soundness of machine learning systems, writ large. I’m hoping this piece will contribute to the fostering of interdisciplinary dialogue among the wide range of practitioners in the AI/ML community that includes not just computer scientists and ML engineers, but also social scientists, activists, journalists, policy makers, humanities scholars, and artists, broadly construed.