Monday 13 May 2024

Call for Special Theme Proposals for Big Data & Society (Due August 15, 2024)

The SAGE open access journal Big Data & Society (BD&S) is soliciting proposals for a Special Theme to be published in 2025/26. BD&S is a peer-reviewed, interdisciplinary scholarly journal that publishes social science research about the emerging field of Big Data practices and how they are reconfiguring relations, expertise, methods, concepts and knowledge across academic, social, cultural, political, and economic realms. BD&S moves beyond usual notions of Big Data to engage with an emerging field of practices that is not defined by but generative of (sometimes) novel data qualities such as extensiveness, granularity, automation, and complex analytics including data linking and mining. The journal attends to digital content generated through online and offline practices, including social media, search engines, Internet of Things devices, and digital infrastructures across closed and open networks, from commercial and government transactions to digital archives, open government and crowd-sourced data. Rather than settling on a definition of Big Data, the Journal makes this an area of interdisciplinary inquiry and debate explored through multiple disciplines and themes.

Special Themes can consist of a combination of Original Research Articles (6 maximum, up to 10,000 words each), Commentaries (4 maximum, 3,000 words each) and one Editorial Introduction (3,000 words). All Special Theme content will have the Article Processing Charges waived. All submissions will go through the Journal’s standard peer review process.

While open to submissions on any theme related to Big Data, we particularly welcome proposals related to Big Data and the Global South / Global Majority; Indigenous data and data sovereignty; queer and trans data; and Big Data and racialization. You can find the full list of special themes published by BD&S at 

Format of Special Theme Proposals

Researchers interested in proposing a Special Theme should submit an outline with the following information.


  • An overview of the proposed theme, including how it relates to existing research and the aims and scope of the Journal, and the ways it seeks to expand critical scholarly research on Big Data.

  • A list of titles, abstracts, authors and brief biographies. For each, the type of submission (ORA, Commentary) should also be indicated. If the proposal is the result of a workshop or conference, that should also be indicated.

  • Short bios of the Guest Editors including affiliations and previous work in the field of Big Data studies. Links to homepages, Google Scholar profiles or CVs are welcome, although we don’t require CV submissions.

  • A proposed timing for submission to Manuscript Central. This should be in line with the timeline outlined below.


Information on the types of submissions published by the Journal and other guidelines is available at  .


Timeline for Proposals

Submit proposals by August 15, 2024 via this online form  

(Note: you must have a Google account to access this form.) Do not send proposals via email as they will not be reviewed. The Editorial Team of BD&S will review proposals and make a decision by October 2024. Manuscripts would be submitted to the Journal (via Manuscript Central) by February 2025.

For further information or to discuss potential themes, please contact Dr. Matthew Zook at


Tuesday 9 April 2024

Guest Blog: Role-Based Privacy Cynicism and Local Privacy Activism: How Data Stewards Navigate Privacy in Higher Education

by Mihaela Popescu, Lemi Baruh, and Samuel Sudhakar 

When was the last time you truly felt that adjusting your privacy settings on your most visited platform enhanced your safety? In today's digital age, especially in the United States, many users have come to accept that sacrificing privacy is an unavoidable consequence of engaging with digital technologies. This realization often breeds cynicism or apathy towards privacy, leading individuals to abandon efforts to safeguard their personal information.


This phenomenon, known by various names like privacy cynicism, privacy apathy or surveillance realism, encapsulates feelings of mistrust, powerlessness, and resignation that consumers commonly experience. While existing research focuses on data subjects' attitudes, our study presents a unique perspective – that of data workers who straddle the roles of both data subjects and data handlers in higher education settings. We aimed to explore the prevalence of privacy cynicism among these data workers and its potential impact on university data governance.


Projections indicate that the global market for big data analytics in education will exceed $50 billion by 2030. Within this landscape, university data professionals – including campus registrars, learning platform administrators, and information security officers – play a crucial role in safeguarding university data assets, albeit not always prioritizing the privacy of campus stakeholders. Our research, based on in-depth interviews with data professionals at California State University, unveiled significant findings:


1. Receptiveness to Datafication: Despite concerns about datafication trends, data professionals in higher education view its implementation as beneficial.

2. Tactics to Navigate Challenges: When faced with data misuse concerns, these professionals employ short-term "privacy activism" tactics to delay problematic uses.

3. Structural Changes vs. Short-Term Solutions: While effective in the short term, these tactics offer temporary fixes without fostering lasting structural changes.


Our interviews revealed a sentiment among data professionals that parallels consumer privacy cynicism, particularly when organizational privacy definitions clashed with their personal beliefs. They grappled with powerlessness and disillusionment, exacerbated by the apathy shown by the very individuals they aim to protect.


A key insight from our study is that this perception may have far-reaching consequences. A perceived lack of efficacy, coupled with a belief that data subjects (namely, the students) don't care about privacy, may lead to a spiral of resignation, reducing data professionals' motivation to advocate for enhanced privacy. This, in turn, limits data subjects' access to meaningful privacy options, further fueling their privacy apathy and cynicism.

Saturday 6 April 2024

Guest Blog: Situating Data Relations in the Datafied Home

by Gaia Amadori and Giovanna Mascheroni 

Amadori, G., & Mascheroni, G. (2024). Situating data relations in the datafied home: A methodological approach. Big Data & Society, 11(1).

Data relations, namely relations and communicative practices that are mediated, sustained, and shaped by the digital technologies that extract data, now pervade the practices and imaginaries of parenting and childhood. This makes the challenge of empirically studying datafication particularly prominent in this context.

To address the epistemological and methodological challenges in the study of datafication from an everyday life perspective, we propose to focus on mediatized relations as a proxy for data relations. More specifically, drawing upon a non-media-centric figurational approach, we argue for the value of combining a mixed-method constructivist grounded theory methodology with network methods so as to materialise the relationships through, about and around data that emerge in contemporary family life. We do this by focusing on 3 households from a group of 20 with at least 1 child aged 8 years or younger in Italy, who participated in a qualitative longitudinal study on the datafication of childhood and family life.

The study aims to delineate an innovative methodological approach to highlighting the situatedness of data practices and imaginaries and developing new research tools to enhance the phenomenological richness of data practices in the diverse digital–material contexts of family life. In particular, we show how different family figurations translate into different patterns of mediatized relations and, consequently, of data relations, depending on cultural coordinates, such as parenting and mediation styles, as well as data and digital media imaginaries. Furthermore, we suggest how network methods represent a suitable tool for materialising the mediatized relations structure, providing a set of metrics and visualizations that can foster researchers’ and participants’ reflexivity.

In addition, we believe this approach can be extended beyond the home to understand how data relations reconfigure different communicative figurations.

Wednesday 20 March 2024

Guest Blog: Mapping the landscape of cloud AI: Microsoft, Google, Amazon, and the ‘industrialisation’ of artificial intelligence

By Fernando van der Vlist (@fvandervlist) and Anne Helmond (@silvertje)

Van der Vlist, F. N., Helmond, A., & Ferrari, F. L. (2024). Big AI: Cloud infrastructure dependence and the industrialisation of artificial intelligence. Big Data & Society, 11(1), 1–16.

Convergence of AI and Big Tech

The ongoing competition among tech giants in the ‘cloud AI wars’ is shaping a supposedly transformative era. Industry leaders like Bill Gates and Sundar Pichai underscore the foundational role of AI. However, this transformation is chiefly propelled by a select few: Microsoft, Google (Alphabet), and Amazon. These giants hold sway over the cloud computing landscape, wielding profound influence.

Characterising the platformisation and ‘industrialisation’ of AI

Van der Vlist, Helmond, and Ferrari’s comprehensive landscape study, titled ‘Big AI: Cloud infrastructure dependence and the industrialisation of artificial intelligence’, delves into the profound implications of the dominance wielded by these tech giants, introducing the term ‘industrialisation of AI’. This term captures the transition of AI systems from the realm of research and development to practical, ‘real-world’ applications across diverse industries. This transformation brings a new reliance on cloud infrastructure and substantial investments in computational resources, vital for the industrial-scale deployment of AI solutions. Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform emerge as the linchpin cloud platforms underpinning this ongoing industrialisation process.

The ramifications of their influence became glaringly evident during an AWS outage on June 13, 2023. The disruptions faced by clients like the Associated Press, McDonald’s, and Reddit underscored their extensive reliance on AWS. Market estimates emphasise the dominance of AWS, which serves as the backbone of the Internet, followed by Microsoft Azure and Google Cloud. The comprehensive suites of cloud products and services offered by these companies not only underscore their dominance but also significantly contribute to their revenues.

Moreover, discursively, the term ‘AI’ acts as a powerful magnet, attracting substantial investments and prompting startups to seek partnerships with major players. This includes (exclusive) cloud provider partnerships such as between Microsoft Azure and OpenAI (powering ChatGPT and DALL·E, amongst others). These tech giants actively position themselves as essential infrastructure providers, pouring billions into costly cloud computing. As AI enters its ‘industrial age’, understanding the intricacies of AI’s value chains becomes crucial for strategic, political, and economic reasons.

The dominance of major tech companies is intrinsically tied to their control over infrastructure. This dominance, fueled by access to vast troves of data, substantial computational resources, and a geopolitical edge, underscores their pivotal role in driving AI development and deployment. As succinctly put by Kak and Myers West, ‘There is no AI without Big Tech’.

A ‘technography’ of AI and Big Tech: Infrastructure, models, and applications

To capture this structural convergence between AI and Big Tech, Van der Vlist et al. conceptualise ‘Big AI’. This term characterises the intricate interdependence between AI and the infrastructure, resources, and investments of major tech conglomerates. This structural dependency is the cornerstone of the ongoing industrialisation of AI. Their empirical analysis further substantiates these critiques. While ‘Big AI’ isn’t the sole trajectory for the future of AI, the continuous provisioning of essential infrastructure services by Microsoft, Google (Alphabet), and Amazon positions them to reap the benefits of AI’s widespread expansion across industry sectors.

In their empirical exploration, characterised as a ‘technography of cloud AI’, they engage with the material aspects of cloud AI to examine its structural and operational features. They uncover various forms of support and investment and scrutinise the cloud platform offerings from Microsoft, Google, and Amazon. This comprehensive approach provides unique insights into the current state and evolution of ‘Big AI’, offering a profound understanding of AI as both a product and service category, and an integral component of existing cloud computing arrangements. Furthermore, their study sheds light on the developmental and deployment aspects of the purported ‘AI revolution’, heralded by ChatGPT’s launch in late 2022, highlighting the substantial role played by Microsoft, Google, and Amazon in convening enterprises, organisations, and developers, fostering the creation, capture, and commercialisation of AI.


Cloud AI stacks: Structural interconnections among cloud platform products and services offered by Microsoft Azure, Google Cloud Platform, and Amazon Web Services.

Ultimately, the study goes beyond characterising the current ‘platformisation’ of AI, where AI expands beyond consumer-facing applications like ChatGPT to become a platform service provided by Big Tech companies (i.e. an AI platform and infrastructure as a service). This encompasses extensive suites of tools, products, and services, from hardware AI infrastructure to machine learning and computer vision software, along with ‘platform boundary resources’ for developers and businesses to build upon. The study comprehensively analyses and substantiates this transformation with empirical evidence. It highlights how Big AI represents a dual form of power: first, by owning and offering essential infrastructure and support, and second, by controlling marketplaces for the distribution and deployment of AI models and applications across diverse sectors and industries. Additionally, the study leverages the empirical analysis to conceptualise AI’s cloud infrastructure dependence and the ongoing ‘industrialisation’ of AI, providing important guidance for policymakers and regulators in governing AI.

The full research article is openly available in Big Data & Society. The data that support the findings of this study are openly available in the Open Science Framework (OSF).



Monday 18 March 2024

Guest Blog: An Exploration of the Ways Agricultural Big Data is Assetized by Agriculture Technology Companies

By Sarah Marquis

Hackfort, S., Marquis, S., & Bronson, K. (2024). Harvesting value: Corporate strategies of data assetization in agriculture and their socio-ecological implications. Big Data & Society, 11(1).

In the past decade, much attention has been paid to the ways that Big Tech companies like Google and Facebook leverage personal data to create value, both economic and otherwise. Meanwhile, as digital technologies become more ubiquitous in agriculture, we thought it necessary to interrogate how agricultural big data is being assetized in parallel ways.

In this paper, we ask the following question: how is agricultural data transformed into value by the most powerful agribusinesses and ag-tech firms? 

To answer our research question, we read many financial records and annual reports and analyzed earnings calls to see how agricultural data was valued and discussed by multi-national agribusinesses like John Deere, Bayer, BASF and Farmers Edge. We came to several conclusions. The first is that any attempt to systematically examine what agribusinesses do with agricultural data is impaired by legal mechanisms that obfuscate data practices, datasets, and algorithms: copyright, intellectual property law, trade secrecy law, and arbitration agreements all allow for proprietary technologies and a high degree of vagueness and opacity. This is a finding in and of itself; such obfuscation prevents critical analysis and the kind of oversight that the equitable governance of technology requires. Our second, broader argument is that data itself is very likely an asset for agricultural firms, which now uniformly include big data-based services in their portfolios. We outline three strategies that firms use (or are likely to use) to generate value from agricultural data:

1. Agribusinesses use agricultural big data to secure relationships in which users are dependent upon them.

2. Agribusinesses gain from practices of price-setting and data sharing.

3. Agricultural big data is used to develop new products and target marketing materials to users.

The strategies we have identified have socio-ecological implications; they affect social justice, food sovereignty, and sustainability, the last of which does not always receive due attention in critical data studies (cf. Gabrys, 2016; Goldstein and Nost, 2022). Our results indicate the reproduction of asymmetrical power relations in the agri-food system favoring corporations and the continuation of long-standing dynamics of inequalities. We can infer that the big data-based predictions agribusinesses sell to farmers are directed toward a productivist model of “surveillance agriculture” (Stone, 2022a) that reinforces existing patterns of unsustainable agro-industrial farming and renders other routes, such as agroecology, peasant farming, and organic farming, less legitimate and possible.

Wednesday 14 February 2024

Guest Blog: A Social Privacy Approach to Investigative Genetic Genealogy

 by Nina de Groot

de Groot, N. F. (2023). Commercial genetic information and criminal investigations: The case for social privacy. Big Data & Society, 10(2).

In 2022, four college students from the University of Idaho were stabbed to death. Law enforcement found DNA at the crime scene that probably originated from the murder suspect, but the DNA profile did not match any profile in the FBI database of criminal offenders. The FBI then used, according to the prosecution, another method: searching in a public genetic genealogy database for relatives of the unknown suspect. Weeks later, they were, most likely based on this lead, able to arrest a suspect. By building extensive family trees of the unknown suspect and the DNA test consumer who is genetically related to the suspect, one can eventually zero in on the suspect by mapping genetic relationships and examining birth and death certificates or social media profiles. This so-called ‘investigative genetic genealogy’ is increasingly used in both the US and Europe.

Investigative genetic genealogy may have significant potential when it comes to crime-solving. Yet, the debate on investigative genetic genealogy tends to be reduced to a dichotomy between protecting privacy, on the one hand, and serving society by solving horrendous crimes, on the other. As the CEO of a commercial genetic genealogy database has said: “You have a right to privacy. You also have the right not to be murdered or raped”. However, reducing it to this dichotomy is problematic.

In my paper ‘Commercial genetic information and criminal investigations: The case for social privacy’, I propose to consider investigative genetic genealogy through the lens of social privacy. Social privacy is a helpful way to look beyond this simplified dichotomous choice. For example, it is not only about the privacy of the individual DNA-test consumer, but about the privacy of potentially thousands of genetic relatives. If one individual DNA test consumer gives consent to law enforcement use of their data, this person does so for many close and distant relatives.

Perhaps more importantly, a social privacy approach allows for the consideration of broader social-political concerns of investigative genetic genealogy. For example, it allows us to explore the complex issues when it comes to commercial parties entering the scene of criminal investigations, including dependence on commercial actors. Commercial companies currently have a major say in deciding for which crimes to allow (and not allow) law enforcement use of their database. Additionally, public forensic laboratories often do not have the technology to generate the type of DNA profile needed, which leads to the dependence of law enforcement on private actors in this process. Furthermore, investigative genetic genealogy could have problematic consequences for the relationship between citizens and state as well as for the public nature of criminal investigations.

A social privacy approach can also help shed light on the implications for criminal investigations and the democratic process. Namely, if only a few percent of a population gives consent for law enforcement use of their DNA data, almost every individual within that population can be identified through distant relatives. Therefore, the decisions of only a fraction of the population can have far-reaching implications, reflecting a potentially harmful ‘tyranny of the minority’, which is a concept from digital data ethics. In that respect, insights from the social privacy debate on the interconnectedness of data for the online digital data debate can be useful for exploring these issues in the context of investigative genetic genealogy.
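
The arithmetic behind the "few percent" claim can be illustrated with a rough back-of-the-envelope sketch. It is not taken from the paper and rests on loud simplifications: every person is assumed to have the same number of genetically detectable relatives, each relative is assumed to consent independently with the same probability, and the figure of 200 relatives is a hypothetical placeholder rather than an empirical estimate. Under those assumptions, the chance that at least one relative is in the database is 1 - (1 - p)^n.

```python
def prob_identifiable(p_consent: float, n_relatives: int) -> float:
    """Chance that at least one detectable relative is in the database.

    Toy model (an assumption, not the paper's method): each of the
    n_relatives consents independently with probability p_consent.
    """
    if not 0.0 <= p_consent <= 1.0:
        raise ValueError("p_consent must be between 0 and 1")
    if n_relatives < 0:
        raise ValueError("n_relatives must be non-negative")
    # Probability that none of the relatives consented, then the complement.
    return 1.0 - (1.0 - p_consent) ** n_relatives


if __name__ == "__main__":
    HYPOTHETICAL_RELATIVES = 200  # illustrative placeholder, not a measured value
    for p in (0.01, 0.02, 0.05):
        share = prob_identifiable(p, HYPOTHETICAL_RELATIVES)
        print(f"consent rate {p:.0%}: reachable with probability {share:.3f}")
```

Even in this simplified model, the probability climbs quickly as the consent rate rises, which is the intuition behind the ‘tyranny of the minority’ concern.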