The editorial team of the journal Big Data & Society will be on break from August 1 to September 4, 2023.
Monday, 5 June 2023
BD&S Journal will be on break from August 1 to September 4, 2023
Monday, 15 May 2023
2023 Call for Special Theme Proposals for Big Data & Society
The SAGE open access journal Big Data & Society (BD&S) is soliciting proposals for a Special Theme to be published in 2024/25. BD&S is a peer-reviewed, interdisciplinary, scholarly journal that publishes interdisciplinary social science research about the emerging field of Big Data practices and how they are reconfiguring relations, expertise, methods, concepts and knowledge across academic, social, cultural, political, and economic realms. BD&S moves beyond usual notions of Big Data to engage with an emerging field of practices that is not defined by but generative of (sometimes) novel data qualities such as extensiveness, granularity, automation, and complex analytics including data linking and mining. The journal attends to digital content generated through online and offline practices, including social media, search engines, Internet of Things devices, and digital infrastructures across closed and open networks, from commercial and government transactions to digital archives, open government and crowd-sourced data. Rather than settling on a definition of Big Data, the Journal makes this an area of interdisciplinary inquiry and debate explored through multiple disciplines and themes.
Special Themes can consist of a combination of Original Research Articles (6 maximum, 10,000 words each), Commentaries (4 maximum, 3,000 words each) and one Editorial Introduction (3,000 words). All Special Theme content will have the Article Processing Charges waived. All submissions will go through the Journal’s standard peer review process.
Past special themes for the journal have included: Knowledge Production; Algorithms in Culture; Data Associations in Global Law and Policy; The Cloud, the Crowd, and the City; Veillance and Transparency; Practicing, Materializing and Contesting Environmental Data; Spatial Big Data; Critical Data Studies; Social Media & Society; Assumptions of Sociality; Data & Agency; Health Data Ecosystems; Algorithmic Normativities; Big Data and Surveillance; The Turn to AI in Governing Communication Online; The Personalization of Insurance; Heritage in a World of Big Data; Studying the COVID-19 Infodemic at Scale; Digital Phenotyping; Machine Anthropology; Data, Power, and Racial Formation; Social Data Governance; The State of Google Critique and Intervention; and Mapping the Micropolitics of Online Oppositional Subcultures.
See http://journals.sagepub.com/page/bds/collections/index to access these special themes.
While open to submissions on any theme related to Big Data, we particularly welcome proposals related to Big Data from the Global South / Global Majority; Indigenous data and data sovereignty; queer and trans data; and Big Data and racialization.
Format of Special Theme Proposals
Researchers interested in proposing a Special Theme should submit an outline with the following information.
An overview of the proposed theme, including how it relates to existing research and the aims and scope of the Journal, and the ways it seeks to expand critical scholarly research on Big Data.
A list of titles, abstracts, authors and brief biographies. For each contribution, the type of submission (Original Research Article or Commentary) should also be indicated. If the proposal is the result of a workshop or conference, that should also be indicated.
Short Bios of the Guest Editors including affiliations and previous work in the field of Big Data studies. Links to homepages, Google Scholar profiles or CVs are welcome, although we don’t require CV submissions.
A proposed timing for submission to Manuscript Central. This should be in line with the timeline outlined below.
Information on the types of submissions published by the Journal and other guidelines is available at https://journals.sagepub.com/author-instructions/BDS.
Timeline for Proposals
Please submit proposals by August 15, 2023 to the Editor-in-Chief of the Journal, Prof. Matthew Zook, at zook@uky.edu. The Editorial Team of BD&S will review proposals and make a decision by October 2023. Manuscripts would be submitted to the journal (via Manuscript Central) by or before February 2024. For further information or to discuss potential themes, please contact Matthew Zook at zook@uky.edu.
Monday, 1 May 2023
Reflections on BD&S during the transition of Editors-In-Chief
In January 2023 the journal Big Data & Society transitioned its Editor-in-Chief role from Evelyn Ruppert (now Editor-in-Chief Emeritus and Founding Editor) to the former Managing Editor, Matthew Zook. Jennifer Gabrys has shifted from co-editor to Managing Editor, and three new co-editors -- Rocco Bellanova, Ana Valdivia and Jing Zeng -- have joined the journal. Details on the full editorial team can be found here.
As part of this transition, both Evelyn Ruppert and Matthew Zook have written short reflections on the first nine years of the journal and thoughts about where it is going next.
----
Evelyn Ruppert: Looking Back on the First Nine Years of Big Data & Society
Since its launch in 2014, Big Data & Society (BD&S) has become a leading journal for interdisciplinary social science research on Big Data practices. It has been a privilege and an honour to have founded and led the journal through its first nine years. As I step down from the Editor-in-Chief role, I take this opportunity to reflect on the journal's beginnings and changes over the years, as well as to consider future developments as it enters its second decade.
I started to develop a proposal for an interdisciplinary journal on big data in 2012. It was a daunting task as so little had been published about this emerging object in the social sciences. More attention was paid to developments in related phenomena such as the internet, computing and software, digital media and communications, and digital research methods. However, a few authors in the social sciences initiated critical analyses of big data, sometimes referred to as just a buzzword or the latest bandwagon. Much more was published in the humanities, computing and technology, and business. In this context, identifying potential editors, board members, authors, or reviewers was very difficult, especially for a launch issue.
Perhaps more daunting was to specify the very object of the journal itself. ‘Big Data’ was vaguely defined and often criticised. It presented a potentially risky and controversial title for a journal. Rather than settling on a definition, we started with the following lead statement: ‘The Journal's key purpose is to provide a space for connecting debates about the emerging field of Big Data practices and how they are reconfiguring academic, social, industry, business and government relations, expertise, methods, concepts and knowledge.’ That is, we let Big Data be an object of debate (and capitalised the term to signal this), recognising it was and is shaped by myriad practices. What is ‘big’ about Big Data, according to BD&S, are the changing practices of data production, computation, analysis, circulation, implementation, proliferation, and involvement, and the consequences of these practices for how societies are represented (epistemologies), realised (ontologies) and governed (politics). Whether algorithms, AI, bots, or digital infrastructures, such practices engage with a variety of data and--contrary to claims of artificial intelligence--all practices are entangled with human agents, knowledge, power and influence.
It is also worth noting that the journal was launched during a moment of major transformations in journal publishing, which involved a move to digital-only formats, open access and financing through Article Processing Charges (APCs). BD&S was founded on all three changes in publishing, each of which presented challenges and opportunities. Today, none of this is novel. Ten years ago, however, each change constituted important shifts in the field of academic publishing, with APCs especially introducing significant redistributive effects in the dissemination of knowledge. Rather than the subscription model, APCs are now the predominant business model in academic publishing, where access to funding has become critical to publish. While BD&S has been able to provide some APC waivers, the distributive consequences of this funding model require more critical analysis and possible intervention to ensure equity across career stages, location and discipline.
Finally, I want to express my gratitude to all the people over the past ten years who joined the editorial team, including all the co-editors, editorial assistants, assistant editors and editorial board members, who are too many to mention. I am also grateful to the authors and innumerable reviewers, who ventured into relatively new territory and helped shape what the journal has become. A last word of thanks is to SAGE, for their confidence in my leadership and especially to Robert Rojek for his guidance and support over the years.
I leave the journal in good hands and I am impressed by the breadth and depth of the current Editorial Team. Passing the leadership of the journal on to Matt Zook (Editor-in-Chief) and Jennifer Gabrys (Managing Editor) fulfils an important principle of mine: periodically refreshing and changing roles is essential to enable the Journal to be shaped by different people and ideas. One thing is certain: Big Data practices are changing, advancing and, in some cases, becoming more pernicious. Critical interdisciplinary work is not only essential but also--as the contents of the journal demonstrate--proliferating as researchers address, challenge and transform the relations between Big Data and societies.
Wednesday, 11 January 2023
Ground Truth Tracings (GTT): On the Epistemic Limits of Machine Learning
This article is a direct response to the increasing division I have been seeing between what might be called the “technical” and “sociotechnical” communities in artificial intelligence/machine learning (AI/ML). It started as a foray into the industry of machine “listening” with the purpose of examining to what extent practitioners engage with the complexity of voice in developing techniques for listening to and evaluating it. Through my interviews, however, I found that voice, along with many other qualitatively complex phenomena like “employee fit,” “emotion,” and “personality,” gets flattened in the context of machine learning. The piece thus starts with a specific scholarly interest in the interface of voice and machine learning, but ends with a broader commentary on the limitations of machine learning epistemologies as seen through machine listening systems.
Specifically, I develop an intentionally non-mathematical methodological schema called “Ground Truth Tracings” (GTT) to make explicit the ontological translations that reconfigure a qualitative phenomenon like voice into a usable quantitative reference, a process known as “ground-truthing.” Given that every machine learning system requires a referential database that serves as its ground truth -- i.e., what is assumed to be true by the system -- examining these assumptions is key to exploring the strengths, weaknesses, and beliefs embedded in AI/ML technologies. In one example, I bring attention to a voice-analysis “employee-fit” prediction system that analyzes a potential candidate’s voice to predict whether the individual will be a good fit for a particular team. Using GTT, I qualitatively show why this system is not feasible as an ML use case and is unlikely to be as robust as it is marketed to be.
Finally, I acknowledge that although this framework may serve as a useful tool for investigating claims around ML applicability, it does not immediately engage questions of subjectivity, stakes, and power. I thus further splinter this schema through these axes to develop a perhaps imperfect, but practical heuristic called the “Learnability-Stakes” table to assess and think about the epistemological and ethical soundness of machine learning systems, writ large. I’m hoping this piece will contribute to the fostering of interdisciplinary dialogue among the wide range of practitioners in the AI/ML community that includes not just computer scientists and ML engineers, but also social scientists, activists, journalists, policy makers, humanities scholars, and artists, broadly construed.
Tuesday, 13 December 2022
Johann Laux and Fabian Stephany introduce their new paper on "The Concentration-after-Personalisation Index (CAPI)"
Johann Laux and Fabian Stephany introduce their new paper on "The Concentration-after-Personalisation Index (CAPI)" out in Big Data & Society doi:10.1177/20539517221132535. First published December 5, 2022.
Video abstract
Abstract.
Firms are increasingly personalising their offers and services, leading to an ever finer-grained segmentation of consumers online. Targeted online advertising and online price discrimination are salient examples of this development. While personalisation's overall effects on consumer welfare are expectably ambiguous, it can lead to concentration in the distribution of advertising and commercial offers. Constellations are possible in which a market is generally open to competition, but the targeted consumer is only made aware of one possible seller. For the consumer, such a market could effectively resemble a monopoly. We call such extreme cases ‘targeting pockets’. Competition-law metrics such as the Herfindahl–Hirschman Index and traditional means of public oversight of adverts would not detect this concentration. We, therefore, suggest a novel metric, the Concentration-after-Personalisation Index (CAPI). The CAPI treats every consumer as a separate ‘market’, computes a measure of concentration for personalised adverts and offers for each individual consumer separately, and then averages the result to measure the exposure experienced by an average consumer. We demonstrate how the CAPI can serve as a monitoring tool for regulators and auditors and thus help to enforce existing consumer law as well as proposed new regulations such as the European Union's Digital Services Act and its Artificial Intelligence Act. We further show how adding noise via randomly distributed non-personalised adverts can dilute the potential harm of overly concentrated personalisation. We demonstrate how the CAPI can identify the optimal degree of added noise, balancing the protection of consumer choice with the economic interests of advertisers.
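The averaging step the abstract describes can be sketched in a few lines of Python. This is a hedged illustration based only on the abstract above, not the authors' implementation: it assumes a Herfindahl–Hirschman-style concentration measure computed per consumer over the sellers behind the adverts that consumer was shown, then averaged across consumers; the data layout is invented for illustration.

```python
from collections import Counter

def hhi(shares):
    # Herfindahl-Hirschman Index: sum of squared market shares.
    return sum(s * s for s in shares)

def capi(adverts_per_consumer):
    """Concentration-after-Personalisation Index (sketch).

    adverts_per_consumer: one list per consumer, naming the seller behind
    each advert that consumer was shown. Each consumer is treated as a
    separate 'market': compute the HHI over sellers' shares of that
    consumer's adverts, then average across consumers.
    """
    per_consumer = []
    for adverts in adverts_per_consumer:
        counts = Counter(adverts)
        total = sum(counts.values())
        shares = [c / total for c in counts.values()]
        per_consumer.append(hhi(shares))
    return sum(per_consumer) / len(per_consumer)

# A 'targeting pocket': consumer 0 only ever sees seller A, even though
# the market as a whole has two active sellers.
print(capi([["A", "A", "A"], ["A", "B", "B", "A"]]))  # 0.75
```

A market-level HHI over the same data would show two sellers competing; only the per-consumer computation exposes that the first consumer effectively faces a monopoly, which is the gap the CAPI is designed to detect. Diluting concentration with randomly distributed non-personalised adverts, as the paper proposes, would simply lower each consumer's per-market HHI in this sketch.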
Tuesday, 15 November 2022
Learning accountable governance: Challenges and perspectives for data-intensive health research networks
Thursday, 20 October 2022
Jill Rettberg introduces a new paper, "Algorithmic failure as a humanities methodology: Machine learning's mispredictions identify rich cases for qualitative analysis"
Jill Rettberg introduces a new paper, "Algorithmic failure as a humanities methodology: Machine learning's mispredictions identify rich cases for qualitative analysis", out in Big Data & Society doi:10.1177/20539517221131290. First published October 18, 2022.
Video abstract
Abstract.
This commentary tests a methodology proposed by Munk et al. (2022) for using failed predictions in machine learning as a method to identify ambiguous and rich cases for qualitative analysis. Using a dataset describing actions performed by fictional characters interacting with machine vision technologies in 500 artworks, movies, novels and videogames, I trained a simple machine learning algorithm (using the kNN algorithm in R) to predict whether an action was active or passive using only information about the fictional characters. Predictable actions were generally unemotional and unambiguous activities where machine vision technologies were treated as simple tools. Unpredictable actions, that is, actions that the algorithm could not correctly predict, were more ambivalent and emotionally loaded, with more complex power relationships between characters and technologies. The results thus support Munk et al.'s theory that failed predictions can be productively used to identify rich cases for qualitative analysis. This test goes beyond simply replicating Munk et al.'s results by demonstrating that the method can be applied to a broader humanities domain, and that it does not require complex neural networks but can also work with a simpler machine learning algorithm. Further research is needed to develop an understanding of what kinds of data the method is useful for and which kinds of machine learning are most generative. To support this, the R code required to produce the results is included so the test can be replicated. The code can also be reused or adapted to test the method on other datasets.
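The core move -- train a simple classifier and keep the cases it gets wrong as candidates for close reading -- can be illustrated with a small sketch. Rettberg's own code is in R; the following is a hypothetical Python rendering with a hand-rolled kNN and leave-one-out evaluation, and all data points and labels below are invented for illustration, not drawn from the paper's dataset.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Predict the label of `query` by majority vote among the k nearest
    labelled points. train: list of (features, label) pairs; features are
    numeric tuples compared by Euclidean distance."""
    nearest = sorted(train, key=lambda fl: math.dist(fl[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def mispredictions(data, k=3):
    """Leave-one-out evaluation: return the items the classifier fails on.
    In Rettberg's method these 'algorithmic failures' are the ambiguous,
    rich cases selected for qualitative analysis."""
    failures = []
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if knn_predict(rest, features, k) != label:
            failures.append((features, label))
    return failures

# Toy data: (hypothetical numeric character features, action label).
data = [((0.0, 0.0), "passive"), ((0.0, 1.0), "passive"), ((1.0, 0.0), "passive"),
        ((5.0, 5.0), "active"), ((5.0, 6.0), "active"), ((6.0, 5.0), "active"),
        ((0.5, 0.5), "active")]
print(mispredictions(data))  # the lone 'active' action inside the 'passive' cluster
```

The point of the method is that the returned items are not discarded as errors but read closely: the one action the classifier cannot place from its features is, on Munk et al.'s argument, exactly where the interpretive interest lies.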