Monday, 5 June 2023

BD&S Journal will be on break from August 1 to September 4, 2023

The editorial team of the journal Big Data & Society will be on break from August 1st to September 4th, 2023.

Please excuse any delays in processing and reviewing your submission, and in related correspondence, during that time. Thank you!

Monday, 15 May 2023

2023 Call for Special Theme Proposals for Big Data & Society


The SAGE open access journal Big Data & Society (BD&S) is soliciting proposals for a Special Theme to be published in 2024/25. BD&S is a peer-reviewed, scholarly journal that publishes interdisciplinary social science research about the emerging field of Big Data practices and how they are reconfiguring relations, expertise, methods, concepts and knowledge across academic, social, cultural, political, and economic realms. BD&S moves beyond usual notions of Big Data to engage with an emerging field of practices that is not defined by but generative of (sometimes) novel data qualities such as extensiveness, granularity, automation, and complex analytics including data linking and mining. The journal attends to digital content generated through online and offline practices, including social media, search engines, Internet of Things devices, and digital infrastructures across closed and open networks, from commercial and government transactions to digital archives, open government and crowd-sourced data. Rather than settling on a definition of Big Data, the Journal makes this an area of interdisciplinary inquiry and debate explored through multiple disciplines and themes.


Special Themes can consist of a combination of Original Research Articles (6 maximum, 10,000 words each), Commentaries (4 maximum, 3,000 words each) and one Editorial Introduction (3,000 words). All Special Theme content will have the Article Processing Charges waived. All submissions will go through the Journal’s standard peer review process.

 

Past special themes for the journal have included: Knowledge Production; Algorithms in Culture; Data Associations in Global Law and Policy; The Cloud, the Crowd, and the City; Veillance and Transparency; Practicing, Materializing and Contesting Environmental Data; Spatial Big Data; Critical Data Studies; Social Media & Society; Assumptions of Sociality; Data & Agency; Health Data Ecosystems; Algorithmic Normativities; Big Data and Surveillance; The Turn to AI in Governing Communication Online; The Personalization of Insurance; Heritage in a World of Big Data; Studying the COVID-19 Infodemic at Scale; Digital Phenotyping; Machine Anthropology; Data, Power, and Racial Formation; Social Data Governance; The State of Google Critique and Intervention; and Mapping the Micropolitics of Online Oppositional Subcultures.


See http://journals.sagepub.com/page/bds/collections/index to access these special themes.

 

While open to submissions on any theme related to Big Data, we particularly welcome proposals related to Big Data from the Global South / Global Majority; Indigenous data and data sovereignty; queer and trans data; and Big Data and racialization.


Format of Special Theme Proposals

Researchers interested in proposing a Special Theme should submit an outline with the following information.

 

  • An overview of the proposed theme, including how it relates to existing research and the aims and scope of the Journal, and the ways it seeks to expand critical scholarly research on Big Data.

  • A list of titles, abstracts, authors and brief biographies. For each, the type of submission (ORA, Commentary) should also be indicated. If the proposal is the result of a workshop or conference that should also be indicated.

  • Short Bios of the Guest Editors including affiliations and previous work in the field of Big Data studies. Links to homepages, Google Scholar profiles or CVs are welcome, although we don’t require CV submissions.

  • A proposed timing for submission to Manuscript Central. This should be in line with the timeline outlined below.

 

Information on the types of submissions published by the Journal and other guidelines is available at https://journals.sagepub.com/author-instructions/BDS.

 

Timeline for Proposals

Please submit proposals by August 15, 2023 to the Editor-in-Chief of the Journal, Prof. Matthew Zook at zook@uky.edu. The Editorial Team of BD&S will review proposals and make a decision by October 2023. Manuscripts would be submitted to the journal (via Manuscript Central) by or before February 2024. For further information or to discuss potential themes, please contact Matthew Zook at zook@uky.edu.

 


Monday, 1 May 2023

Reflections on BD&S during the transition of Editors-In-Chief

In January 2023 the journal Big Data & Society transitioned the Editor-in-Chief role from Evelyn Ruppert (whose role is now Editor-in-Chief Emeritus and Founding Editor) to the former Managing Editor, Matthew Zook. Jennifer Gabrys has shifted from co-editor to take on the job of Managing Editor as three new co-editors -- Rocco Bellanova, Ana Valdivia and Jing Zeng -- have joined the journal. Details on the full editorial team can be found here.

As part of this transition, both Evelyn Ruppert and Matthew Zook have written short reflections on the first nine years of the journal and thoughts about where it is going next.

----

Evelyn Ruppert: Looking Back on the First Nine Years of Big Data & Society

Since its launch in 2014, Big Data & Society (BD&S) has become a leading journal for interdisciplinary social science research on big data practices. It has been a privilege and honour to have founded and led the journal through its first ten years. As I step down from the Editor in Chief role, I take this opportunity to reflect on its beginnings and changes over the past decade, as well as consider future developments as the journal enters its second decade.

I started to develop a proposal for an interdisciplinary journal on big data in 2012. It was a daunting task as so little had been published about this emerging object in the social sciences. More attention was paid to developments in related phenomena such as the internet, computing and software, digital media and communications, and digital research methods. However, a few authors in the social sciences initiated critical analyses of big data, sometimes referred to as just a buzzword or the latest bandwagon. Much more was published in the humanities, computing and technology, and business. In this context, identifying potential editors, board members, authors, or reviewers was very difficult, especially for a launch issue. 

Perhaps more daunting was to specify the very object of the journal itself. ‘Big Data’ was vaguely defined and often criticised. It presented a potentially risky and controversial title for a journal. Rather than settling on a definition, we started with the following lead statement: ‘The Journal's key purpose is to provide a space for connecting debates about the emerging field of Big Data practices and how they are reconfiguring academic, social, industry, business and government relations, expertise, methods, concepts and knowledge.’ That is, we let Big Data be an object of debate (and capitalised the term to signal this), recognising it was and is shaped by myriad practices. What is ‘big’ about Big Data, according to BD&S, are the changing practices of data production, computation, analysis, circulation, implementation, proliferation, and involvement, and the consequences of these practices for how societies are represented (epistemologies), realised (ontologies) and governed (politics). Whether algorithms, AI, bots, or digital infrastructures, such practices engage with a variety of data and--contrary to claims of artificial intelligence--all practices are entangled with human agents, knowledge, power and influence. 

It is also worth noting that the journal was launched during a moment of major transformations in journal publishing, which involved a move to digital-only formats, open access and financing through Article Processing Charges (APCs). BD&S was founded on all three changes in publishing, each of which presented challenges and opportunities. Today, none of this is novel. Ten years ago, however, each change constituted important shifts in the field of academic publishing, with APCs especially introducing significant redistributive effects in the dissemination of knowledge. Rather than the subscription model, APCs are now the predominant business model in academic publishing, where access to funding has become critical to publish. While BD&S has been able to provide some APC waivers, the distributive consequences of this funding model require more critical analysis and possible intervention to ensure equity across career stages, location and discipline. 

Finally, I want to express my gratitude to all the people over the past ten years who joined the editorial team, including all the co-editors, editorial assistants, assistant editors and editorial board members, who are too many to mention. I am also grateful to the authors and innumerable reviewers, who ventured into relatively new territory and helped shape what the journal has become. A last word of thanks is to SAGE, for their confidence in my leadership and especially to Robert Rojek for his guidance and support over the years.

I leave the journal in good hands and I am impressed by the breadth and depth of the current Editorial Team. Passing the leadership of the journal on to Matt Zook (Editor-in-Chief) and Jennifer Gabrys (Managing Editor) fulfils an important principle of mine: periodically refreshing and changing roles is essential to enable the Journal to be shaped by different people and ideas. One thing is certain: Big Data practices are changing, advancing and, in some cases, becoming more pernicious. Critical interdisciplinary work is not only essential but also--as the contents of the journal demonstrate--proliferating as researchers address, challenge and transform the relations between Big Data and societies.

Evelyn

----

Matthew Zook: Thoughts on the Success of BD&S and What Happens Next

I still remember my excitement when Evelyn Ruppert first contacted me about joining Big Data & Society (BD&S) as a co-editor in 2013. It was an energizing prospect made even more so by the chance to be part of a group of interdisciplinary social scientists grappling with the many forms and meanings of Big Data practices. Evelyn assembled a team of scholars I wanted to read and talk with, and an invitation to be part of the editorial team was, in my mind, a front-row seat to the most exciting show in town.

Ten years later, I feel exactly the same. I am regularly astounded by the breadth, quality, and creativity of the articles we publish. They represent world-class scholarship and are agenda-setting in every sense of the word. A key part of this success has been Evelyn Ruppert's development of the initial proposal and selection of the first round of co-editors that shaped the vision and voice of the journal. It is no overstatement that without her, BD&S simply would not exist, and for that, I am forever grateful.

I am also deeply appreciative of the editorial group whose careful and hard work has been instrumental in making BD&S a success. The co-editors - both current and emeriti - have brought a wealth of disciplinary expertise (sociology, politics and law, science and technology studies, geography, journalism, computational social science, data science, planning and policy, media and communications) that they have successfully drawn upon to oversee the review and publication of a broad set of work. Our editorial assistants have done tremendous work behind the scenes to ensure BD&S stays on track, and our assistant editors promote all articles as they appear and oversee new initiatives such as the BD&S Colloquium series begun in 2022. Our editorial board and reviewers have provided vital input on papers that we rely upon to make our editorial decisions, and finally none of this would be possible without authors submitting their work. A heartfelt thank you to everyone. Without all of your hard work, the journal would not be what it is today.

Shifting from reflections on BD&S accomplishments to future plans, I see three principal tasks/challenges set before us. The first concerns expanding topics of inquiry, and we look forward to the exciting new topics, approaches, and theories that our authors bring in their papers. The heart of the journal remains focused on how Big Data interacts with social practices. However, this has continued to evolve in terms of the different deployments of Big Data (e.g., infrastructure, platforms, blockchain), applications (e.g., generative AI, health, identity, nature), and forms of governance (e.g., justice, securitized, privacy) to name but a few of the exciting topics currently under review. Second, there is the ongoing challenge of inclusivity of people and places in the papers we receive and publish. We seek to expand the journal's engagement across geography, practices, and theory, such as (but certainly not limited to) Big Data from the Global South, algorithmic justice, queer/trans data, and indigenous data regimes. Third is the evolving meaning of Open Access publishing. Being an open access journal has been in the DNA of BD&S since it started, and I want to thank Robert Rojek and Sage for their willingness and ongoing support in making this happen. It has ensured that the good work that our authors write, and we review, gets to as large a readership as possible. In particular, Sage's ongoing commitment to Research4Life and providing APC waivers has been essential to our ability to run exciting special themes and ensure that no article that we have accepted editorially has been lost due to APCs.

As I look back over my nine years with BD&S, five years as a co-editor and the last five as Managing Editor, I am amazed at the scope and scale of what the journal has done. Close to 1,200 authors have published 600+ articles both as stand-alone pieces and as part of 25+ special themes curated by guest editors. It is tremendously heartening to be part of such an intellectual community. As I step into the role of Editor-in-Chief, I look to this community to continue to support and define the journal's work as we move forward. I am especially pleased to be working with Jennifer Gabrys (in her new role as Managing Editor), our co-editors Rocco Bellanova, Dhiraj Murthy, Sung-Yueh Perng, Sachil Singh, Ana Valdivia, and Jing Zeng, and the journal's Editorial Assistant, Natalia Orrego.

I am grateful for my time with the journal and am looking forward to the years ahead. It is still a front row seat to the most exciting show in town.

Matt



Wednesday, 11 January 2023

Ground Truth Tracings (GTT): On the Epistemic Limits of Machine Learning

by Edward B. Kang (@edwardbkang)

Kang, E. B. (2023). Ground truth tracings (GTT): On the epistemic limits of machine learning. Big Data & Society, 10(1). https://doi.org/10.1177/20539517221146122  

This article is a direct response to the increasing division I have been seeing between what might be called the “technical” and “sociotechnical” communities in artificial intelligence/machine learning (AI/ML). It started as a foray into the industry of machine “listening” with the purpose of examining to what extent practitioners engage with the complexity of voice in developing techniques for listening to and evaluating it. Through my interviews, however, I found that voice, along with many other qualitatively complex phenomena like “employee fit,” “emotion,” and “personality,” gets flattened in the context of machine learning. The piece thus starts with a specific scholarly interest in the interface of voice and machine learning, but ends with a broader commentary on the limitations of machine learning epistemologies as seen through machine listening systems.
 
Specifically, I develop an intentionally non-mathematical methodological schema called “Ground Truth Tracings” (GTT) to make explicit the ontological translations that reconfigure a qualitative phenomenon like voice into a usable quantitative reference, AKA “ground-truthing.” Given that all machine learning systems require a referential database that serves as their ground truth – i.e., what is assumed to be true by the system – examining these assumptions is key to exploring the strengths, weaknesses, and beliefs embedded in AI/ML technologies. In one example, I bring attention to a voice analysis “employee-fit” prediction system that analyzes a potential candidate’s voice to predict whether the individual will be a good fit for a particular team. By using GTT, I qualitatively show why this system is not feasible as an ML use case and unlikely to be as robust as it is marketed to be.
 
Finally, I acknowledge that although this framework may serve as a useful tool for investigating claims around ML applicability, it does not immediately engage questions of subjectivity, stakes, and power. I thus further splinter this schema through these axes to develop a perhaps imperfect, but practical heuristic called the “Learnability-Stakes” table to assess and think about the epistemological and ethical soundness of machine learning systems, writ large. I’m hoping this piece will contribute to the fostering of interdisciplinary dialogue among the wide range of practitioners in the AI/ML community that includes not just computer scientists and ML engineers, but also social scientists, activists, journalists, policy makers, humanities scholars, and artists, broadly construed. 

Tuesday, 13 December 2022

Johann Laux and Fabian Stephany introduce their new paper on "The Concentration-after-Personalisation Index (CAPI)"

Johann Laux and Fabian Stephany introduce their new paper on "The Concentration-after-Personalisation Index (CAPI)" out in Big Data & Society  doi:10.1177/20539517221132535. First published December 5, 2022.

Video abstract



Abstract.

Firms are increasingly personalising their offers and services, leading to an ever finer-grained segmentation of consumers online. Targeted online advertising and online price discrimination are salient examples of this development. While personalisation's overall effects on consumer welfare are expectably ambiguous, it can lead to concentration in the distribution of advertising and commercial offers. Constellations are possible in which a market is generally open to competition, but the targeted consumer is only made aware of one possible seller. For the consumer, such a market could effectively resemble a monopoly. We call such extreme cases ‘targeting pockets’. Competition-law metrics such as the Herfindahl–Hirschman Index and traditional means of public oversight of adverts would not detect this concentration. We, therefore, suggest a novel metric, the Concentration-after-Personalisation Index (CAPI). The CAPI treats every consumer as a separate ‘market’, computes a measure of concentration for personalised adverts and offers for each individual consumer separately, and then averages the result to measure the exposure experienced by an average consumer. We demonstrate how the CAPI can serve as a monitoring tool for regulators and auditors and thus help to enforce existing consumer law as well as proposed new regulations such as the European Union's Digital Services Act and its Artificial Intelligence Act. We further show how adding noise via randomly distributed non-personalised adverts can dilute the potential harm of overly concentrated personalisation. We demonstrate how the CAPI can identify the optimal degree of added noise, balancing the protection of consumer choice with the economic interests of advertisers.


Keywords: 

  1. Targeted advertising
  2. personalisation
  3. consumer welfare
  4. consumer protection
  5. consumer law
  6. competition law
  7. EU law
  8. platform regulation
  9. Digital Services Act
  10. AI Act
  11. Unfair Commercial Practices Directive
  12. digital markets
  13. law and economics
  14. novel metrics
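
The averaging step described in the abstract can be illustrated with a small sketch. The abstract does not give the CAPI formula, so this sketch assumes the per-consumer concentration measure is a Herfindahl–Hirschman-style sum of squared seller shares; the function names and the toy exposure data are hypothetical and are not taken from the paper.

```python
from collections import Counter

def hhi(sellers):
    """Sum of squared seller shares for one list of adverts (1.0 = a single seller)."""
    counts = Counter(sellers)
    total = sum(counts.values())
    return sum((n / total) ** 2 for n in counts.values())

def capi(exposures):
    """Treat each consumer as a separate 'market' and average their concentration scores.

    Assumption: HHI is used as the per-consumer measure; the paper may define it differently.
    """
    scores = [hhi(adverts) for adverts in exposures.values() if adverts]
    return sum(scores) / len(scores)

# Hypothetical data: every consumer sees adverts from exactly one seller.
exposures = {
    "consumer_a": ["seller_1", "seller_1"],
    "consumer_b": ["seller_2", "seller_2"],
    "consumer_c": ["seller_3", "seller_3"],
    "consumer_d": ["seller_4", "seller_4"],
}

# A market-wide index pools all adverts and sees four equally sized sellers.
market_wide = hhi([s for adverts in exposures.values() for s in adverts])
per_consumer = capi(exposures)

print(f"market-wide HHI: {market_wide:.2f}")
print(f"per-consumer average: {per_consumer:.2f}")
```

In this toy case the pooled index looks like an open market of four equal sellers, while the per-consumer average reports maximal concentration, which is the kind of 'targeting pocket' the abstract describes.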

Tuesday, 15 November 2022

Learning accountable governance: Challenges and perspectives for data-intensive health research networks

by Sam Muller

Muller, S. H. A., Mostert, M., van Delden, J. J. M., Schillemans, T., & van Thiel, G. J. M. W.(2022). Learning accountable governance: Challenges and perspectives for data-intensive health research networks. Big Data & Society, 9(2). https://doi.org/10.1177/20539517221136078

In our article, we address the accountability of large-scale health data research. Accountability is crucial to ensure democratic control and to steer health data research to contribute public value. Yet whereas previous research about health data paid much attention to accountability as a norm for doing and organising health data research, it did not specify what accountability processes should look like in practice. Specifically, previous research did not take into account that much health data research takes place in international networks, in which public and private organisations collaborate internationally and in a relatively horizontal way.
 
In our analysis of the current state of accountability, we found that governing such networks to foster accountability faces several challenges. The fact that health data research takes place in complex networks puts a lot of pressure on realizing clear and stable accountability relationships. Moreover, smooth cooperation is difficult because the norms and principles that could guide accountability processes are unclear. Lastly, effective design of information provision and debate is lacking.
 
To address the shortcomings of current accountability in health data research networks, we propose focusing on accountability as a means of learning from insights and feedback about how good governance can be achieved. We suggest two pathways for pursuing learning accountability. First, an integrated governance structure for learning to occur needs to be developed. Provisional goals need to be established by building on overlapping consensus. This is crucial to develop mutual understanding, shared motivation and common commitment between organisations engaged in health data research. Second, ongoing deliberation and open communication about collaboration are required for reflexive dialogue. Stakeholders (publics and communities affected by health data research) should be represented and enabled to participate. Empowering them in the form of a collective forum enables learning from their experiences and holding health data research to account.

Thursday, 20 October 2022

Jill Rettberg introduces a new paper on "Algorithmic failure as a humanities methodology: Machine learning's mispredictions identify rich cases for qualitative analysis"

Jill Rettberg introduces a new paper on "Algorithmic failure as a humanities methodology: Machine learning's mispredictions identify rich cases for qualitative analysis", out in Big Data & Society, doi:10.1177/20539517221131290. First published October 18, 2022.

Video abstract



Abstract.

This commentary tests a methodology proposed by Munk et al. (2022) for using failed predictions in machine learning as a method to identify ambiguous and rich cases for qualitative analysis. Using a dataset describing actions performed by fictional characters interacting with machine vision technologies in 500 artworks, movies, novels and videogames, I trained a simple machine learning algorithm (using the kNN algorithm in R) to predict whether or not an action was active or passive using only information about the fictional characters. Predictable actions were generally unemotional and unambiguous activities where machine vision technologies were treated as simple tools. Unpredictable actions, that is, actions that the algorithm could not correctly predict, were more ambivalent and emotionally loaded, with more complex power relationships between characters and technologies. The results thus support Munk et al.'s theory that failed predictions can be productively used to identify rich cases for qualitative analysis. This test goes beyond simply replicating Munk et al.'s results by demonstrating that the method can be applied to a broader humanities domain, and that it does not require complex neural networks but can also work with a simpler machine learning algorithm. Further research is needed to develop an understanding of what kinds of data the method is useful for and which kinds of machine learning are most generative. To support this, the R code required to produce the results is included so the test can be replicated. The code can also be reused or adapted to test the method on other datasets.

Keywords: 

  1. Machine vision,
  2. machine learning,
  3. qualitative methodology,
  4. machine anthropology,
  5. digital humanities,
  6. algorithmic failure
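
The idea of using failed predictions to pick out cases for close reading can be sketched in a few lines. The commentary's own code is in R and is not reproduced here; what follows is a Python illustration under stated assumptions: the numeric features, the toy rows, the leave-one-out evaluation and the function names are all hypothetical, and only the general kNN-plus-mispredictions approach comes from the abstract.

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """Majority label among the k training rows closest to query (squared Euclidean distance)."""
    nearest = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], query)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def mispredicted(rows, k=3):
    """Predict each row from all the others and return the rows the classifier gets wrong."""
    failures = []
    for i, (features, label) in enumerate(rows):
        others = rows[:i] + rows[i + 1:]
        if knn_predict(others, features, k) != label:
            failures.append((features, label))
    return failures

# Hypothetical toy data: two numeric features per action, labelled active or passive.
rows = [
    ((0, 0), "active"), ((0, 1), "active"), ((1, 0), "active"),
    ((5, 5), "passive"), ((5, 6), "passive"), ((6, 5), "passive"),
    ((5, 4), "active"),  # looks like the passive cluster but is labelled active
]

# The mispredicted rows are the candidates for qualitative analysis.
failures = mispredicted(rows)
print(failures)
```

Only the row that sits among the passive cases while carrying an active label is returned, so it is the one a researcher would go back and read closely.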