Monday, 7 December 2015

Revisioning the Social and Human Sciences with Big Data: A Colloquium

By John W. Mohr, Ronald L. Breiger and Robin Wagner-Pacifici

The Special Theme that we have edited for Big Data & Society is entitled “Conceiving the Social with Big Data: A Colloquium of Social and Cultural Scientists.” It brings together 18 short essays by a number of scholars who are “early adopters” of new methods of analyzing Big Data to address issues in the social and human sciences.  The question we asked our contributors was how the use of these methods and these types of data can lead to different (implicit or explicit) understandings about how to think about the social.  As these essays make clear, there is also the important question of how working with Big Data can lead to changes at a deep level in researchers’ conceptions of the nature of science.

The resulting collection contains remarkable and incisive essays that raise a wide range of issues about how Big Data is increasingly implicated in practices and theorization in the humanities and social sciences.  As a way to synthesize this material we wrote a substantive introductory essay, “Ontologies, Methodologies and the Uses of Big Data in the Social and Cultural Sciences,” that summarizes our perspective on the essays and on how they raise a number of “puzzles about the locus and nature of human life, the nature of interpretation, the categorical constructions of individual entities and agents, the nature and relevance of contexts and temporalities, and the determinations of causality.”  We organize this discussion around a series of analytic binaries: Life/Data, Mind/Machine, and Induction/Deduction.

But the back story to this special theme is also interesting.  It begins with the Theory Section of the American Sociological Association (ASA).  Robin Wagner-Pacifici was the chair of this section in 2014-2015 and was responsible for coming up with the conference program for the meetings in San Francisco.  She asked John Mohr to organize a session on the topic “Theory in the Era of Big Data.”  Mohr invited four papers and asked Ronald Breiger to serve as the discussant for the panel.  The session itself was charged with energy and presented to a standing-room-only crowd.  In the discussion period, Gene Johnsen, a mathematician who works on network theory, stood up to request that we find a way to publish the papers (as a group) so that people could gain access to the materials and ideas that had been presented that day.

While Gene’s request was the original impetus for this collection, we credit Kevin Lewis with the innovation that brought the special issue to fruition.  He suggested that we design this as a forum for short essays in which authors could reflect upon their own experiences in a more theoretical and reflexive manner, thereby enabling a type of writing that would not fit in a more conventional research article.  We quickly saw the wisdom in this vision and agreed.  We were able to include three of the original panelists who presented at the ASA Theory Session, and they are represented here by the essays “Wikipedia, Sociology, and the Promise and Pitfalls of Big Data” by Julia Adams and Hannah Brueckner, “The Paradox of Active Users” by Patrick Park and Michael Macy, and “Big Data and the Danger of Being Precisely Inaccurate” by Daniel McFarland.  Ronald Breiger’s remarks as discussant formed the basis for the essay that he publishes here, entitled “Scaling Down.”

From there we began to add to our list of sociologists who have been working in significant new ways with Big Data.  We invited Chris Bail (“Lost in a Random Forest: Using Big Data to Study Rare Events”), Paul DiMaggio (“Adapting Computational Text Analysis to Social Science (and Vice Versa)”), Amir Goldberg (“In Defense of Forensic Social Science”), Tim Hannigan (“Close Encounters of the Conceptual Kind: Disambiguating Social Structure from Text”), Monica Lee and John Martin (“Surfeit and Surface”), Sophie Mützel (“Facing Big Data: Making Sociology Relevant”) and Wouter de Nooy (“Structure from Interaction Events”) to join the party. Peter Bearman (“Big Data and Historical Social Science”) was a late addition. Also, with more room to maneuver, we invited key contributors beyond sociology, including two innovative scholars from the humanities, Rachel Buurma (“Topic Modeling Against Totality: Anthony Trollope’s Barsetshire Series”) and Ted Underwood (“The Literary Uses of High-Dimensional Space”), and two pioneers from the information sciences, Jana Diesner (“Small Decisions with Big Impact for Data Analytics”) and Ryan Shaw (“Big Data and Reality”), to consider and reflect upon these same matters.

We approached Editor Evelyn Ruppert about publishing the collection in what was (at the time) this still rather new journal. She and her board were very supportive and the rest, as they say, is history. Or rather, the rest is this quite extraordinary collection of essays on the impacts of Big Data on the social and human sciences that we invite you to explore.