The ethics of examples in machine learning
Our article investigates a perceived transition from a rules-based programming paradigm in computing to one in which machine learning systems are said to learn from examples. Instead of specifying computational rules in a formal programming language, machine learning systems identify statistical structure in a dataset in order to accomplish tasks. Much existing scholarship shows that data is constructed rather than simply given. We make a more specific argument: that in machine learning, data must be made exemplary, that is, aggregated, formatted, and processed so that norms can emerge, in order to serve desired predictive or classificatory objectives.
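To make this contrast concrete, here is a minimal, hypothetical sketch (our own, not drawn from the article) of the two paradigms: a hand-written rule versus a classifier that infers its decision criterion from labeled examples. The task, messages, labels, and choice of scikit-learn tooling are all assumptions made for illustration.

```python
# Hypothetical illustration of the two paradigms; the task and data are invented.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Rules-based paradigm: the programmer specifies the decision criterion explicitly.
def rule_based_spam_filter(message: str) -> bool:
    banned_phrases = ["free money", "act now", "winner"]
    return any(phrase in message.lower() for phrase in banned_phrases)

# Example-based paradigm: the system identifies statistical structure
# in a set of labeled examples rather than following stated rules.
messages = [
    "Free money, act now!",
    "Meeting moved to 3pm",
    "You are a winner, claim your prize",
    "Lunch tomorrow at noon?",
]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = not spam (labels supplied by people)

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(messages)             # text becomes count features
classifier = LogisticRegression().fit(X, labels)   # a norm is induced from examples

new_message = ["Claim your free prize now"]
print(rule_based_spam_filter(new_message[0]))                 # explicit rule applied
print(classifier.predict(vectorizer.transform(new_message)))  # learned norm applied
```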
We are most interested in the ethical and even political ramifications of this transition. How does being governed by examples, by machine learning's specific type of predictions and classifications, differ from the rule of computational rules? How, concretely, is authority exercised by machine learning techniques? If you would like answers to these questions, please read our article!
A larger question is why we speak in terms of rules and examples in the first place. We are aware that these themes may strike readers as unfamiliar in light of existing critical research on algorithms and artificial intelligence. Many studies have, with good reason, focused on the discriminatory or unequal effects of machine learning systems. Other, more conceptual work posits some external standard (consciousness, language, intelligence, neoliberalism, and so on) and evaluates whether machine learning systems measure up to it, or whether they are mere "stochastic parrots," for example. We are sympathetic to both of these avenues, and they will continue to bear fruit.
Our article, however, begins from a different position: not from "outside" of machine learning, but from within it. We were first struck by repeated references to "learning from examples" made by machine learning researchers themselves. This community even has an informal historical understanding in which an overemphasis on highly specified formal rules led to the failure of previous forms of AI, notably expert systems. Our first task, then, was to discern how examples work in machine learning. We argue that data is made exemplary, that is, capable of eliciting norms, through a set of technical practices that characterize machine learning, including labeling, feature engineering, and scaling.
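As a schematic illustration of these three practices (again our own sketch, not an example from the article), consider a toy dataset invented for the purpose: records are first labeled, then reformatted into numeric features, and finally rescaled so that statistical regularities can emerge across heterogeneous measures.

```python
# Schematic, invented illustration of labeling, feature engineering, and scaling.
import numpy as np
from sklearn.preprocessing import StandardScaler

# 1. Labeling: raw records are paired with human-assigned outcome labels,
#    which fix what the system will treat as the norm to be learned.
applications = [
    {"income": 42000, "age": 29, "employed": True},
    {"income": 87000, "age": 51, "employed": True},
    {"income": 15000, "age": 23, "employed": False},
]
labels = np.array([1, 1, 0])  # 1 = approved, 0 = denied (assigned by annotators)

# 2. Feature engineering: each record is reformatted into a numeric vector,
#    deciding in advance which aspects of a case count as relevant.
def to_features(app: dict) -> list:
    return [app["income"], app["age"], float(app["employed"])]

X = np.array([to_features(app) for app in applications])

# 3. Scaling: features are standardized to comparable ranges so that
#    regularities can be computed across otherwise incommensurable measures.
X_scaled = StandardScaler().fit_transform(X)
print(X_scaled)
```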
Merely characterizing how these terms are used within the machine learning community, however, risks simply reproducing that community's self-understanding. After beginning our study up close, from within, we then examine these practices "from afar" in order to identify their epistemological, ethical, and political implications. We theorize examples through historical-conceptual comparison with rules, and we closely analyze several case studies under the headings of labeling, feature engineering, and scaling. Our article draws from both classics, such as the work of Max Weber, and contemporaries, such as Lorraine Daston's recent book Rules: A Short History of What We Live By.
This comparative approach situates machine learning within a constellation of concepts from social theory such as rationalization, calculation, and prediction. It connects machine learning to longer-running historical forces while also making its specific type of authority intelligible: how, precisely, it is used to govern ourselves and others. Comparing rules and examples also brought a number of other philosophical oppositions to light: specification and emergence, prompts and commands, the implicit and the explicit, the general and the particular, is and ought. These indicate further lines of research for ourselves and, we hope, for our readers.