Monday, 22 February 2021

The Algorithm Audit: Scoring the algorithms that score us

by Shea Brown

Big Data & Society. doi:10.1177/2053951720983865. First published: 28 January 2021.

“The Algorithm Audit: Scoring the algorithms that score us” outlines a conceptual framework for auditing algorithms for potential ethical harms. In recent years, the ethical impact of AI has come under increasing scrutiny, leading to growing mistrust of AI and increased calls for mandated audits of algorithms. While there are many excellent proposals for ethical assessments of algorithms, such as Algorithmic Impact Assessments or the similar Automated Decision System Impact Assessments, these are too high-level to be put directly into practice without further guidance. Other proposals have a narrower focus on technical notions of bias or transparency (Mitchell et al., 2019). Moreover, without a unifying conceptual framework for carrying out these evaluations, there is a worry that the ad hoc nature of the methodology could lead to potential harms being missed.

We present an auditing framework that can serve as a more practical guide for comprehensive ethical assessments of algorithms. We clarify what we mean by an algorithm audit, explain key preliminary steps to any such audit (identifying the purpose of the audit, describing and circumscribing its context), and elaborate on the three main elements of the audit instrument itself: (i) a list of possible interests and rights of stakeholders affected by the algorithm, (ii) a list and assessment of metrics that describe key ethically salient features of the algorithm in the relevant context, and (iii) a relevancy matrix that connects the assessed metrics to the stakeholder interests. We provide a simple example to illustrate how the audit is supposed to work, and discuss the different forms the audit result could take (quantitative score, qualitative score, and a narrative assessment).
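To make the three elements concrete, the following is a minimal sketch of how a relevancy matrix could connect assessed metrics to stakeholder interests and yield one possible quantitative score. All stakeholder names, metrics, weights, and the weighted-average aggregation rule are illustrative assumptions, not values or formulas taken from the paper.

```python
# Hypothetical illustration of the audit instrument's structure.
# Metrics, stakeholders, and relevancy weights below are invented
# for a loan-approval example; they are not from the paper.

# (ii) Assessed metrics: ethically salient features of the algorithm,
# each scored on a 0-1 scale (1 = performs well on that feature).
metrics = {
    "false_positive_parity": 0.6,
    "explanation_quality": 0.4,
    "data_provenance": 0.8,
}

# (i) Stakeholders whose interests and rights the algorithm affects.
stakeholders = ["loan_applicants", "lender", "regulator"]

# (iii) Relevancy matrix: how relevant each metric is to each
# stakeholder's interests (0 = irrelevant, 1 = highly relevant).
relevancy = {
    "loan_applicants": {"false_positive_parity": 1.0,
                        "explanation_quality": 0.8,
                        "data_provenance": 0.5},
    "lender":          {"false_positive_parity": 0.4,
                        "explanation_quality": 0.3,
                        "data_provenance": 0.9},
    "regulator":       {"false_positive_parity": 0.9,
                        "explanation_quality": 0.9,
                        "data_provenance": 0.7},
}

def stakeholder_score(name: str) -> float:
    """Relevancy-weighted average of metric scores for one stakeholder
    (one assumed way to aggregate into a quantitative result)."""
    weights = relevancy[name]
    total = sum(weights.values())
    return sum(metrics[m] * w for m, w in weights.items()) / total

# A per-stakeholder quantitative audit result.
scores = {s: stakeholder_score(s) for s in stakeholders}
```

The same structure could just as well feed a qualitative score or a narrative assessment; the point of the sketch is only that the matrix forces an explicit judgment, for every stakeholder, about which features of the algorithm bear on that stakeholder's interests.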

Our motivations for this separation of descriptive (metrics) and normative (interests) features are many, but one important reason is that the separation forces an auditor to consider each stakeholder explicitly, and to weigh the possible relevance of each feature of the algorithm (metric) to that stakeholder’s interests. It is important to note that different stakeholders in the same category (e.g. students, loan applicants, those up for parole) are often affected in very different ways by the same algorithm, and often on the basis of race, ethnicity, gender, age, religion, or sexual orientation (Benjamin, 2019). We argue that understanding the context of an algorithm is a precursor not only to enumerating stakeholder interests generally, but also to identifying particular sub-categories of stakeholders relevant for the ethical assessment of an algorithm (e.g. students of color, Hispanic loan applicants, African-American men up for parole). These sub-groups might face particular threats. Attention to context guards against treating groups of stakeholders as homogeneous entities that will be negatively or positively affected simply by virtue of their type of engagement with an algorithm, and helps us recognize the socio-political and socio-technical factors, and the power dynamics, at play (Benjamin, 2019; D’Ignazio and Klein, 2020; Mohamed et al., 2020).

The proposed audit instrument yields an ethical evaluation of an algorithm that could be used by regulators and others interested in doing due diligence, while paying careful attention to the complex societal context within which the algorithm is deployed. It can also help institutions mitigate the reputational, financial, and ethical risk that a poorly performing algorithm might present.