Joi Ito's Web

Joi Ito's conversation with the living web.

March 2019 Archives

This is the second of three parts of the syllabus and summaries prepared by Samantha Bates, who TAs the Applied Ethical and Governance Challenges in Artificial Intelligence course that I co-teach with Jonathan Zittrain. John Bowers and Natalie Saltiel are also TAs for the course. I posted Part I earlier this month.

My takeaways:

In Part I, we defined the space and tried to frame and understand some of the problems. We left with concerns about the reductionist, poorly defined and oversimplified notions of fairness and explainability in much of the literature. We also left feeling quite challenged by how the technical community will face new risks such as adversarial attacks and related techniques.

In Part II, we continue our journey into a sense of despair about AI ethics and governance. In Solon Barocas and Andrew D. Selbst's paper "Big Data's Disparate Impact," they walk us through the state of the law around discrimination and fairness using Title VII of the US Civil Rights Act as an example. The authors show us that while it was enacted to address discrimination concerns raised by the civil rights movement, the law has evolved away from trying to correct societal inequities through remedies such as affirmative action. Instead, the law has focused more and more on fairness of processes and less on redistribution or on resolving historical inequity. As a result, the law has adopted a more technical notion of fairness - a kind of actuarial "all lives matter" sort of approach. During Part I, when we discussed the biased Amazon hiring tool, one of the proposed remedies was to "put our thumb on the scale" and just boost the scores of women and minorities. The Barocas and Selbst paper demonstrates that this type of solution is no longer supported by the law. The sense is that the engineers thought, "of course there must be a law prohibiting discrimination, we can use that." In fact, that law punts on redistribution or societal inequity. Jonathan pointed out that treatment of social inequality in Tort law is similar. If you run over a rich person and a poor person at the same time, you have to pay the rich family more - the calculation of damages is based on the victim's future earning power. Tort law, like Title VII, says, "there may be societal inequities, but we're not solving that problem here."

Sandra Wachter's paper proposing counterfactuals as a way to provide explainability is an excellent idea and feels like one way forward in the explainability debate. However, even Sandra seems concerned about whether laws such as the GDPR will actually be able to require companies to provide such explanations. We also had some concerns about the limits of counterfactuals in identifying biases or providing the "best" answer depending on the person - limits Sandra identifies in her paper.

Finally, we take adversarial attacks from the theoretical to a specific example in a recent paper that Jonathan and I wrote with John Bowers, Samuel Finlayson, Andrew L. Beam, and Isaac S. Kohane about the risks of adversarial attacks on medical AI systems.

Please see Samantha's summaries and links to the readings below for a more complete overview of the three classes in Part II.

- Joi

Part 2: Prognosis

By Samantha Bates

Syllabus Notes: Prognosis Stage

Welcome to part 2 of our Ethical and Governance Challenges in AI syllabus! In part 1, the assigned readings and class discussion focused on understanding how the social, technical, and philosophical roots of autonomous systems contribute to problems related to fairness, interpretability, and adversarial examples. In the second stage of the course, the prognosis stage, the class considered the social implications of these problems. Perhaps the most significant takeaway from this stage was the realization that many of these problems are social or political problems and cannot be addressed solely through a legal or technical approach.

Class Session 5: Prognosticating the impacts of unfair AI

Solon Barocas, an Assistant Professor at Cornell University, joined the class for the first day of the prognosis stage. We discussed his paper, "Big Data's Disparate Impact," which offered a legal and technical perspective on the use of algorithms in employment.

On the first day of the prognosis stage, the focus of the class shifted from examining the technical mechanisms underlying autonomous systems to looking at the societal impact of those systems. The Barocas and Selbst paper discusses how data collection and data labeling can perpetuate existing biases both intentionally and unintentionally. The authors outline five main ways that datasets can be discriminatory:

  1. Our own human biases may be integrated into a dataset when a human data miner determines the parameters that an autonomous system will use to make decisions.

  2. The training data might already be biased depending on how it was collected and how it was labeled.

  3. Data mining models consider a limited number of data points and thus may draw conclusions about an individual or a group of people based on data that is not representative of the subject.

  4. As Cathy O'Neil mentioned, prejudice may be introduced if the data points the model uses to make decisions are proxies for class membership.

  5. Discriminatory data mining could be intentional. However, the authors argue that unintentional discrimination is more common and harder to identify.
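The disparate-impact framing discussed here can be made concrete with a small sketch. The toy code below applies the "four-fifths rule," a rough screen used in US disparate-impact analysis that flags a selection practice when one group's selection rate falls below 80% of the highest group's rate. The group names and counts are invented for illustration; this is not from the Barocas and Selbst paper itself.

```python
# Hypothetical hiring outcomes, screened with the "four-fifths rule":
# a practice is flagged when the lowest group selection rate is less
# than 80% of the highest. All names and numbers here are invented.

def selection_rate(selected, applicants):
    """Fraction of applicants in a group who were selected."""
    return selected / applicants

def disparate_impact_ratio(rates):
    """Ratio of the lowest selection rate to the highest."""
    return min(rates.values()) / max(rates.values())

rates = {
    "group_a": selection_rate(50, 100),  # 0.50
    "group_b": selection_rate(30, 100),  # 0.30
}

ratio = disparate_impact_ratio(rates)
print(round(ratio, 2))  # 0.6
print(ratio < 0.8)      # True: flagged under the four-fifths rule
```

Note that passing this screen says nothing about the proxy and labeling problems listed above; a model can satisfy a statistical ratio while still encoding historical bias.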

While there is legal doctrine that addresses discrimination in employment, the authors demonstrate that it is difficult to apply in practice, particularly in the data mining context. Title VII creates liability for intentional discrimination (disparate treatment) and for unintentional discrimination (disparate impact), but it is difficult to prove either type. For example, in order to hold employers liable for unintentional discrimination, the plaintiff must show that an alternative, nondiscriminatory method exists that will accomplish the same goals as the discriminatory practice. They must also prove that when presented with the alternative, the employer refused to consider it. Typically, an employer can mount a successful defense if they can prove they were unaware of the alternative or if there is a legitimate business reason for policies that may be discriminatory (the business necessity defense).

Bias in data mining is so difficult to identify, prove, and rectify in part because as a society, we have not determined the role of the law in addressing discrimination. According to one theory, the anticlassification theory, the law has an obligation to ensure that decision makers do not discriminate against protected classes in society. The opposing theory, the antisubordination theory, advocates a more hands-on approach and states that the law should work to "eliminate status-based inequality" at the societal level by actively improving the lives of marginalized groups. Our current society favors the anticlassification approach in part because the court established early on that antidiscrimination law was not solely intended to improve access to opportunities for protected classes. And while the authors demonstrate how data mining can exacerbate existing biases in the hiring context, there is a societal trade-off between prioritizing efficient decision making and eliminating bias.

This reading also raises the question of who is responsible for fixing the problem. Barocas and Selbst emphasize that the majority of data mining bias is unintentional and that it may be very difficult to identify bias and employ technical fixes to eliminate it. At the same time, there are political and social factors that make fixing this problem in the legal system equally difficult, so who should be in charge of addressing it? The authors suggest that as a society, we may need to reconsider how we approach discrimination issues more generally.

Class Session 6: Prognosticating the impacts of uninterpretable AI

For our sixth session, the class talked with Sandra Wachter, a lawyer and research fellow at the Oxford Internet Institute, about the possibility of using counterfactuals to make autonomous systems interpretable.

In our last discussion about interpretability, the class concluded that it is impossible to define the term "interpretability" because it greatly depends on the context of the decision and the motivations for making the model interpretable. The Sandra Wachter et al. paper essentially says that defining "interpretability" is not important and that instead we should focus on providing a way for individuals to learn how to change or challenge a model's output. While the authors point out that making these automated systems more transparent and devising some way to hold them accountable will improve the public's trust in AI, they primarily consider how to design autonomous models that will meet the explanation requirements of the GDPR. The paper's proposed solution is to generate counterfactuals for individual decisions (both positive and negative) that "provide reasons why a particular decision was received, offer grounds to contest it, and provide limited 'advice' on how to receive desired results in the future."

The authors argue that counterfactuals would not only exceed the explainability requirements of the GDPR but also lay the groundwork for a legally binding right to explanation. Due to the difficulty of explaining the technical workings of an automated model to a lay person, legal concerns about protecting trade secrets and IP, and the danger of violating the privacy of data subjects, it has been challenging to provide more transparency around AI decision making. However, counterfactuals can serve as a workaround to these concerns because they indicate how a decision would change if certain inputs had been different rather than disclosing information about the internal workings of the model. For example, a counterfactual for a bank loan algorithm might tell someone who was denied a loan that if their annual income had been $45,000 instead of $30,000, they would have received the loan. Without explaining any of the technical workings of the model, the counterfactual in this example tells the individual the rationale behind the decision and how they can change the outcome in the future. Note that counterfactuals are not a sufficient solution to problems involving bias and unfairness. Counterfactuals may provide evidence that a model is biased, but because they only show dependencies between a specific decision and particular external facts, they cannot be relied upon to expose all potential sources of bias or to confirm that a model is unbiased.
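The loan example above can be sketched in a few lines of code. This is only an illustration of the idea: `loan_model` is an invented stand-in decision rule, and the search is a naive grid search over one feature, whereas Wachter et al.'s actual proposal searches for the closest input overall that flips the model's decision.

```python
# A toy sketch of a counterfactual explanation for a loan denial.
# The decision rule and dollar amounts are invented for illustration.

def loan_model(income):
    """Stand-in decision rule: approve loans at or above $45,000."""
    return income >= 45_000

def counterfactual_income(income, step=1_000, limit=200_000):
    """Smallest income (searched in `step` increments) that flips a denial."""
    if loan_model(income):
        return None  # already approved; no counterfactual needed
    candidate = income
    while candidate <= limit:
        candidate += step
        if loan_model(candidate):
            return candidate
    return None

# A denied applicant earning $30,000 learns they would have been
# approved at $45,000 -- without any disclosure of the model internals.
print(counterfactual_income(30_000))  # 45000
```

The appeal of this shape of explanation is exactly what the paper describes: the output is actionable for the individual while the model itself stays a black box.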

The optional reading, "Algorithmic Transparency for the Smart City," investigates transparency around the use of big data analytics and predictive algorithms by city governments. The authors conclude that poor documentation and disclosure practices, as well as trade secrecy concerns, frequently prevented city governments from getting the information they needed to understand how a model worked and its implications for the city. The paper expands upon the barriers to understanding an autonomous model that are mentioned in the Wachter et al. paper and also presents great examples of scenarios in which counterfactual explanations could be deployed.

Class Session 7: Prognosticating the impacts of adversarial examples

In our third prognosis session, the class continued its discussion about adversarial examples and considered potential scenarios, specifically in medical insurance fraud, in which they could be used to our benefit and detriment.

  • "Adversarial attacks on artificial intelligence systems as a new healthcare policy consideration" by Samuel Finlayson, Joi Ito, Jonathan Zittrain et al., preprint (2019)

  • "Law and Adversarial Machine Learning" by Ram Shankar Siva Kumar et al., ArXiv (2018)

In our previous session about adversarial examples, the class discussion was primarily focused on understanding how adversarial examples are created. These readings delve more into how adversarial examples can be used to our benefit and also to our detriment. "Adversarial attacks on artificial intelligence systems as a new healthcare policy consideration" considers the use of adversarial examples in health insurance fraud. The authors explain that doctors sometimes use a practice called "upcoding": submitting insurance claims for procedures more serious than those actually performed in order to receive greater compensation. Adversarial examples could exacerbate this problem. For instance, a doctor could make slight perturbations to an image of a benign mole that cause an insurance company's autonomous billing infrastructure to misclassify it as malignant. Even as insurance companies start to require additional evidence that insurance claims are valid, adversarial examples could be used to trick their systems.

While insurance fraud is a serious problem in medicine, upcoding is not always clearly fraudulent. There are also cases in which doctors might use upcoding to improve a patient's experience by making sure they have access to certain drugs or treatments that would ordinarily be denied by insurance companies. Similarly, the "Law and Adversarial Machine Learning" paper encourages machine learning researchers to consider how the autonomous systems they build can both benefit individual users and also be used against them. The authors caution researchers that oppressive governments may use the tools they build to violate the privacy and free speech of their people. At the same time, people living in oppressive states could employ adversarial examples to evade the state's facial recognition systems. Both of these examples demonstrate that deciding what to do about adversarial examples is not straightforward.

The papers also make recommendations for crafting interventions for problems caused by adversarial examples. In the medical context, the authors suggest that the "procrastination principle," a concept from the early days of the internet that argued against changing the internet's architecture to preempt problems, might be applicable to adversarial examples as well. The authors caution that addressing problems related to adversarial examples in healthcare too early could create ineffective regulation and prevent innovation in the field. Instead, the authors propose extending existing regulations and taking small steps, such as creating "fingerprint" hashes of the data submitted as part of an insurance claim, to address concerns about adversarial examples.

In the "Law and Adversarial Machine Learning" paper, the authors emphasize that lawyers and policymakers need help from machine learning researchers to create the best machine learning policies possible. As such, they recommend that machine learning developers assess the risk of adversarial attacks and evaluate existing defense systems on their effectiveness in order to help policymakers understand how laws may be interpreted and how they should be enforced. The authors also suggest that machine learning developers build systems that make it easier to determine whether an attack has occurred, how it occurred and who might be responsible. For example, designers could devise a system that can "alert when the system is under adversarial attack, recommend appropriate logging, construct playbooks for incident response during an attack, and formulate a remediation plan to recover from an attack." Lastly, the authors remind machine learning developers to keep in mind how machine learning and adversarial examples may be used to both violate and protect civil liberties.

Credits

Notes by Samantha Bates

Syllabus by Samantha Bates, John Bowers and Natalie Saltiel

Jonathan Zittrain and I are co-teaching a class for the third time. This year, the title of the course is Applied Ethical and Governance Challenges in Artificial Intelligence. It is a seminar, which means that we invite speakers for most of the classes and usually talk about their papers and their work. The speakers and the papers were mostly curated by our amazing teaching assistant team - Samantha Bates, John Bowers and Natalie Saltiel.

One of the things that Sam does is help prepare for class by summarizing the papers and the flow of each session, and I realized that it was a waste for this work to just be crib notes for the instructors. I asked Sam for permission to publish the notes and the syllabus on my blog as a way for people to learn some of what we are learning and to start potentially interesting conversations.

The course is structured as three sets of three classes on three focus areas. Previous classes were more general overviews of the space, but as the area of research matured, we realized that it would be more interesting to go deep in key areas than to go over what a lot of people probably already know.

We chose three main topics: fairness, interpretability, and adversarial examples. We then organized the classes to hit each topic three times, starting with diagnosis (identifying the technical root of the problem), then prognosis (exploring the social impact of those problems) then intervention (considering potential solutions to the problems we've identified while taking into account the costs and benefits of each proposed solution). See the diagram below for a visual of the structure.

The students in the class are half MIT and half Harvard students with diverse areas of expertise including software engineering, law, policy and other fields. The class has really been great and I feel that we're going deeper on many of the topics than I've ever gone before. The downside is that we are beginning to see how difficult the problems are. Personally, I'm feeling a bit overwhelmed by the scale of the work we have ahead of us to try to minimize the harm to society by the deployment of these algorithms.

We just finished the prognosis phase and are about to start intervention. I hope that we find something to be optimistic about as we enter that phase.

Please find below the summary and the syllabus for the introduction and the first phase - the diagnosis phase - by Samantha Bates along with links to the papers.

The tl;dr summary of the first phase is... we have no idea how to define fairness and it probably isn't reducible to a formula or a law, but it is dynamic. Interpretability sounds like a cool word, but as Zachary Lipton said in his talk to our class, it is a "wastebasket taxon," like the word "antelope": we call anything that sort of looks like an antelope an antelope, even if it has no real relationship to other antelopes. A bunch of students from MIT made it very clear to us that we are not prepared for adversarial attacks and that it was unclear whether we could build algorithms that were both robust against these attacks and still functionally effective.

Part 1: Introduction and Diagnosis

By Samantha Bates

Syllabus Notes: Introduction and Diagnosis Stage

This first post summarizes the readings assigned for the first four classes, which encompass the introduction and the diagnosis stage. In the diagnosis stage, the class identified the core problems in AI related to fairness, interpretability, and adversarial examples and considered how the underlying mechanisms of autonomous systems contributed to those problems. As a result, our class discussions involved defining terminology and studying how the technology works. Included below is the first part of the course syllabus along with notes summarizing the main takeaways from each of the assigned readings.

Class Session 1: Introduction

In our first class session, we presented the structure and motivations behind the course, and set the stage for later class discussions by assigning readings that critique the current state of the field.

Both readings challenge the way Artificial Intelligence (AI) research is currently conducted and talked about, but from different perspectives. Michael Jordan's piece is mainly concerned with the need for more collaboration across disciplines in AI research. He argues that we are experiencing the creation of a new branch of engineering that needs to incorporate non-technical as well as engineering challenges and perspectives. "Troubling Trends in Machine Learning Scholarship" focuses more on falling standards and non-rigorous research practices in the academic machine learning community. The authors rightly point out that academic scholarship must be held to the highest standards in order to preserve public and academic trust in the field.

We chose to start out with readings that critique the current state of the field because they encourage students to think critically about the papers they will read throughout the semester. Just as the readings show that the use of precise terminology and explanation of thought are particularly important to prevent confusion, we challenge students to carefully consider how they present their own work and opinions. The readings set the stage for our deep dives into specific topic areas (fairness, interpretability, adversarial AI) and also set some expectations about how students should approach the research we will discuss throughout the course.

Class Session 2: Diagnosing problems of fairness

For our first class in the diagnosis stage, the class was joined by Cathy O'Neil, a data scientist and activist who has become one of the leading voices on fairness in machine learning.

Cathy O'Neil's book, Weapons of Math Destruction, is a great introduction to predictive models, how they work, and how they can become biased. She refers to flawed models that are opaque, scalable, and have the potential to damage lives (frequently the lives of the poor and disadvantaged) as Weapons of Math Destruction (WMDs). She explains that despite good intentions, we are more likely to create WMDs when we don't have enough data to draw reliable conclusions, use proxies to stand in for data we don't have, and try to use simplistic models to understand and predict human behavior, which is much too complicated to accurately model with just a handful of variables. Even worse, most of these algorithms are opaque, so the people impacted by these models are unable to challenge their outputs.

O'Neil demonstrates that the use of these types of models can have serious unforeseen consequences. Because WMDs are a cheap alternative to human review and decision-making, WMDs are more likely to be deployed in poor areas, and thus tend to have a larger impact on the poor and disadvantaged in our society. Additionally, WMDs can actually lead to worse behavior. In O'Neil's example of the Washington D.C. School District's model that used student test scores to identify and root out ineffective teachers, some teachers changed their students' test scores in order to protect their jobs. Although the WMD in this scenario was deployed to improve teacher effectiveness, it actually had the opposite effect by creating an unintended incentive structure.

The optional reading, "The Scored Society: Due Process for Automated Predictions," discusses algorithmic fairness in the credit scoring context. Like Cathy O'Neil, the authors contend that credit scoring algorithms exacerbate existing social inequalities and argue that our legal system has a duty to change that. They propose opening the credit scoring and credit sharing process to public review while also requiring that credit scoring companies educate individuals about how different variables influence their scores. By attacking the opacity problem that Cathy O'Neil identified as one of three characteristics of WMDs, the authors believe the credit scoring system can become more fair without infringing on intellectual property rights or requiring that we abandon the scoring models altogether.

Class Session 3: Diagnosing problems of interpretability

Zachary Lipton, an Assistant Professor at Carnegie Mellon University who is working intensively on defining and addressing problems of interpretability in machine learning, joined the class on Day 3 to discuss what it means for a model to be interpretable.

Class session three was our first day discussing interpretability, so both readings consider how best to define interpretability and why it is important. Lipton's paper asserts that interpretability reflects a number of different ideas and that its current definitions are often too simplistic. His paper primarily raises stage-setting questions: What is interpretability? In what contexts is interpretability most necessary? Does creating a model that is more transparent or can explain its outputs make it interpretable?

Through his examination of these questions, Lipton argues that the definition of interpretability depends on why we want a model to be interpretable. We might demand that a model be interpretable so that we can identify underlying biases and allow those affected by the algorithm to contest its outputs. We may also want an algorithm to be interpretable in order to provide more information to the humans involved in the decision, to give the algorithm more legitimacy, or to uncover possible causal relationships between variables that can then be tested further. By clarifying the different circumstances in which we demand interpretability, Lipton argues that we can get closer to a working definition of interpretability that better reflects its many facets.

Lipton also considers two types of proposals to improve interpretability: increasing transparency and providing post-hoc explanations. The increasing transparency approach can apply to the entire model (simulatability), meaning that a user should be able to reproduce the model's output if given the same input data and parameters. We can also improve transparency by making the different elements of the model (the input data, parameters, and calculations) individually interpretable, or by showing that during the training stage, the model will come to a unique solution regardless of the training dataset. However, as we will discuss further during the interventions stage of the course, providing more transparency at each level does not always make sense depending on the context and the type of model employed (for example a linear model vs. a neural network model). Additionally, improving the transparency of a model may decrease the model's accuracy and effectiveness. A second way to improve interpretability is to require post-hoc interpretability, meaning that the model must explain its decision-making process after generating an output. Post-hoc explanations can take the form of text, visuals, saliency maps, or analogies that show how a similar decision was reached in a similar context. Although post-hoc explanations can provide insight into how individuals affected by the model can challenge or change its outputs, Lipton cautions that these explanations can be unintentionally misleading, especially if they are influenced by our human biases.

Ultimately, Lipton's paper concludes that it is extremely challenging to define interpretability given how much it depends on external factors like context and the motivations for making a model interpretable. Without a working definition of the term, it remains unclear how to determine whether a model is interpretable. While the Lipton paper focuses more on defining interpretability and considering why it is important, the optional reading, "Towards a Rigorous Science of Interpretable Machine Learning," dives deeper into the various methods used to determine whether a model is interpretable. The authors define interpretability as the "ability to explain or present in understandable terms to a human" and are particularly concerned about the lack of standards for evaluating interpretability.

Class Session 4: Diagnosing vulnerabilities to adversarial examples

In our first session on adversarial examples, the class was joined by LabSix, a student-run AI research group at MIT that is doing cutting-edge work on adversarial techniques. LabSix gave a primer on adversarial examples and presented some of its own work.

The Gilmer et al. paper is an accessible introduction to adversarial examples that defines them as "inputs to machine learning models that an attacker has intentionally designed to cause the model to make a mistake." The main thrust of the paper is an examination of the different scenarios in which an attacker may employ adversarial examples. The authors develop a taxonomy to categorize these different types of attacks: "indistinguishable perturbation, content-preserving perturbation, non-suspicious input, content-constrained input, and unconstrained input." For each category of attack, the authors explore the different motivations and constraints of the attacker. By gaining a better understanding of the different types of attacks and the tradeoffs of each type, the authors argue that the designers of machine learning systems will be better able to defend against them.
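The core mechanism behind indistinguishable perturbations can be sketched with a linear classifier - an illustration in the spirit of the adversarial-examples literature, not LabSix's or Gilmer et al.'s own method. The idea: a perturbation that is tiny in every coordinate can still swing the model's score, because the per-coordinate shifts accumulate across many dimensions. The classifier, weights, and numbers below are all invented for illustration.

```python
# Why tiny perturbations fool high-dimensional linear models:
# shifting every input coordinate by epsilon in the direction of the
# weight's sign moves the score by epsilon * sum(|w|), which grows
# with dimensionality even while each coordinate barely changes.
# All values here are invented for illustration.

def classify(x, w, bias=0.0):
    """Linear classifier: positive score -> class 1, else class 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + bias
    return 1 if score > 0 else 0

def adversarial_perturbation(x, w, epsilon):
    """Nudge each coordinate by +/- epsilon, following the weight signs."""
    return [xi + epsilon * (1 if wi > 0 else -1) for xi, wi in zip(x, w)]

dim = 100
w = [0.5] * dim     # model weights
x = [-0.02] * dim   # clean input: score = 100 * 0.5 * -0.02 = -1.0

x_adv = adversarial_perturbation(x, w, epsilon=0.05)  # score becomes +1.5

print(classify(x, w))      # 0
print(classify(x_adv, w))  # 1: a 0.05 nudge per coordinate flips the label
```

This is the "indistinguishable perturbation" end of the taxonomy; the other categories relax the constraint that the change be imperceptible.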

The paper also includes an overview of the perturbation defense literature, which the authors criticize for failing to consider adversarial example attacks in plausible, real-world situations. For example, a common hypothetical situation posed in the defense literature is an attacker perturbing the image of a stop sign in an attempt to confuse a self-driving car. The Gilmer et al. paper, however, points out that the engineers of the car would have considered and prepared for naturally occurring misclassification errors caused by the system itself or by real-world events (for example, the stop sign could be blown over by the wind). The authors also argue that there are likely easier, non-technical methods that attackers could use to confuse the car, so the hypothetical is not the most realistic test case. The authors' other main critique of the defense literature is that it does not acknowledge how improving certain aspects of a system's defense structure can make other aspects of the system less robust and thus more vulnerable to attack.

The recommended reading by Christian Szegedy et al. is much more technical and requires some machine learning background to understand all of the terminology. Although it is a challenging read, we included it in the syllabus because it introduced the term "adversarial examples" and laid some of the foundation for research on this topic.



Credits

Figure and Notes by Samantha Bates

Syllabus by Samantha Bates, John Bowers and Natalie Saltiel

During the Long Hot Summer of 1967, race riots erupted across the United States. The 159 riots--or rebellions, depending on which side you took--were mostly clashes between the police and African Americans living in poor urban neighborhoods. The disrepair of these neighborhoods before the riots began and the difficulty in repairing them afterward was attributed to something called redlining, an insurance-company term for drawing a red line on a map around parts of a city deemed too risky to insure.

In an attempt to improve recovery from the riots and to address the role redlining may have played in them, President Lyndon Johnson created the President's National Advisory Panel on Insurance in Riot-Affected Areas in 1968. The report from the panel showed that once a minority community had been redlined, the red line established a feedback cycle that continued to drive inequity and deprive poor neighborhoods of financing and insurance coverage--redlining compounded the poor economic conditions that already affected these areas. There was a great deal of evidence at the time that insurance companies were engaging in overtly discriminatory practices, including redlining, when selling insurance to racial minorities, and would-be home- and business-owners were unable to get loans because financial institutions require insurance when making loans. Even before the riots, people in these neighborhoods couldn't buy or build or improve or repair because they couldn't get financing.

Because of the panel's report, laws were enacted outlawing redlining and creating incentives for insurance companies to invest in developing inner-city neighborhoods. But redlining continued. To justify their discriminatory pricing or their refusal to sell insurance in urban centers, insurance companies developed sophisticated arguments about the statistical risks that certain neighborhoods presented.

The argument insurers used back then--that their job was purely technical and that it didn't involve moral judgments--is very reminiscent of the arguments made by some social network platforms today: that they are technical platforms running algorithms and should not be, and are not, involved in judging the content. Insurers argued that their job was to adhere to technical, mathematical, and market-based notions of fairness and accuracy and provide what was viewed--and is still viewed--as one of the most essential financial components of society. They argued that they were just doing their jobs. Second-order effects on society were really not their problem or their business.

Thus began the contentious career of the notion of "actuarial fairness," an idea that would spread in time far beyond the insurance industry into policing and paroling, education, and eventually AI, igniting fierce debates along the way over the push by our increasingly market-oriented society to define fairness in statistical and individualistic terms rather than relying on the morals and community standards used historically.

Risk spreading has been a central tenet of insurance for centuries. Risk classification has a shorter history. The notion of risk spreading is the idea that a community such as a church or village could pool its resources to help individuals when something unfortunate happened, spreading risk across the group--the principle of solidarity. Modern insurance began to assign a level of risk to an individual so that others in the pool with her had roughly the same level of risk--an individualistic approach. This approach protected individuals from carrying the expense of someone with a more risk-prone and costly profile. This individualistic approach became more prevalent after World War II, when the war on communism made anything that sounded too socialist unpopular. It also helped insurance companies compete in the market. By refining their risk classifications, companies could attract what they called "good risks." This saved them money on claims and forced competitors to take on more expensive-to-insure "bad risks."

(A research colleague of mine, Rodrigo Ochigame, who focuses on algorithmic fairness and actuarial politics, directed me to historian Caley Horan, who is working on an upcoming book titled Insurance Era: The Privatization of Security and Governance in the Postwar United States that will elaborate on many of the ideas in this article, which is based on her research.)

The original idea of risk spreading and the principle of solidarity was based on the notion that sharing risk bound people together, encouraging a spirit of mutual aid and interdependence. By the final decades of the 20th century, however, this vision had given way to the so-called actuarial fairness promoted by insurance companies to justify discrimination.

While discrimination was initially based on outright racist ideas and unfair stereotypes, insurance companies evolved and developed sophisticated-seeming calculations to show that their discrimination was "fair." Women should pay more for annuities because statistically they lived longer, and blacks should pay more for damage insurance when they lived in communities where crime and riots were likely to occur. While overt racism and bigotry still exist across American society, in insurance they have been integrated into and hidden from the public behind mathematics and statistics that are so difficult for nonexperts to understand that fighting back becomes nearly impossible.

By the late 1970s, women's activists had joined civil rights groups in challenging insurance redlining and risk-rating practices. These new insurance critics argued that the use of gender in insurance risk classification was a form of sex discrimination. Once again, insurers responded to these charges with statistics and mathematical models. Using gender to determine risk classification, they claimed, was fair; the statistics they used showed a strong correlation between gender and the outcomes they insured against.

And many critics of insurance inadvertently bought into the actuarial fairness argument. Civil rights and feminist activists in the late 20th century lost their battles with the insurance industry because they insisted on arguing about the accuracy of certain statistics or the validity of certain classifications rather than questioning whether actuarial fairness--an individualistic notion of market-driven pricing fairness--was a valid way of structuring a crucial and fundamental social institution like insurance in the first place.

But fairness and accuracy are not necessarily the same thing. For example, when Julia Angwin pointed out in her ProPublica report that risk scores used by the criminal justice system were biased against people of color, the company that sold the algorithmic risk-score system argued that its scores were fair because they were accurate: the scores accurately predicted that people of color were more likely to reoffend. This likelihood of reoffense--the recidivism rate, or the chance that someone recommits a crime after being released--is calculated primarily from arrest data. But that correlation contributes to discrimination, because using arrests as a proxy for recommitting a crime means the algorithm codifies biases in arrests, such as a police officer's tendency to arrest more people of color or to patrol poor neighborhoods more heavily. This risk of recidivism is used to set bail and determine sentencing and parole, and it informs predictive policing systems that direct police to neighborhoods likely to have more crime.

There are several obvious problems with this. If you believe the risk scores are accurate in predicting the future outcomes of a certain group of people, then it means it's "fair" that a person is more likely to spend more time in jail simply because they are black. This is actuarially "fair" but clearly not "fair" from a social, moral, or anti-discrimination perspective.
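The tension between the vendor's claim and the bias finding can be made concrete with a small worked example. The counts below are hypothetical, not drawn from the actual ProPublica analysis: a risk score can be equally "accurate" for two groups in the calibration sense--among people labeled high risk, the same fraction go on to reoffend--while still producing very different false positive rates when the groups' (arrest-driven) base rates differ.

```python
# Hypothetical confusion-matrix counts for two groups scored by the
# same risk tool. "Accurate" here means calibrated: the precision of a
# high-risk label is identical for both groups.

def rates(tp, fp, fn, tn):
    precision = tp / (tp + fp)   # the vendor's notion of accuracy
    fpr = fp / (fp + tn)         # the disparity ProPublica measured
    return precision, fpr

# Group A: base rate 0.6 (60 of 100 reoffend per arrest data)
group_a = rates(tp=60, fp=20, fn=0, tn=20)
# Group B: base rate 0.2 (20 of 100 reoffend per arrest data)
group_b = rates(tp=12, fp=4, fn=8, tn=76)

print(group_a)  # precision 0.75, false positive rate 0.50
print(group_b)  # precision 0.75, false positive rate 0.05
```

Both groups see the same precision, yet ten times as many non-reoffenders in group A are wrongly labeled high risk--"accurate" and "fair" come apart as soon as base rates differ.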

The other problem is that there are fewer arrests in rich neighborhoods, not because people there aren't smoking as much pot as in poor neighborhoods but because there is less policing. Obviously, one is more likely to be rearrested if one lives in an overpoliced neighborhood, and that creates a feedback loop--more arrests mean higher recidivism rates. In very much the same way that redlining in minority neighborhoods created a self-fulfilling prophecy of uninsurable communities, overpolicing and predictive policing may be "fair" and "accurate" in the short term, but the long-term effects on communities have been shown to be negative, creating self-fulfilling prophecies of poor, crime-ridden neighborhoods.
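The proxy problem above can be sketched in a few lines. The numbers are invented for illustration: two neighborhoods have the same underlying rate of reoffense, but because arrests--not offenses--feed the recidivism statistic, the more heavily policed neighborhood appears far more "criminal," and a system trained on that data will send still more police there.

```python
# Toy model (hypothetical numbers): identical true reoffense rates,
# different policing intensity, arrests used as the proxy for crime.

true_reoffense_rate = 0.10  # the same in both neighborhoods

# Probability that a given offense leads to an arrest, which depends
# on how heavily the neighborhood is patrolled.
arrest_probability = {"overpoliced": 0.80, "underpoliced": 0.20}

# Measured "recidivism" is arrests, not offenses.
measured_recidivism = {
    n: true_reoffense_rate * p for n, p in arrest_probability.items()
}

print(measured_recidivism)
# The overpoliced neighborhood appears four times more "recidivist"
# despite identical true rates, attracting more patrols and more
# arrests--the feedback loop the paragraph above describes.
```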

Angwin also showed in a recent ProPublica report that, despite regulations, insurance companies charge minority communities higher premiums than white communities, even when the risks are the same. The Spotlight team at The Boston Globe reported that the household median net worth in the Boston area was $247,500 for whites and $8 for nonimmigrant blacks--the result of redlining and unfair access to housing and financial services. So while redlining for insurance is not legal, when Amazon decides to provide Amazon Prime free same-day shipping to its "best" customers, it's effectively redlining--reinforcing the unfairness of the past in new and increasingly algorithmic ways.

Like the insurers, large tech firms and the computer science community also tend to frame "fairness" in a depoliticized, highly technical way involving only mathematics and code, which reinforces a circular logic. AI is trained to use the outcomes of discriminatory practices, like recidivism rates, to justify continuing practices such as incarceration or overpolicing that may contribute to the underlying causes of crime, such as poverty, difficulty getting jobs, or lack of education. We must create a system that requires long-term public accountability and understandability of the effects on society of policies developed using machines. The system should help us understand, rather than obscure, the impact of algorithms on society. We must provide a mechanism for civil society to be informed and engaged in the way in which algorithms are used, optimizations set, and data collected and interpreted.

The computer scientists of today are more sophisticated in many ways than the actuaries of yore, and they are often sincerely trying to build algorithms that are fair. The new literature on algorithmic fairness usually doesn't simply equate fairness with accuracy, but instead defines various trade-offs between fairness and accuracy. The problem is that fairness cannot be reduced to a simple self-contained mathematical definition--fairness is dynamic and social, not a statistical issue. It can never be fully achieved and must be constantly audited, adapted, and debated in a democracy. By merely relying on historical data and current definitions of fairness, we will lock in the accumulated unfairness of the past, and our algorithms and the products they support will always trail society, reflecting past norms rather than future ideals and slowing social progress rather than supporting it.