Automating NYC

and (en)coding inequality?


Maybe you've heard that New York City has been using algorithms to improve our lives.

Like preventing fires by predicting and inspecting high-risk buildings.

Or reworking high school admissions to more effectively match students.

Or maybe you've heard that they're making people's lives worse.

Like targeting Black and Latinx communities with stop and frisk.

Or setting bail and sentencing people using an algorithm whose results were biased against Black people.



These are all examples of automated decision systems (ADSs) — processes that rely on computerized components to make or influence a decision.

We want to empower New Yorkers to advocate for ADSs that work to undo unjust systems instead of encoding inequality.

1


Why Do ADSs Matter?

Just ask Porfirio Mejia.*

Since 2012, Porfirio's been running 128 P&L Deli Grocery, a bodega in Washington Heights.

Locals know they can count on him for groceries, even when times are tough. Just as important, the bodega serves as a space for building community — a place where neighbors go to hang out, watch baseball, and smoke cigars.

Like many of his neighbors, Porfirio is a Dominican immigrant. He works with an anti-hunger advocacy group and knows the importance of the Supplemental Nutrition Assistance Program (SNAP), also known as food stamps. More than half of his customers bought their groceries with SNAP benefits from the US Department of Agriculture (USDA).

But when their benefits hadn't come in yet, he let patrons take groceries home trusting that they would come back later to settle their bills, like an informal IOU system. This IOU system allowed Porfirio to help members of his community feel secure about their food even if they felt insecure about their finances.

*Original reporting by The Intercept and The New Food Economy

In 2018, a computer program almost put his bodega out of business.

Porfirio got a notice from the USDA that his establishment was disqualified from accepting SNAP on suspicion of food stamp fraud. Earlier that year, the City, coordinating with the USDA, started using a computer program to find cases of trading cash for food stamps.

While the computer program hasn't been made public, it's likely that Porfirio's IOU system triggered the fraud suspicion.

The automated decision system (ADS) looked like this

if single item purchase > $100
and multiple such purchases
then flag for fraud

1. Sales Data

Sales data gets tracked on food stamp cards like electronic benefit transfer (EBT) cards and sent to the USDA.

2. Fraud Detection

A computer program uses an algorithm and sales data to flag potential fraud. The algorithm used by the program may have noticed large one-time purchases made at P&L Deli and mistakenly assumed it was a case of 'cash for food stamps' fraud.

3. Fraud Notice

Then, USDA staff sent a fraud notice to Porfirio.
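The USDA's actual program isn't public, so the rule above is an educated guess. As a purely illustrative sketch, here's what applying a flagging rule like that to transaction data might look like; the field names, the $100 threshold, and the two-purchase cutoff are all assumptions.

```python
# Hypothetical sketch of a rule-based fraud flag like the one described above.
# The real USDA program is not public; field names and thresholds are assumptions.

def flag_for_fraud(transactions, amount_threshold=100, min_count=2):
    """Flag a store if it has multiple single purchases above the threshold."""
    large_purchases = [t for t in transactions if t["amount"] > amount_threshold]
    return len(large_purchases) >= min_count

# Example EBT transactions at one store: an IOU payoff looks just like "fraud" here.
store_transactions = [
    {"amount": 35.50},
    {"amount": 120.00},   # a customer settling last week's IOU
    {"amount": 142.75},   # another settled IOU
    {"amount": 18.20},
]
print(flag_for_fraud(store_transactions))  # True
```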



Porfirio is struggling to keep P&L Deli going.

His sales have dropped by 30%.

The USDA told Porfirio to show them itemized receipts to prove his innocence, but Porfirio's registers were only able to print total sales figures. The letters his customers sent in as proof were deemed insufficient evidence.

Porfirio has called the USDA, but no one has been able to reverse the decision made by the algorithm.

Using a flawed algorithm, this automated decision system almost destroyed Porfirio's livelihood and hurt people who really needed food.

Why should we worry about this?

P&L Deli wasn't alone: the majority of businesses impacted by this new system have been in low-income neighborhoods like Porfirio's.

ADSs seem to keep expanding into more areas of our lives. From policing to school assignments, the decisions made by ADSs can have far-reaching consequences for all of us.

Governments tell us ADSs are being used to increase efficiency and improve service delivery. But, as Porfirio's experience shows, they can have unintended harmful consequences.

But completely human systems also mess up, right?

Definitely. Just look back to the history of housing segregation and redlining.

Even though human-centric systems also make mistakes, ADSs are unique in the risk they pose. At the same time, if ADSs are created thoughtfully, they can be powerful tools to improve our lives and society.

  • ADSs can work at a more rapid pace and at a larger scale than human decision processes. This could accelerate injustices.
  • When they make mistakes, the ADS might not have human checks to override or course-correct them.
  • ADSs can be used to increase policing and criminalization of marginalized communities.
  • Implementing ADSs is often used to justify collecting a lot more data about people, breaching personal privacy.

  • The scale and speed of ADSs could be used to bring benefits to the communities that most need them when government resources and time are limited.
  • ADSs could build in more transparency than human systems so we know exactly how a decision is being made.
  • Since ADSs are using a lot more data, they might be able to see patterns that we can't see on our own, helping us uncover our own biases.

ADSs might be able to help us in powerful ways, but they can also make problems like racism, discrimination, surveillance, and inequality much worse, much faster.

2


What are Algorithms?

They're just instructions.

Like in Porfirio's case, all ADSs use computer programs that are made up of algorithms — detailed sets of automated, computerized instructions. These algorithms work together to complete a task.

Just think about how you buy an avocado. What factors do you consider?

I want one that's affordable.

Hopefully it's organic, but less than $1.50.

I'm going to make some guacamole to eat tonight,
so I want the ripest one I can find.

Hover over the avocados to find the best one.

All of these calculations you're doing in your head are an algorithm.

You’ve taken some variables (the price, how ripe it is, and whether it’s organic), given those variables weights (buying a cheap, ripe avocado is more important than buying an organic one), analyzed a set of data points (the pile of avocados), and reached a decision (which avocado to buy).
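Here's that avocado decision written out as a tiny piece of code. The weights and scoring rule below are invented to mirror the preferences above; they're an illustration, not a formula anyone actually uses.

```python
# A made-up avocado-picking algorithm: variables, weights, data points, decision.
WEIGHTS = {"ripeness": 3, "cheap": 2, "organic": 1}  # ripeness matters most

def score(avocado):
    return (WEIGHTS["ripeness"] * avocado["ripeness"]        # 0 = rock hard, 1 = perfectly ripe
            + WEIGHTS["cheap"] * (avocado["price"] < 1.50)    # is it affordable?
            + WEIGHTS["organic"] * avocado["organic"])        # nice to have

pile = [                                                      # the data points
    {"ripeness": 0.2, "price": 1.00, "organic": True},
    {"ripeness": 0.9, "price": 1.25, "organic": False},
    {"ripeness": 0.6, "price": 2.00, "organic": True},
]
best = max(pile, key=score)                                   # the decision
print(best)
```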

What kind of algorithms do ADSs use?

It depends. They can sometimes be a bit more complicated than the ones we use in our everyday decisions. Some ADSs involve multiple algorithms, hundreds of thousands of data points, and variables that are given different weights.

But really, they all use the same building blocks for the instructions — variables, weights, and data points.

In some cases, people have control over what steps the algorithm takes. People like you decide which variables matter most when buying avocados or predicting whether a store is committing food stamp fraud.

In other cases, data is given to a computer, the computer finds patterns in the data, and then decides which variables are important based on the patterns it finds.


When a computer finds patterns on its own, rather than being told which variables are important, the set of instructions it uses is called a machine learning algorithm. They are often less transparent because we don't always know why the computer places more or less importance on different variables.
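As a quick illustration of the difference, here's a sketch that hands invented data to a standard machine learning library and lets the model decide which variables matter; we never tell it. (Decision trees are on the more explainable end of machine learning; many models are far harder to inspect.)

```python
# Sketch: a machine learning algorithm finds the important variables on its own.
# The data below is invented purely for illustration.
from sklearn.tree import DecisionTreeClassifier

# Each row: [price, ripeness, organic (0/1)]; label: 1 = bought it, 0 = left it
X = [[1.00, 0.2, 1], [1.25, 0.9, 0], [2.00, 0.6, 1], [1.40, 0.8, 1], [0.90, 0.1, 0]]
y = [0, 1, 0, 1, 0]

model = DecisionTreeClassifier(random_state=0).fit(X, y)
print(model.feature_importances_)       # the model's own view of which variables matter
print(model.predict([[1.10, 0.7, 0]]))  # prediction for a new avocado
```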

Can I play with an algorithm?

We created this algorithm to help an imaginary fire department predict which buildings are at high risk for fire.

All of the boxes represent real buildings. Hover over them to see building characteristics. Click on different combinations of variables in the bubbles below to add or remove them from the algorithm.

Once the predicted fire risk passes a threshold of 30%, the building will appear with a flame to alert the fire department to inspect it!

*Created using data from the NYC Open Data Portal.
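If you're wondering what a toy version of this fire-risk algorithm could look like in code, here's a sketch. It mirrors the mechanics described above (weighted variables plus a 30% threshold), but the variables and weights are invented, not the ones behind the real widget.

```python
# Toy fire-risk score: weighted building variables plus a 30% alert threshold.
# Weights and variables are invented for illustration only.
WEIGHTS = {"height": 0.04, "building_age": 0.001, "has_business": 0.10, "residential_units": 0.002}
BASE_RISK = 0.05
THRESHOLD = 0.30

def fire_risk(building):
    risk = BASE_RISK + sum(WEIGHTS[var] * building[var] for var in WEIGHTS)
    return min(risk, 1.0)

building = {"height": 6, "building_age": 95, "has_business": 1, "residential_units": 40}
risk = fire_risk(building)
print(f"predicted risk: {risk:.0%}; inspect: {risk > THRESHOLD}")
```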

Hmm! Doesn't look like building age changed much. It has a low weight in the algorithm.

Interesting! Property value has a negative correlation. This means that the more valuable a property is, the less likely it is to catch fire.

Whoa! Height has a big impact. Looks like that variable has a high weight in the algorithm.

Seems like buildings in the Bronx have a higher risk than buildings in other boroughs. This could be because of other variables we didn't include in the dataset. Maybe buildings in the Bronx have a certain architectural style that is riskier. This is called omitted variable bias.

Bigger buildings are more likely to catch on fire. That makes sense!

More people means greater risk. Fun fact! We didn't have an exact count of the number of people living in the buildings. So we used the number of residential units as a proxy variable.

Businesses have a higher risk of fire. There could be a variable we're missing that is correlated to businesses. Maybe a lot of these businesses are restaurants where kitchen fires can happen.

Algorithms are always influenced by humans because we decide what data to use and, often, which variables are important.

3


What Makes ADSs Good or Bad?

A good ADS addresses unjust systems

Every ADS reflects a set of values, whether they’re stated or not. We think that in an unequal and unjust world, the ADSs we build should strive for equity. If they don’t, they risk worsening the unjust systems that exist.

What’s an unjust system?

An unjust system is a set of political, cultural, and economic conditions that perpetuate inequality.

For example, think of access to credit and financial resources.

Poor communities and communities of color often have limited access to credit. This is the legacy of policies like redlining that prevented people who lived in non-White and immigrant neighborhoods from getting mortgage loans. The racial and class prejudices of bankers have led to discrimination against groups that they stereotyped as less reliable. If you look around, you’ll notice that, even today, the kinds of lenders in communities of color are often high-fee check-cashing services rather than banks.

All of these conditions create a system that reinforces poverty and prevents people from building wealth.

How can ADSs address these systems?

We can start with the ADS’s purpose — what it’s intended to accomplish.

At a minimum, an ADS’s purpose should include actively understanding the unjust systems it’s operating within. At best, ADSs will actively work to undo the injustice.

In Porfirio’s case, the USDA’s ADS failed to account for his customers’ lack of access to credit and the need for an informal IOU system. If the USDA had understood the unjust system it was operating within, it could have designed its ADS to distinguish between real fraud and Porfirio’s method for extending credit.

Even better, if the USDA sought to undo injustice, it could design the ADS to target the conditions that lead to poor credit access. The ADS might identify areas to increase SNAP benefits, make them more flexible, or use even bolder solutions.

Isn’t increasing efficiency a good enough purpose?

Nope.

Most government ADSs are intended to increase efficiency. For example, the USDA’s ADS was meant to efficiently catch SNAP (food stamp) fraud. A computer is faster than a human at looking through thousands of pages of financial records.

But efficiency alone is not a good purpose. Efficiency speeds up processes, but if that process is already creating inequality, speeding it up makes things worse.

ADSs are a unique opportunity to acknowledge and address the assumptions and existing decision-making processes that might perpetuate an unjust system.

We have a good purpose, now what?

If the purpose of the ADS is good because it understands, or better yet undoes, unjust systems, we need to make sure that we design the ADS to accomplish this purpose.

There are five design decisions we should think about:
impact, bias, explainability, automation, and flexibility.


Impact

Who is affected by the ADS and how?

Who the ADS affects and how it affects them helps us evaluate the potential harm.

Scale

How many people might be affected by the ADS?

Scope

How deep of an impact will the decision have on affected people? There's deep impact when a decision affects something important. For example, knowing which building might catch on fire is a matter of life and death.

Shallow impact is when a decision affects something inessential to life or when the impact is not very severe.

Vulnerability

How vulnerable are the groups that the ADS impacts? If an ADS impacts a group that has historically faced discrimination and removes resources from that group, the ADS is probably going to worsen inequality.


Bias

What inputs in the ADS create problems?

Bias in the data or algorithm might worsen unjust systems.

Data

Data is never objective. We embed our human bias when we collect, store, and use data. All data is biased; it may contain errors, reflect existing inequalities in the world, or both.

When data is biased, it might tell us to do something that's not purely objective. It might tell us to do something based on the existing biases of the world that it's capturing.

For example, if we use arrest data by neighborhood to distribute the police force, we may send police to the same neighborhoods that they have historically policed, which leads to even more collection of arrest data in those same neighborhoods. This might then be used to justify sending even more police to that neighborhood. This creates a vicious cycle.

These biases can arise because all datasets, to some extent, are incomplete. It is impossible to capture the world absolutely and perfectly. The best datasets try to be as complete as they need to be for the purpose and acknowledge their bias.

Algorithmic Model

The type of algorithm an ADS uses and the way it handles the variables and data points matter. Choosing the best algorithm for the problem depends on the ADS's purpose. There are hundreds of different algorithms that each calculate outputs slightly differently. There will always be trade-offs in deciding what model to use, and we must think carefully about each case and the potential harms.

If humans are determining specific components of the algorithm, this can also create bias. For many algorithms, humans will be determining the outcome of interest and the variables. These decisions are subjective because they are based on our own values and world views. For example, if you don't know that geographic area is highly linked to race and socioeconomic status, you might accidentally bias the algorithm by including this variable.


Explainability

How easy is it for humans to understand how the ADS works?

When an ADS is explainable, it's easier to pinpoint where problems arise.

Transparency

Before we can even begin to understand how an ADS works, we need to be able to find information about it. To be completely transparent, an ADS's algorithms and data (as long as privacy concerns are met) should be made publicly available with documentation explaining them.

Ease of Understanding

Some algorithms are very easy to understand, like instructions for choosing an avocado. We know exactly which variables are going into the model and how they're weighted.

The avocado algorithm is a decision tree. Super explainable.
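As a rough illustration, a decision tree for picking avocados is just a short chain of yes-or-no questions, so you can read off exactly why it reached its answer. (This sketch is ours, simplified from the example earlier.)

```python
# A decision tree is a chain of yes/no questions, so every decision can be traced.
def pick_avocado(price, ripeness, organic):
    if price >= 1.50:
        return "skip: too expensive"
    if ripeness < 0.7:
        return "skip: not ripe enough for tonight's guacamole"
    if organic:
        return "buy: ripe, affordable, and organic"
    return "buy: ripe and affordable"

print(pick_avocado(price=1.25, ripeness=0.9, organic=False))  # "buy: ripe and affordable"
```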

Others can feel like a black box. We give the data to the computer and it gives us a decision, but we don't really know how it reached that decision.

A neural network algorithm is not that explainable.


Automation

How much of the decision is made by the algorithm?

If the decision is made mostly automatically, it could worsen unjust systems without humans even knowing.

Low Automation

The ADS doesn't make a decision. It simply analyzes the data, which is then used to inform a human decision.

Imagine a dashboard that tells you what percentage of reported potholes have been filled this month without suggesting a specific next action.

Medium Automation

The algorithm provides an advisory decision, but a human still looks over the decision and decides whether to implement it.

Imagine a scoring system that prioritizes buildings to inspect for fire risk.

High Automation

An algorithm makes a decision and the people using it simply implement that decision.

Imagine a traffic light system that sends you a ticket when it detects that you ran a red light.


Flexibility

Can the ADS be changed easily if people have feedback?

If there is a problem with the ADS, we want to be able to fix it.

Feedback Mechanisms

The best feedback is from people directly affected by the ADS or people who use the ADS on a regular basis.

Feedback could be in the form of an appeals mechanism, a complaint system, or something else.

Ability and Access to Update

While feedback is important, what a government can do with the feedback is limited based on whether or not they have the access and skills to update the algorithm.

This depends in part on who owns the ADS. Is it the government or a private company? Does the government have the in-house talent to update the algorithm?

Let's look at Porfirio's story through this framework.

Click or tap on the cards to open them

Impact

Who does the ADS affect and how?

Scale

Large scale. The USDA algorithm affects any business that accepts food stamps. In New York City, 1.6 million low-income people rely on SNAP.

Scope

Deep. The loss of revenue for Porfirio was significant. Moreover, the community lost a reliable food source.

Vulnerability

In general, the USDA algorithm impacts vulnerable communities because it affects communities that rely on alternative financial systems, which are usually lower-income communities and communities of color.

Bias

What inputs in the ADS create problems?

Data

Because Porfirio's bodega is in a low-income neighborhood, more of his customers are likely to pay with food stamps. While the data is technically accurate, it is geographically skewed, so bodegas like Porfirio's receive more scrutiny while bodegas in rich neighborhoods receive less.

Algorithmic Model

The algorithmic model the USDA used to identify 'cash for food stamps' fraud treated large one-time purchases as a variable indicating fraud. The model was designed in such a way that it couldn't distinguish between true fraud and Porfirio's IOU system. It simply flagged certain transactions as fraudulent, so there was bias in the model against those kinds of transactions.

Explainability

How easy is it for humans to understand how the ADS works?

The USDA's ADS is not transparent. We don't know exactly how this ADS or its algorithms work. Who looks at the results of the algorithm that is flagging cases of supposed fraud? What are the checks and balances around the decision-making process?

If we knew more about the process and components of the ADS, we would be able to see if there were design decisions to mitigate mistakes like miscategorizing Porfirio's bodega.

Automation

How much of the decision is made by the algorithm?

High

As far as we know, the USDA algorithm automatically flags transactions it categorizes as "fraud" and sends out a letter to a business owner notifying them that their ability to accept SNAP will be suspended unless they prove otherwise.

It is unclear whether a USDA employee reviews the flagged transactions before such a letter is sent.

Flexibility

Can the ADS be changed easily if people have feedback?

Feedback System

The USDA's ADS is not very flexible because there is not a good feedback system from people who are directly affected by the ADS.

Ability and Access to Update

Although business owners have the ability to submit feedback if they believe they've been wrongly flagged, this feedback has to be in a very specific form. Moreover, there is no mechanism for businesses or others to tell the USDA how it can improve its system.

Each of these design decisions interacts with the others.

For example, the more explainable an ADS is (explainability), the easier it is to give feedback and update it (flexibility). This framework isn't cut and dried!

There are also elements we hope interact but don’t always. A government might state a good purpose, but might not make design decisions that reach this goal. The ADS must always be intentionally designed to achieve a good purpose.

When we ask good questions about the purpose and the design decisions and how they interact, we can create ADSs that acknowledge or undo unjust systems.

We need to think about bigger systems when we develop the purpose and make design decisions.

4


Examples

Learn about some real ADSs being used in NYC.

New York Public Library uses an algorithm to stock books.

A library needs books, and librarians need to pick them. But how should they do that? Libraries have limited resources and they can't really test out new books by buying and returning them. How can librarians tell which books will be popular with their patrons? What else should they consider when they're picking titles for their library branch?

Libraries use ESP to guess what we want to read.

No, not extra-sensory perception.

ESP is an algorithm that uses library and non-library data to recommend books it thinks will be popular. Here's how the system works:

1. Data

ESP is fed circulation data (like which books have been checked out) and sales data from a book warehouse called Baker & Taylor. It uses book reviews, but only in aggregate (total number of reviews, not whether they were good or bad).

2. Predict

The ESP algorithm looks for historical patterns in the data and makes suggestions for what books to stock at which library branches. For example, it might predict that the young adult book Twilight will be popular at a specific branch because lots of teenagers at that branch checked out Dracula and Harry Potter.

3. Pick

Then, it's up to the librarian to choose whether or not to follow the algorithm's advice.

So, which factors matter when ESP rates books?

We don't actually know. ESP is a neural network algorithm, which is hard to understand. It's like a black box. We know which datasets ESP uses, but the patterns it's using to make recommendations are fuzzy.

To be fair, we don't always know exactly what librarians are thinking when they decide to stock particular books either. But if they're connected with the community they serve, we might trust them and draw from their experience and knowledge.

But we get better book picks, right?

Not so fast.

As a neural network, ESP works best when there is a lot of data. For established book genres and topics like knitting, it tends to pick the winners well. But it has a harder time guessing which debut fiction (an author’s first book) will be a hit with readers.

Even if we have a bunch of data from Amazon or other book sellers, the data might not be representative of many NYC communities, particularly minorities and low-income individuals. These communities are some of the largest users of library services, and they might also purchase fewer books. This means that the recommendations ESP produces might not reflect what patrons want and may not be easy to change unless we get a whole lot more data.

Also, it depends on what you mean by "better book." What if your goal is to bring in people who aren't currently using the library? Existing library patron data wouldn't allow the algorithm to figure out preferences for people who don't already check out books.

Don't librarians still have final say?

Yup! When the librarian knows a lot about their local patrons and is committed to undoing unjust systems, preserving librarians' power to make decisions helps address these possible problems with ESP. Still, librarians' choices may be swayed by what seems like a concrete, numerical score based on mathematical equations.

ESP augments a librarian's choices by highlighting gaps in a librarian's knowledge and experience. On the flip side, it could sway librarians away from trusting their own instincts on what their branch needs.

Impact

Who does the ADS affect and how?

Scale

Large scale. A lot of people use libraries.

Scope

Superficial. Books are great, but probably not as serious as other social services. It's not about life or death.

Vulnerability

This depends on how you look at it. People who need to use libraries might be more vulnerable because they don't have other ways to access knowledge.

Authors whose books don't have high sales, or who don't have a lot of data about their work, are also more likely to be vulnerable (new authors, underrepresented authors, underrepresented topics).

Bias

What inputs in the ADS create problems?

Data

ESP uses a lot of data, incorporating both library and private book sale data, but it is currently unable to take into account other forms of data, like whether book reviews were good or bad.

The data doesn't account for human bias or equity issues. Sales data may be skewed toward higher-income individuals who can buy more books. Libraries should think about the books that their unique populations need and value.

Algorithmic Model

ESP uses a neural network. While a human doesn't make a decision about the variables that matter, using such an opaque model might mean that it's harder to know what kinds of biases in the data are driving outcomes.

Explainability

How easy is it for humans to understand how the ADS works?

This ADS could be more transparent. The company that makes ESP has documentation online explaining it broadly, but the specifics of how it works and how it is used by NYPL are not publicly available.

ESP is a neural network, which is not very explainable. We can't tell which variables the algorithm is using or how important they end up being to the eventual results.

If we knew more about the process and components of the ADS, we might be able to mitigate mistakes like incorrect predictions about which books will be popular.

Automation

How much of the decision is made by the algorithm?

Medium

ESP is advisory. It gives a score, but the librarians can still use their own knowledge or priorities to decide how much the score guides their decisions on whether to buy a book, how many copies to buy, or where to place the book.

Flexibility

Can the ADS be changed easily if people have feedback?

Feedback System

Librarians have been able to give feedback to the company that runs ESP.

Ability and Access to Update

ESP is owned by a private company that has to make any updates.

Machine learning risk, with human safeguards

At its best, ESP makes useful suggestions that librarians can incorporate into their book stocking decisions. Because it depends on sales data and current circulation data, there is a risk that the results don't consider populations that are excluded from these spaces. Librarians still get final say, which is a good way to address these concerns.

Read more about ESP from the company that designed it.

NYCHA manages public housing placements with automation.

The New York City Housing Authority (NYCHA) has over 175,000 public housing units across the City, but fewer than 1% are vacated each year, and the waitlist is currently over 250,000 families long. When a unit becomes vacant, how does NYCHA sort through the long waitlist of applicants to fill their vacant units as quickly as possible?

TSAP, ASAP.

The Tenant Selection and Assignment Plan (TSAP) is an ADS that does what it says in the name: select and assign tenants to apartments that open up.

It’s built to do this as soon as possible, to make sure units don’t go unfilled for too long in a city where many people need shelter. In fact, the process begins even before units are vacated, by predicting which units will go vacant in the next six months. This helps NYCHA make sure it has staff resources and a ready waitlist of eligible and certified New Yorkers to fill apartments when they open up.

Here's a very simplified description of the process.

1. Apply & Assign

New people apply to get put onto the NYCHA waitlist. People already in public housing units also apply to transfer to other units when there are openings.

These two groups of applicants are assigned “priority codes” based on their income, needs, and other priorities for the City (like having experienced domestic abuse, hate crimes, or involuntary displacement; or being a homeless veteran; or aging out of foster care) and they are put on a general waiting list.

2. Predict

Every two weeks, TSAP uses a computer program to predict the types of units that are most likely to be vacated in the next six months.

The algorithm first looks at how many eligible and certified transfer applicants can fill those units. Then, it figures out how many new applicants from the general waiting list might be eligible and should get interviewed to be certified.

3. Verify

NYCHA staff reach out to the list of new applicants to interview them. These interviews verify that applicants actually meet the requirements for the apartment that is predicted to become vacant (making sure the applicant's family size will fit in the number of bedrooms; if the unit is specifically for elderly people, making sure the applicant is old enough; etc.). If they pass, they’ll be “certified” and placed in the pool to fill units.

4. Match

Once a vacant unit actually opens up, the computer program looks for people who are certified and eligible. It prioritizes transfers with high-needs, then transfers with less urgent needs and new applicants.

Sometimes this matching process can get tricky because there might be only one unit, but multiple certified and eligible applicants with different high needs. TSAP's matching algorithm has to figure out how to sort competing priority codes.

What kind of algorithm does TSAP use?

Actually, each step of TSAP uses a different algorithm with different levels of automation.

For example, the step for assigning priority codes involves a pretty simple decision tree that follows a series of yes or no questions about the person’s needs.

The step for predicting vacancies is more like a linear regression. The algorithm looks at past data and past vacancies and determines how important each variable is in ‘causing’ a vacancy. It then uses this formula to predict future vacancies on new data.
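Here's a minimal sketch of what that kind of regression step could look like. The variables, the historical numbers, and the library choice are all our assumptions for illustration; NYCHA's actual model isn't published in this form.

```python
# Sketch: fit a linear regression on past data, then predict future vacancies.
# Variables and numbers are invented for illustration only.
from sklearn.linear_model import LinearRegression

# Each row describes a unit type: [bedrooms, average tenant age, years since renovation]
X_past = [[1, 68, 20], [2, 45, 5], [3, 38, 12], [1, 72, 25], [2, 50, 8]]
y_past = [14, 6, 9, 16, 7]           # vacancies seen in past six-month windows

model = LinearRegression().fit(X_past, y_past)
print(model.coef_)                   # how important each variable is to the prediction
print(model.predict([[2, 60, 15]]))  # predicted vacancies for a unit type next period
```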

What if there are still more people waitlisted than there are apartments?

Unfortunately, TSAP doesn’t solve the underlying problem that there are more people on the waitlist than there are available apartments. TSAP can only help sort people who are on the waitlist into vacant units; it can't make more vacant units.

But this data, or the length of the waitlist itself, could be used to advocate for more public housing units.

Impact

Who does the ADS affect and how?

Scale

Large scale. There are hundreds of thousands of people on the waiting list who want to be assigned a new apartment.

Scope

Deep. TSAP matches people who need affordable public housing to those very units.

Vulnerability

Vulnerable population. New and existing applicants for public housing have limited access to affordable housing.

Bias

What inputs in the ADS create problems?

Data

The data TSAP uses is based on both historical data and new data input by new applicants, transfer requests, or case workers. The algorithm makes a prediction about how many units will be vacated based on how many units of a certain type were vacated in the past. However, there might be other factors at play in the present day that the algorithm can’t account for. For example, maybe that building was renovated and fewer people want to leave now. This is called omitted variable bias.

Algorithmic Model

There are multiple algorithms in this ADS. It seems like NYCHA is trying to use the best algorithm for each decision type.

Explainability

How easy is it for humans to understand how the ADS works?

TSAP is pretty transparent. NYCHA has documentation online on how TSAP works, but the description is 57 pages long!

The actual algorithms (decision trees and linear regressions) used by TSAP are pretty explainable.

The ADS as a whole can be confusing though.

Certain parts are easier to understand than others. For example, the priority codes and the qualifications for each are clearly laid out, but how vacant units are predicted and how that affects eligibility interviews is more opaque.

Automation

How much of the decision is made by the algorithm?

High

TSAP is mostly automated. For example, vacancies and suggestions for eligibility interviews are predicted using a formula. The process for matching people to vacant apartments is also automatic.

Flexibility

Can the ADS be changed easily if people have feedback?

Feedback System

TSAP can be updated through the annual plan process, which includes public hearings and a public comment period. The plan gets submitted to the US Department of Housing and Urban Development for approval.

Ability and Access to Update

It’s very updateable. There’s a NYCHA team that works solely on TSAP implementation.

High priority needs get housing, but still not enough units.

TSAP does its job: it matches people to vacant units. However, it's doing its job with a limited supply of housing. Making the matching process easier will definitely help some families, but it still leaves a lot of families on the waitlist. Is there a way to have this ADS help get at the underlying unjust system of not having enough affordable housing?

Read more about TSAP in the official documentation from 2016

DOE wants to match students with the best high school for them.

Every year, more than 80,000 New York City eighth graders transition to high schools. There are nearly 700 high school programs in more than 400 schools across the City, all with different locations, eligibility requirements, and numbers of seats. How does the DOE make sure that students go to the best high school for them?

Matchmaker, matchmaker, make me a match.

The Department of Education (DOE) uses an algorithm that goes by many names: centralized clearinghouse, two-sided deferred acceptance matching, and applicant-proposing acceptance algorithm. All of these names refer to the same algorithm that just matches students to high schools.

Here's how it works:

1. Students Submit

A student will "apply" to schools online, through a school counselor, or at a Family Welcome Center. In their application, they rank up to 12 schools in order of preference.

2. Schools Submit

At the same time each year, schools submit their information, like how many seats they have and what criteria they use to rank or prioritize students who want to apply to their school.

For example, some schools called "screened schools" might say that they give priority to students with the highest 8th grade GPA first. Some schools called "zoned schools" might say that they give priority to students who live in their neighborhoods.

3. Match

Students' first choices are tentatively matched to schools that want them.

Unmatched students are paired with their next choice.

4. Repeat

The process stops when there are no available seats remaining in schools on students’ lists of preferences.

Students who are still unmatched get to resubmit a list of ranked schools from a list of schools that still have seats for a second round of the matching process.
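For the curious, here's a compact sketch of applicant-proposing deferred acceptance, the general mechanism behind the match. It's heavily simplified (made-up names, one seat per school, no priority groups or second round), so it shows how the loop works rather than the DOE's actual implementation.

```python
# Simplified applicant-proposing deferred acceptance (one seat per school).
# Names and preferences are invented; the real system handles many seats,
# priority groups, and up to 12 ranked choices per student.

def deferred_acceptance(student_prefs, school_prefs):
    next_choice = {s: 0 for s in student_prefs}   # which school each student tries next
    held = {}                                     # school -> student tentatively accepted
    free = list(student_prefs)                    # students still looking for a seat
    while free:
        student = free.pop(0)
        prefs = student_prefs[student]
        if next_choice[student] >= len(prefs):
            continue                              # out of choices: stays unmatched
        school = prefs[next_choice[student]]
        next_choice[student] += 1
        ranking = school_prefs[school]
        current = held.get(school)
        if current is None:
            held[school] = student                # open seat: tentatively accept
        elif ranking.index(student) < ranking.index(current):
            held[school] = student                # school prefers the new student
            free.append(current)                  # bumped student tries their next choice
        else:
            free.append(student)                  # rejected: try the next choice
    return {student: school for school, student in held.items()}

students = {"Ana": ["Arts HS", "Science HS"],
            "Bo": ["Science HS", "Arts HS"],
            "Cam": ["Science HS", "Arts HS"]}
schools = {"Arts HS": ["Cam", "Ana", "Bo"],
           "Science HS": ["Ana", "Bo", "Cam"]}
print(deferred_acceptance(students, schools))  # Bo ends up unmatched, like round two
```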

Didn’t get matched?

Unfortunately the algorithm doesn’t match everyone. About 3,000 students find themselves without a school after the first round and have to go through a second round of matching.

It used to be worse. Before the DOE started using this matching algorithm, about 30,000 students weren’t getting matched with any school. In 2005, the DOE enlisted economists to help redesign it. They dropped the number of matchless students in part by having them rank twelve schools instead of just five.

So this is better, right?

Sort of.

The new algorithm helps to more efficiently “assign as many students as possible into programs that they rank highly, given constraints due to limited seats and the high schools' own admission priorities,” as the New York Independent Budget Office reported in 2016.

But the matching algorithm doesn’t challenge the underlying unjust system that maintains school segregation and the unequal distribution of educational resources. It doesn’t challenge the specific admissions priorities and preferences of schools. These priorities and preferences can perpetuate school segregation.

Also, the matching process doesn’t apply to more competitive or highly resourced schools, like charter schools or the eight specialized high schools where students qualify by taking the Specialized High School Admissions Test (SHSAT). And it definitely doesn’t apply to the one audition school in New York City (LaGuardia High School). There might be good reasons for excluding some of these schools from the system, but redesigning the ADS could have been an opportunity to think critically about these reasons.

Imagine if the ADS was designed to make schools more integrated!

Some advocates are doing just that.

They're thinking about ways that we could design the admissions priorities that schools use to rank students. They believe that if we design the priorities with the express purpose of integrating schools, we will create an algorithm that is better at undoing an unjust system.

How would you change school priorities with this purpose in mind?

Impact

Who does the ADS affect and how?

Scale

Medium. Every year more than 80,000 students are affected by this algorithm.

Scope

Deep. DOE's matching algorithm decides where students go to high school, which can have an impact on their lives well beyond high school.

Vulnerability

This ADS interacts with all students, including those from more vulnerable backgrounds. However, the ADS is more difficult to navigate for students from families with fewer resources to spend on understanding it. School priorities also tend to disadvantage students with fewer resources.

Bias

What inputs in the ADS create problems?

Data

This algorithm uses data from students and schools ranking their preferences. These datasets can be very biased. Student rankings are based on their own perceptions of how competitive they are in the process, their parents' knowledge about which schools are best, and their desire to stay with their friends. School rankings are based on their admissions priorities, which simply track what the schools want to focus on. If they want to focus on good grades, they can do that. If they want to focus on students in the neighborhood, they can do that.

Algorithmic Model

The algorithm maximizes top mutual matches for students and schools. This algorithm is used in other matching situations, like matching medical students to residencies.

Explainability

How easy is it for humans to understand how the ADS works?

The matching algorithm is explainable. It follows a clear series of steps and is based on the preferences of students and schools.

The New York Times has covered the matching algorithm and how it works, making this ADS pretty transparent.

Also, the DOE has made the process transparent to families through the High School Directory, a document that guides students through the high school application and matching process.

Automation

How much of the decision is made by the algorithm?

High

Students are automatically matched to their schools through the matching algorithm. The results are offer letters, not recommendations that are checked by another party.

Flexibility

Can the ADS be changed easily if people have feedback?

Feedback System

Students can appeal their matches to staff in DOE, but only for certain reasons like travel, housing, or medical hardships.

Ability and Access to Update

The algorithm is maintained by the DOE Office of Student Enrollment. Hypothetically, staff from that office could update the algorithm.

More students are matched, but unequal education continues.

DOE's matching algorithm sorts through a lot of preferences to mutually match schools with students. However, it doesn’t challenge school admissions priorities that disadvantage certain groups or keep schools segregated.

You can read more about DOE's algorithm in the High School Directory.

5


Get Active

We've seen how automated decision systems can increase discrimination and inequality.

But this isn't inevitable.

We can decide when the City should or shouldn't automate decisions. We can ask the tough questions to make sure that government systems don't leave out those of us who have historically been, or are currently being, ignored. We can bring together our diverse experiences to make our voices stronger.

Build Power

Connect with other members of this advocacy community.

Advocate

Contact your local elected officials to keep pushing the work of the ADS Task Force.

Learn More

Dig deeper into the articles, papers, and books we used to make the website.

6


About Us

We're four data and design nerds determined to make government services the best they can be. This website is our master's thesis for the Harvard Kennedy School.


Aki

Project Management
+
Data Wrangler

       

Aki Younge is a futurist thinker and racial justice advocate who believes in empowering communities to use data and technology to dismantle systems of oppression.


Deepra

Content Development
+
Research

       

Deepra Yusuf is a native New Yorker, former government analyst, and design thinker committed to finding ways governments and technology can form ethical and symbiotic relationships.


Elyse

Design
+
Engineering

           

Elyse Voegeli is a data analyst and designer committed to making tech accountable, accessible, and delightful.


Jon

Design
+
Content Development

   

Jon Truong works at the intersection of digital tools and social change in urban settings.

Special thanks to our advisor, Julie Wilson, the Jain Family Institute for their support, Emily Chu for her technical mentorship, and Momin Malik for his expertise. Most of our icons are custom, but thanks to Font Awesome and the Feather Icons library for the others. All other visuals are our own. And finally, shout out to all the amazing people who user tested for us! You know who you are.

Check us out on Github!     Feedback or questions? Send us an email!