Many vendors provide “fairness audits” that claim to be able to measure when bias is taking place and recommend course corrections. In practice, many of these tools have only considered race and gender as dimensions of diversity. And if they are considering disability, the methodology behind it is likely flawed.
In 1978, the federal government adopted the Uniform Guidelines on Employee Selection Procedures to determine what constitutes a discriminatory employment test or personnel decision. The guidelines state that “a selection rate for any race, sex, or ethnic group which is less than four-fifths (or 80%) of the rate for the group with the highest rate will generally be regarded by the Federal enforcement agencies as evidence of adverse impact, while a greater than four-fifths rate will generally not be regarded by Federal enforcement agencies as evidence of adverse impact.”
While not a definitive test, this “four-fifths rule” has traditionally been used as the primary method to determine adverse impact. It works by comparing the treatment of bounded identity groups with a default. But from the data perspective, there are no common bounded identifiers for people with disabilities. Disabilities are highly diverse and people with disabilities appear as statistical outliers in data analysis. Accuracy is further complicated by the fact that half of disabilities are invisible, and only 39% of employees with disabilities disclose to their managers.
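To make the four-fifths comparison concrete, here is a minimal Python sketch of the arithmetic. The function names, group labels, and applicant counts are hypothetical and chosen only for illustration; this is not an official or complete adverse-impact test.

```python
def selection_rates(outcomes):
    """Map each group label to its selection rate.

    `outcomes` maps a group label to (selected, applicants).
    Groups with no applicants are skipped.
    """
    return {
        group: selected / applicants
        for group, (selected, applicants) in outcomes.items()
        if applicants > 0
    }


def impact_ratios(outcomes):
    """Divide each group's selection rate by the highest group's rate."""
    rates = selection_rates(outcomes)
    top_rate = max(rates.values())
    if top_rate == 0:
        # Nobody was selected in any group, so there is nothing to compare.
        return {group: 1.0 for group in rates}
    return {group: rate / top_rate for group, rate in rates.items()}


def flag_adverse_impact(outcomes, threshold=0.8):
    """Flag groups whose ratio falls below four-fifths (0.8)."""
    return {
        group: ratio < threshold
        for group, ratio in impact_ratios(outcomes).items()
    }


# Hypothetical counts, for illustration only: (selected, applicants).
example = {
    "group_a": (48, 80),
    "group_b": (12, 40),
}

if __name__ == "__main__":
    print(impact_ratios(example))
    print(flag_adverse_impact(example))
```

Note that every applicant has to carry a group label before any of this arithmetic can run. That is the step that breaks down for disability, where there is no common bounded identifier and many people do not disclose.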
Jutta Treviranus, Director of the Inclusive Design Research Centre at OCAD University in Toronto, Canada, notes that reducing AI bias requires approaches rooted in the jagged starburst of human data rather than simple bell curves.
The Jagged Starburst
If we were to take the preferences and requirements of any group of people and plot them on a multivariate scatterplot, it would look like a starburst.
About 80% of the dots fall in the middle, in a dense cluster covering 20% of the space. The remaining 20% of the dots are scattered in the 80% of remaining space.
- Dots in the middle are close together, meaning they’re very similar to each other.
- Dots away from the center are much further apart from each other, meaning they are more and more different.
Data science predictions will be highly accurate for the dots in the middle, but they become less accurate for dots further from the middle. They will be plain wrong for the outliers at the edge, such as people with disabilities. In fact, outliers are often removed entirely from datasets (a practice known as “cleaning”) because they don’t appear in statistically meaningful numbers.
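The starburst and the effect of “cleaning” can be illustrated with a small simulation. This is a sketch on synthetic data, not Dr. Treviranus's method: the 80/20 split follows the description above, while the spread of each cluster, the distance-from-centroid filter, and the cutoff of two standard deviations are assumptions picked for illustration.

```python
import math
import random
import statistics


def make_starburst(n=500, core_share=0.8, seed=0):
    """Return labeled 2D points: a dense core plus a scattered periphery."""
    rng = random.Random(seed)
    n_core = int(n * core_share)
    # Assumed spreads: a tight Gaussian core and a wide uniform periphery.
    core = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5), "core")
            for _ in range(n_core)]
    edge = [(rng.uniform(-10, 10), rng.uniform(-10, 10), "edge")
            for _ in range(n - n_core)]
    return core + edge


def mean_nearest_neighbor_distance(points, label):
    """Average distance from each `label` point to its closest other point."""
    distances = []
    for i, p in enumerate(points):
        if p[2] != label:
            continue
        nearest = min(math.dist(p[:2], q[:2])
                      for j, q in enumerate(points) if j != i)
        distances.append(nearest)
    return statistics.fmean(distances)


def clean_outliers(points, z=2.0):
    """Drop points whose distance from the centroid exceeds mean + z * stdev."""
    cx = statistics.fmean(p[0] for p in points)
    cy = statistics.fmean(p[1] for p in points)
    dist = [math.dist(p[:2], (cx, cy)) for p in points]
    cutoff = statistics.fmean(dist) + z * statistics.pstdev(dist)
    return [p for p, d in zip(points, dist) if d <= cutoff]


if __name__ == "__main__":
    points = make_starburst()
    kept = set(clean_outliers(points))
    removed = [p for p in points if p not in kept]
    print("core neighbor distance:",
          mean_nearest_neighbor_distance(points, "core"))
    print("edge neighbor distance:",
          mean_nearest_neighbor_distance(points, "edge"))
    print("removed by cleaning:",
          sum(p[2] == "core" for p in removed), "core,",
          sum(p[2] == "edge" for p in removed), "edge")
```

In this setup, points in the core have close neighbors and points on the periphery do not, so a model has far fewer similar examples to learn from at the edge. The cleaning step only ever removes points from the scattered periphery.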
Designing for Inclusion
Designing for the 80% middle cluster is not just unfair and inaccurate. It also optimizes for homogeneity and conformity, which is fundamentally at odds with diversity and inclusion goals.
That’s also a bad plan for designing good systems generally. That outer edge is where we find innovation, not the complacent middle. AI research and data analytics should work to discover the diverse range, which is located on the edge or the periphery.
Systems that value the edge of our human scatterplot:
- Adapt to change & respond to the unexpected
- Detect risk
- Transfer to new contexts
- Gain greater dynamic resilience & longevity
- Reduce disparity
To learn more about why data systems can’t recognize or understand people with disabilities through traditional statistical modeling methods, listen to the podcast episode “How Artificial Intelligence Creates Discrimination in #HR & #Recruiting” with Dr. Treviranus.
Researchers like Dr. Treviranus are working on new methodologies to create inclusive datasets. But at this time, employers can’t use traditional statistical auditing effectively to know if they are discriminating against people with disabilities. Removing obvious data sources, such as demographic data, is insufficient.