A recent paper on algorithmic bias in hiring, “Closing the GAP: Group-Aware Parallelization for Online Selection of Candidates with Biased Evaluation,” by Jad Salem and Swati Gupta at Georgia Tech, highlighted an approach to addressing the structural bias built into computer-mediated hiring processes.
Before addressing the problem, it is important to place the issue of hiring bias in its historical context. Much of the law of hiring is designed to stop explicit practices that bar women and minorities from the workforce. (For more, see this helpful Findlaw Guide.) Beyond explicit bars on discrimination based on race, religion, sexual orientation, age, disability, and citizenship, however, there is little the law can do to address subtle or implicit bias in judgments about the quality of job applicants. For example, the Salem and Gupta paper describes a study in which an identical resume used the name “John” for one set of potential employers and “Jennifer” for another set of potential employers. “In particular, hireability was measured on a five-point scale; John’s average score was 3.8, and Jennifer’s average score was 2.9.” Since the resumes for John and Jennifer were identical, the reduction came from the only piece of information that was different – the name and what it meant to the reader.
Here is another example: Researchers in a 2003 study “sent resumes with either African-American- or white-sounding names and then measured the number of callbacks each resume received for interviews.” Resumes with African-American-sounding names required fifty percent more applications to get an interview than resumes with white-sounding names.
Here is a third example: In a change made to the hiring process for orchestras, the use of screens during candidate auditions significantly increased the number of women hired for classical orchestras. The number increased even further once the process included the removal of shoes, so that clicking of heels across the stage by female candidates no longer revealed gender.
Given these examples of how pervasive human bias is in the hiring process, algorithms might seem like a great solution. But although algorithms may help, humans rely on too many clues to infer race and gender (like the clicking of heels) for these systems to eliminate bias automatically.
One clear example is based on language and word choice. As Vox reports:
[T]wo new studies show that AI trained to identify hate speech may actually end up amplifying racial bias. In one study, researchers found that leading AI models for processing hate speech were one-and-a-half times more likely to flag tweets as offensive or hateful when they were written by African Americans, and 2.2 times more likely to flag tweets written in African American English (which is commonly spoken by black people in the US). Another study found similar widespread evidence of racial bias against black speech in five widely used academic data sets for studying hate speech that totaled around 155,800 Twitter posts.
In Weapons of Math Destruction, Cathy O’Neil gives numerous anecdotes of individuals being harmed by the systematic nature of these processes. She does not, unfortunately, address the bias and problematic nature of the non-computer models, but she correctly flags the concern that if these algorithms are not managed carefully, then direct and indirect bias can permeate these systems.
One fear is that algorithms will be adopted and scaled just as Facebook and Google have scaled to dominate their respective fields. Whatever the algorithm uses to rate a person high or low will be replicated across multiple employers. If a single company comes to dominate the hiring field, then its algorithm will define the likelihood of success or failure for an individual.
Algorithms already pervade the hiring process. To continue using these systems, both the companies offering them and the companies subscribing to them need to test and retest to be sure the software isn’t looking for hidden clues that reflect the pre-existing bias built into society: language choice, choice of neighborhoods, zip codes, volunteer activities, names, or the myriad other seemingly neutral factors that serve as surrogates for gender, race, and other potential grounds for discrimination.
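To make "test and retest" a little more concrete, here is a minimal sketch of one possible audit check: comparing how often a screening tool advances candidates from different groups. It is an illustration only, not the method from the Salem and Gupta paper. The group labels, the sample data, and the function names are all hypothetical, and the 0.8 threshold is an assumption borrowed from the "four-fifths" rule of thumb used in US employment guidelines.

```python
from collections import defaultdict

def selection_rates(records):
    """Return {group: fraction selected} from (group, selected) pairs."""
    totals = defaultdict(int)
    chosen = defaultdict(int)
    for group, selected in records:
        totals[group] += 1
        if selected:
            chosen[group] += 1
    return {g: chosen[g] / totals[g] for g in totals}

def impact_ratio(rates):
    """Ratio of the lowest group selection rate to the highest."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was selected, so there is no disparity to measure
    return min(rates.values()) / highest

# Hypothetical audit data: (group label, whether the tool advanced the candidate).
records = (
    [("A", True)] * 30 + [("A", False)] * 70
    + [("B", True)] * 20 + [("B", False)] * 80
)

rates = selection_rates(records)
ratio = impact_ratio(rates)
THRESHOLD = 0.8  # assumed cutoff; a rule of thumb, not a legal test
flagged = ratio < THRESHOLD
print(rates, round(ratio, 2), "review needed" if flagged else "ok")
```

A check like this only catches disparities in outcomes. It does not reveal which seemingly neutral input (a zip code, a name, a word choice) is acting as the surrogate, so a flagged result is a prompt for further investigation rather than a diagnosis.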