AI-powered hiring tools favor black and female job candidates over white and male applicants: study

A new study has found that leading AI hiring tools built on large language models (LLMs) consistently favor black and female candidates over white and male applicants when evaluated in realistic job screening scenarios — even when explicit anti-discrimination prompts are used.

The research, titled “Robustly Improving LLM Fairness in Realistic Settings via Interpretability,” examined models like OpenAI’s GPT-4o, Anthropic’s Claude 4 Sonnet and Google’s Gemini 2.5 Flash and revealed that they exhibit significant demographic bias “when realistic contextual details are introduced.”

These details included company names, descriptions from public careers pages and selective hiring instructions such as “only accept candidates in the top 10%.”

Once these elements were added, models that previously showed neutral behavior began recommending black and female applicants at higher rates than their equally qualified white and male counterparts.

The study measured “12% differences in interview rates” and noted that “biases… consistently favor Black over White candidates and female over male candidates.”

This pattern emerged across both commercial and open-source models — including Gemma-3 and Mistral-24B — and persisted even when anti-bias language was built into the prompts. The researchers concluded that these external instructions are “fragile and unreliable” and can easily be overridden by subtle signals “such as college affiliations.”

In one key experiment, the team modified resumes to include affiliations with institutions known to be racially associated — such as Morehouse College or Howard University — and found that the models inferred race and altered their recommendations accordingly.

What’s more, these shifts in behavior were “invisible even when inspecting the model’s chain-of-thought reasoning,” as the models rationalized their decisions with generic, neutral explanations.

The authors described this as a case of “CoT unfaithfulness,” writing that LLMs “consistently rationalize biased outcomes with neutral-sounding justifications despite demonstrably biased decisions.”

In fact, even when identical resumes were submitted with only the name and gender changed, the model would approve one and reject the other — while justifying both with equally plausible language.

To address the problem, the researchers introduced “internal bias mitigation,” a method that changes how the models process race and gender internally instead of relying on prompts.

Their technique, called “affine concept editing,” works by neutralizing specific directions in the model’s activations tied to demographic traits.

The fix was effective. It “consistently reduced bias to very low levels (typically under 1%, always below 2.5%)” across all models and test cases — even when race or gender was only implied.

Performance stayed strong, with “under 0.5% for Gemma-2 and Mistral-24B, and minor degradation (1-3.7%) for Gemma-3 models,” according to the paper’s authors.

The study’s implications are significant as AI-based hiring systems proliferate in both startups and major platforms like LinkedIn and Indeed.

“Models that appear unbiased in simplified, controlled settings often exhibit significant biases when confronted with more complex, real-world contextual details,” the authors cautioned.

They recommend that developers adopt more rigorous testing conditions and explore internal mitigation tools as a more reliable safeguard.

“Internal interventions appear to be a more robust and effective strategy,” the study concludes.

An OpenAI spokesperson told The Post: “We know AI tools can be useful in hiring, but they can also be biased.”

“They should be used to help, not replace, human decision-making in important choices like job eligibility.”

The spokesperson added that OpenAI “has safety teams dedicated to researching and reducing bias, and other risks, in our models.”

“Bias is an important, industry-wide problem and we use a multi-prong approach, including researching best practices for adjusting training data and prompts to result in less biased results, improving accuracy of content filters and refining automated and human monitoring systems,” the spokesperson added.

“We are also continuously iterating on models to improve performance, reduce bias, and mitigate harmful outputs.”

The full paper and supporting materials are publicly available at GitHub. The Post has sought comment from Anthropic and Google.

What's On

Coldplay’s Chris Martin Asks Fans to ‘Send Love’ to Charlie Kirk’s Family After Activist’s Death

Olympic boxer who quit match against Imane Khelif reveals alleged online abuse

James Webb telescope’s ‘starlit mountaintop’ could be the observatory’s best image yet — Space photo of the week

AI-powered hiring tools favor black and female job candidates over white and male applicants: study

Warner Bros. Discovery shares spike as CEO David Zaslav shops media group around — setting up bidding war for Paramount Skydance

Swiss banking giant UBS eyes move to the US to avoid pesky new regulations

FAA seeking $3.1 million in fines from Boeing over numerous safety violations

New fundraising record reached to support Charlie Kirk’s family

No-media-allowed MAGA hangout spilled beans on Bessent-Pulte feud

Exclusive | Warner Bros. Discovery CEO David Zaslav wants bidding war for his media giant — even as Paramount Skydance plans takeover offer: sources

Rent in NYC’s wealthiest neighborhoods has spiked 60% — and even six-figure earners are struggling

Left-wing Media Matters abruptly stops publishing after Charlie Kirk shooting

Pfizer, Moderna shares plunge on report that Trump officials plan to link 25 child deaths to COVID vaccine

Leave A Reply Cancel Reply

Olympic boxer who quit match against Imane Khelif reveals alleged online abuse

James Webb telescope’s ‘starlit mountaintop’ could be the observatory’s best image yet — Space photo of the week

IDF reservist strain intensifies Ultra-Orthodox draft debate as Gaza war expands

Charlie Kirk Shooting Death Investigation: Updates on the Case and Everything We Know So Far

Mets’ Juan Soto makes franchise history with latest homer: ‘Really impressive’

Science history: Gravitational waves detected, proving Einstein right — Sept. 14, 2015

Morgan Wallen honors Erika Kirk after husband Charlie Kirk’s Utah assassination

Subscribe to Updates

What's On

AI-powered hiring tools favor black and female job candidates over white and male applicants: study

Keep up with today’s most important news

Thanks for signing up!

Keep Reading

Leave A Reply Cancel Reply