
In 2024, Harvard Medical School published an article titled "The Benefits of the Latest AI Technologies for Patients and Clinicians," which outlined several benefits of AI in healthcare: AI can help clinicians better interpret imaging results, help healthcare organizations improve quality and safety, and aid in the diagnosis and treatment of rare diseases.
However, AI systems are only as good as the data they're trained on and the algorithms that power them. When these foundational elements contain biases, the resulting healthcare decisions can cause inequities and harm vulnerable populations.
Recent research published by the NIH in "Bias in medical AI: Implications for clinical decision-making" states that "Biased medical AI can lead to substandard clinical decisions and the perpetuation and exacerbation of longstanding healthcare disparities." The article defines bias in medical AI as "any instance, factor, or prejudice that drives an AI algorithm to produce differential or inequitable outputs and outcomes."
Understanding algorithmic bias in healthcare
According to research published in Boston University's Journal of the CAS Writing Program, titled "AI In Healthcare: Counteracting Algorithmic Bias," algorithmic bias can be defined as "inequality of algorithmic outcomes between two groups of different morally relevant reference classes such as gender, race, or ethnicity. Algorithmic bias occurs when the outcome of the algorithm's decision-making treats one group better or worse without good cause." This definition underscores the ethical stakes of biased healthcare AI.
The journal article further explains that "inequities, disparities, and discrimination in patient care, treatment, and health outcomes are rampant in the current healthcare system, and biased AI algorithms have the potential to exacerbate these problems." This risk of deepening existing disparities makes addressing algorithmic bias a priority.
These biases have deep historical roots. As the Bias in medical AI article explains, "The underlying disparities in healthcare that drive bias in medical AI are not recent developments. Rather, they are rooted in long standing historical driving forces of inequality in health systems, which themselves reflect even longer-standing discrimination and other forms of structural oppression."
Algorithmic bias in healthcare occurs when AI systems produce systematically skewed outputs that unfairly advantage or disadvantage certain patient groups. These biases typically stem from three primary sources:
1. Biased training data
AI algorithms learn patterns from historical healthcare data, which often reflects existing societal inequities. When training data lacks diversity or contains historical biases, these patterns become encoded in the algorithm's behavior.
One of the primary causes of bias relates to training data representation. The Bias in medical AI article notes that "When an algorithm is trained on imbalanced data, this can lead to worse performance and algorithm underestimation for underrepresented groups." This creates a limitation where "Without statistically (or clinically) meaningful predictions for certain groups, any downstream clinical benefits of the AI model are limited to only the (largest) groups with sufficient data sizes."
As the Boston University research notes, "The World Health Organization reports that social determinants of health such as education, employment, food security, and income can account for up to 55% of health outcomes. Pervasive bias and inequity can arise when these social determinant variables are included in AI tools because algorithms function by finding correlations among variables to generate predictions."
The Boston University research identifies specific types of data-related bias, including "minority bias [which] occurs when minority groups are under or overrepresented in a dataset" and "missing data bias [which] occurs when data is missing from a dataset in a nonrandom way."
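For teams auditing their own training data, a simple representation check is one practical starting point. The sketch below is illustrative only: the column name, group labels, and reference shares are hypothetical placeholders, not values drawn from any of the studies cited above.

```python
import pandas as pd

# Hypothetical population benchmarks (e.g., census-derived shares).
# These values and the column name below are placeholders for illustration.
REFERENCE_SHARES = {
    "White": 0.60, "Black": 0.13, "Hispanic": 0.19, "Asian": 0.06, "Other": 0.02,
}

def representation_report(df: pd.DataFrame, group_col: str = "race_ethnicity") -> pd.DataFrame:
    """Compare each group's share of the training data to a reference benchmark."""
    observed = df[group_col].value_counts(normalize=True)
    report = pd.DataFrame({
        "dataset_share": observed,
        "reference_share": pd.Series(REFERENCE_SHARES),
    }).fillna(0.0)
    # Flag any group represented at less than half its reference share.
    report["underrepresented"] = report["dataset_share"] < 0.5 * report["reference_share"]
    return report

# Example usage with a hypothetical training extract:
# print(representation_report(pd.read_csv("training_data.csv")))
```

A report like this does not prove a model will underperform for a flagged group, but it makes the kind of imbalance described above visible before training begins.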
2. Problematic algorithm design
The technical design of healthcare algorithms can introduce bias through multiple mechanisms:
- Proxy variables: When algorithms use variables that correlate with protected characteristics (like zip code as a proxy for race), they can reproduce discriminatory patterns without explicitly considering protected attributes. The Boston University research explains, "Due to historic discrimination, zip code can be used as an accurate predictor of a person's race. If zip code is included as a classifier in a training dataset, but race is not, discrimination can be unintentionally built into an algorithm." A simple way to screen for this effect is sketched after this list.
- Label bias: This "refers to bias that leads to inaccurate decisions due to ill-defined labels," according to the Deerfield Journal. "AI algorithms use classifiers and labels to draw correlations and make predictions. If labels, classifiers, and parameters are not used appropriately, AI can generate biased outcomes."
- Technical bias: Boston University research defines this as "bias that occurs when the features of detection are not as reliable for some groups as they are for others... A familiar example of technical bias is that melanoma is harder to detect in darker skin than it is in lighter skin because it is easier to recognize discoloration in fair skin."
- Optimization goals: Algorithms optimized for cost efficiency rather than health equity may recommend fewer resources for historically underserved populations with more healthcare needs.
- Feedback loops: When algorithms influence clinical decisions, the resulting data becomes part of future training sets, potentially amplifying existing biases over time.
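One practical way to surface the proxy-variable problem described in the first item above is to test how well a candidate feature predicts a protected attribute on its own. The sketch below is a rough illustration using scikit-learn; the column names are hypothetical, and a high score signals only a proxy relationship worth investigating, not proof of discriminatory impact.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def proxy_strength(df: pd.DataFrame, feature: str, protected: str) -> float:
    """Cross-validated accuracy of predicting a protected attribute from a
    single candidate feature; accuracy well above the majority-class baseline
    suggests the feature is acting as a proxy."""
    X = pd.get_dummies(df[[feature]], columns=[feature])  # one-hot encode the candidate feature
    y = df[protected]
    scores = cross_val_score(RandomForestClassifier(n_estimators=100, random_state=0), X, y, cv=5)
    return scores.mean()

# Example usage with hypothetical column names:
# score = proxy_strength(patients, feature="zip_code", protected="race_ethnicity")
```

Features that score well above baseline here deserve scrutiny before they are included in a clinical model, or at minimum stratified evaluation afterward.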
3. Implementation context
Even well-designed algorithms can produce biased outcomes depending on how they're deployed:
- Interpretation bias: Clinicians may interpret algorithmic recommendations differently based on patient demographics, reinforcing existing biases.
- Access disparities: Advanced AI tools may be less available in resource-limited settings that serve marginalized communities.
- Algorithmic authority: Excessive deference to AI systems without evaluation can amplify algorithmic biases, particularly when recommendations align with clinician preconceptions.
The Bias in medical AI research notes an additional dimension of implementation bias: "How end users interact with deployed solutions can introduce bias. If doctors are more likely to follow AI advice for certain patient groups but override it for others, this could lead to inequitable application of the AI system."
Progress in addressing healthcare AI bias
Several initiatives demonstrate approaches to mitigating algorithmic bias:
The Equitable AI Research Roundtable (EARR)
The Equitable AI Research Roundtable is a collaborative initiative that brings together experts from different sectors, including law, education, community engagement, social justice, and technology. EARR aims to provide research-based perspectives and feedback on the ethical and social implications of emerging technologies, particularly focusing on fairness in AI systems. While specific standardized fairness metrics developed by EARR are not detailed in the available literature, the initiative emphasizes the importance of interdisciplinary collaboration to address fairness in AI.
Fairness-aware algorithm redesign
Historically, equations estimating kidney function, such as the estimated glomerular filtration rate (eGFR), included race as a variable, often leading to the overestimation of kidney function in Black patients. This practice potentially delayed their eligibility for kidney transplants. In response, the National Kidney Foundation and the American Society of Nephrology recommended the adoption of race-neutral eGFR equations.
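For context, the recommended replacement is the 2021 CKD-EPI creatinine equation, which estimates kidney function from serum creatinine, age, and sex without a race coefficient. The sketch below illustrates that calculation; it is not a clinical calculator, and the coefficients should be verified against the official NKF-ASN publication before any real-world use.

```python
# Illustrative sketch of the 2021 race-free CKD-EPI creatinine equation.
# Inputs: serum creatinine in mg/dL, age in years; output in mL/min/1.73 m^2.
# Verify coefficients against the official publication before any clinical use.
def egfr_ckd_epi_2021(serum_creatinine: float, age: int, female: bool) -> float:
    kappa = 0.7 if female else 0.9          # sex-specific creatinine threshold
    alpha = -0.241 if female else -0.302    # sex-specific exponent below the threshold
    sex_factor = 1.012 if female else 1.0
    ratio = serum_creatinine / kappa
    return (142
            * min(ratio, 1.0) ** alpha
            * max(ratio, 1.0) ** -1.200
            * 0.9938 ** age
            * sex_factor)

# Example: egfr_ckd_epi_2021(serum_creatinine=1.1, age=55, female=True)
```

The substantive point is that no race term appears anywhere in the calculation, removing the adjustment that had inflated kidney function estimates for Black patients.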
Community-centered AI development
To ensure that AI tools in healthcare address the needs and concerns of diverse populations, some healthcare systems have established ethical review boards that include patient representatives. These boards oversee the development and deployment of AI tools, ensuring they align with patient needs and ethical standards. By involving community members in the evaluation process, these initiatives aim to enhance the fairness and trustworthiness of AI systems in healthcare settings.
Regulatory and policy landscape
The growing recognition of algorithmic bias in healthcare has prompted regulatory bodies and policymakers to develop frameworks addressing the ethical use of AI in clinical settings.
FDA's regulatory framework for AI/ML-based medical devices
The U.S. Food and Drug Administration (FDA) has been developing a regulatory framework specifically tailored to AI/ML-based medical devices. The FDA's "Artificial Intelligence and Machine Learning in Software as a Medical Device" guidance outlines:
- Requirements for transparency in algorithm development
- Standards for clinical validation across diverse populations
- Protocols for monitoring and reporting algorithmic performance post-deployment
- Expectations for documenting training data demographics and limitations
The FDA has emphasized the importance of "predetermined change control plans" that describe anticipated modifications to algorithms and appropriate validation methods, ensuring that updates don't introduce or amplify biases.
International harmonization efforts
Globally, organizations such as the International Medical Device Regulators Forum (IMDRF) are working to harmonize regulatory approaches to AI in healthcare. These efforts focus on:
- Establishing common terminology and definitions related to AI bias
- Developing standardized reporting requirements for algorithm validation across demographic groups
- Creating frameworks for continuous monitoring of AI performance across diverse populations
Institutional ethics guidelines
Leading healthcare institutions and professional societies have developed guidelines specifically addressing fairness in healthcare AI:
- The American Medical Association's "Augmented Intelligence in Health Care" policy emphasizes that AI systems should be designed to "promote well-being, minimize harm, and ensure that the benefits and burdens of these systems are distributed fairly"
- The World Health Organization's "Ethics and Governance of Artificial Intelligence for Health" guidance provides recommendations for ensuring equitable access to AI benefits across populations
Data privacy considerations
Regulatory frameworks also address the intersection of data privacy and algorithmic fairness:
- Health Insurance Portability and Accountability Act (HIPAA) compliance remains important when deploying AI systems using patient data
- The European Union's General Data Protection Regulation (GDPR) includes provisions on algorithmic transparency and the "right to explanation" for automated decisions
Best practices for bias mitigation
Healthcare organizations implementing AI systems can adopt practices to mitigate algorithmic bias:
Data governance
Establishing strong data governance practices is essential for addressing algorithmic bias:
- Implement data quality frameworks that explicitly assess demographic representation
- Document data limitations, including known gaps in population coverage
- Create data collection protocols that prioritize diversity and representation
- Develop standardized processes for identifying and addressing potential biases in historical data
Algorithmic transparency requirements
Healthcare organizations should establish transparency requirements for all deployed AI systems:
- Mandate documentation of algorithm development processes, including training data characteristics
- Require disclosure of performance metrics stratified by relevant demographic groups (see the sketch after this list)
- Establish processes for explaining algorithmic recommendations to both clinicians and patients
- Create accessible documentation of known limitations and potential biases
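Building on the stratified-metrics requirement above, one lightweight way to produce such a disclosure is a per-group performance table computed on a held-out evaluation set. The sketch below assumes hypothetical column names (y_true, y_pred, y_score, race_ethnicity) and is illustrative, not a mandated reporting format.

```python
import pandas as pd
from sklearn.metrics import precision_score, recall_score, roc_auc_score

def stratified_metrics(df: pd.DataFrame, group_col: str = "race_ethnicity") -> pd.DataFrame:
    """Per-group performance for a held-out evaluation set containing binary
    labels (y_true), thresholded predictions (y_pred), and model scores (y_score)."""
    rows = []
    for group, sub in df.groupby(group_col):
        if sub["y_true"].nunique() < 2:
            continue  # AUROC is undefined when a group has only one outcome class
        rows.append({
            "group": group,
            "n": len(sub),
            "sensitivity": recall_score(sub["y_true"], sub["y_pred"]),
            "precision": precision_score(sub["y_true"], sub["y_pred"]),
            "auroc": roc_auc_score(sub["y_true"], sub["y_score"]),
        })
    return pd.DataFrame(rows)

# Publishing a table like this alongside overall metrics is one way to make
# performance gaps across groups visible to clinicians and reviewers.
```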
Continuous monitoring and validation
Post-deployment monitoring is important for identifying emergent biases:
- Implement ongoing performance monitoring across demographic subgroups
- Establish clear thresholds for performance disparities that trigger remediation (a minimal example of such a check follows this list)
- Develop processes for investigating unexpected outcomes or patterns that may indicate bias
- Create feedback channels for clinicians to report concerns about algorithmic recommendations
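To connect the threshold item above to something concrete, a disparity check can be as simple as comparing each group's metric to the overall rate and flagging gaps beyond an agreed margin. The sketch below reuses the per-group table from the earlier stratified-metrics example; the 0.05 gap is a placeholder policy choice, not a published standard.

```python
import pandas as pd

def disparity_alerts(stratified: pd.DataFrame, overall_sensitivity: float,
                     max_gap: float = 0.05) -> list[str]:
    """Flag any group whose sensitivity falls more than max_gap below the
    overall sensitivity; the threshold is a local policy decision."""
    alerts = []
    for _, row in stratified.iterrows():
        gap = overall_sensitivity - row["sensitivity"]
        if gap > max_gap:
            alerts.append(f"{row['group']}: sensitivity gap {gap:.3f} exceeds {max_gap:.2f}")
    return alerts

# Alerts from a scheduled run of this check would feed the remediation and
# clinician feedback channels described in the list above.
```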
Multi-stakeholder review processes
Involving diverse perspectives in algorithm evaluation can identify potential biases that developers might overlook:
- Establish review boards that include clinical specialists, ethics experts, and patient representatives
- Incorporate structured evaluation of potential bias impacts in all algorithm approval processes
- Implement pre-release testing with clinicians from diverse backgrounds and practice settings
- Create pathways for ongoing refinement based on real-world implementation feedback
FAQs
How can healthcare providers detect algorithmic bias?
Bias can be detected through continuous performance audits, diverse data testing, and real-world outcome analysis.
What impact can biased AI have on healthcare costs?
Biased AI can increase healthcare costs by leading to misdiagnosis, unnecessary treatments, and prolonged care needs.
Are there standardized benchmarks for AI fairness in healthcare?
Some organizations, like the World Health Organization, have proposed guidelines, but comprehensive benchmarks are still evolving.
How can healthcare AI systems be made more transparent?
Transparency can be achieved through clear documentation of training data, algorithmic design, and decision-making logic.
What is the impact of AI bias on mental health care?
AI bias can exacerbate disparities in mental health diagnosis and treatment, especially among marginalized groups.