
How to Mitigate AI Bias: Comparing Business and Medicine

Authored by Luca Santinelli, Solutions Engineer at Holistic AI
Published on Oct 17, 2024

Artificial intelligence (AI) is increasingly being used in both traditional business sectors and in medical diagnostics. But there are discrepancies in the ways the respective fields treat the issue of bias when it comes to 'protected attributes', such as age, race and gender.

This blog explores the differences in how the two fields should approach bias detection and mitigation from a technical standpoint.

Detecting and mitigating bias in business: The immutable output paradigm

In mainstream business, the overarching goal is often to reduce the influence of protected attributes. Let's take an example from credit scoring, which is a potentially controversial use case.

Ideally, if Jane (female) and John (male) both have the same income, credit history, and financial behaviour, their credit scores should be almost identical. While credit scores are typically continuous, taking a range of possible values, the application of a credit score usually results in a binary outcome, such as whether an individual qualifies for a loan based on a specific score threshold. This dichotomy enables a clearer evaluation of 'favourable' versus 'unfavourable' outcomes, providing a more structured framework for detecting bias in terms of equal treatment or equal outcomes.

Calculation: To detect bias, analysts often compute the Disparate Impact Ratio:

Disparate Impact Ratio = (Rate of favourable outcomes for the protected group) / (Rate of favourable outcomes for the non-protected group)

The 'four-fifths rule' serves as a benchmark here: a ratio below 0.8 or above 1.25 flags potential bias, indicating that one subgroup receives the favourable outcome more often than another. Yet, it is essential to interpret these metrics in the context of binary outcomes, ensuring that the analysis is rooted in real-world implications.
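
To make this concrete, here is a minimal sketch of the Disparate Impact Ratio and the four-fifths check, using an illustrative pandas DataFrame with hypothetical `gender` and `approved` columns (not Holistic AI's tooling):

```python
import pandas as pd

# Hypothetical loan decisions: 1 = approved (favourable outcome), 0 = rejected.
data = pd.DataFrame({
    "gender":   ["F", "F", "F", "F", "M", "M", "M", "M", "M", "M"],
    "approved": [1,   0,   1,   0,   1,   1,   1,   0,   1,   1],
})

def disparate_impact_ratio(df, group_col, outcome_col, protected, reference):
    """Rate of favourable outcomes for the protected group divided by
    the rate for the non-protected (reference) group."""
    protected_rate = df.loc[df[group_col] == protected, outcome_col].mean()
    reference_rate = df.loc[df[group_col] == reference, outcome_col].mean()
    return protected_rate / reference_rate

ratio = disparate_impact_ratio(data, "gender", "approved", protected="F", reference="M")
print(f"Disparate Impact Ratio: {ratio:.2f}")  # 0.60 in this toy example
print("Potential bias flagged" if not 0.8 <= ratio <= 1.25 else "Within the four-fifths band")
```

In this toy data the ratio is 0.60, so the four-fifths check would flag the disparity for further review.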

With this in mind, some of the best practices for mitigating bias in business contexts are:

  • Blind processing: Omit protected attributes during model training. For instance, remove gender or race from credit scoring algorithms.
  • Regular audits: Frequently reassess models using fresh data to detect and rectify emergent biases.
  • Avoiding proxy variables: Sometimes, even when direct sensitive attributes are omitted, other seemingly neutral variables can act as stand-ins or 'proxies' for them. For example, zip codes might inadvertently correlate with race or socioeconomic status. Ensuring that such proxy variables do not introduce bias is crucial; it involves rigorous testing, feature analysis, and sometimes domain-specific knowledge to identify and manage these proxies (see the sketch after this list).
  • Conditional statistical parity: This more granular layer of analysis assesses fairness while holding certain legitimate variables constant, focusing on whether protected attributes still influence outcomes when other relevant factors are comparable.
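
As a rough illustration of the proxy-variable check, here is a sketch using a hypothetical audit table (the `zip_code`, `education`, and `race` columns are invented for the example); one simple heuristic is to measure how much information each candidate feature shares with the protected attribute:

```python
import pandas as pd
from sklearn.metrics import normalized_mutual_info_score

# Hypothetical audit table: model features plus a protected attribute that is
# kept out of training but retained for auditing purposes.
audit = pd.DataFrame({
    "zip_code":  ["10001", "10001", "10002", "10002", "10003", "10003"],
    "education": ["BSc", "MSc", "BSc", "HS", "MSc", "HS"],
    "race":      ["A", "A", "B", "B", "A", "B"],
})

race_codes = pd.factorize(audit["race"])[0]

# Score each candidate feature by how much information it shares with the
# protected attribute; a high score suggests a potential proxy.
for feature in ["zip_code", "education"]:
    feature_codes = pd.factorize(audit[feature])[0]
    score = normalized_mutual_info_score(race_codes, feature_codes)
    print(f"{feature}: normalised mutual information with race = {score:.2f}")
```

Features that score highly would then warrant closer review, for instance through domain expertise or by comparing model behaviour with and without them.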

Achieving genuine fairness in algorithmic processes goes beyond mere demographic balance. It requires diligent, multifaceted efforts, with businesses striving for both impartiality and adaptability. As data-driven decisions become more ingrained in our society, ensuring fairness is not only ethically sound but also pivotal for maintaining trust and fostering robust customer relationships.

Detecting and mitigating bias in medical diagnostics: The adaptive output paradigm

In the medical domain, individual differences, inclusive of protected attributes, are pivotal. Unlike in many business applications where the goal is demographic parity, medical algorithms often need to adjust their outputs based on these attributes.

This adaptation is because symptomatology and treatment effectiveness can differ significantly across demographic groups. Thus, in medicine, achieving optimal patient outcomes often necessitates basing decisions on demographic characteristics.

Distinguishing features in medical diagnostics

Fairness in medicine isn't about treating everyone identically, but about recognizing and accurately accounting for legitimate biological and physiological differences.

The challenge is navigating the fine line between essential demographic considerations and avoiding potential biases. In non-medical sectors, the objective is often to reduce the influence of protected attributes. But in medicine, attributes like ethnicity or genetic markers can be pivotal in predicting the best outcomes. For instance, though weight typically influences anaesthesia dosage, some attributes, like genetic markers, might alter drug metabolism rates, showcasing that models should sometimes function differently across groups to ensure optimal care.

Let's take Warfarin as an example. Different ethnic groups metabolize this anticoagulant at different rates, so the ideal dosage varies as a function of ethnic background. This requires careful monitoring and dosage adjustments based on individual responses and ethnic background.

To examine the issue more closely, we will use the Statistical Parity Difference metric to measure the difference in the probability of positive outcomes between protected and non-protected groups.


Given:

P = Proportion of the positive outcomes in the protected group

U = Proportion of the positive outcomes in the non-protected group


Formula:


SPD = P − U


In our Warfarin example, if one ethnic group has a 70% probability of achieving the therapeutic range, and another has an 80% probability, the Statistical Parity Difference would be -10%.

This means that the protected ethnic group is 10 percentage points less likely to achieve the therapeutic range compared to the non-protected group, indicating a potential disadvantage for the protected group in the current system or model. Decision-makers would need to assess the reasons for this discrepancy and determine whether it is due to genuine physiological differences, biases in the system, or other factors.
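
A minimal sketch of this calculation, with the illustrative 70% and 80% rates from the example hard-coded:

```python
def statistical_parity_difference(p_protected: float, u_reference: float) -> float:
    """SPD = P - U: difference in the rate of the favourable outcome between
    the protected group and the non-protected group."""
    return p_protected - u_reference

# Warfarin example: share of patients reaching the therapeutic range.
p = 0.70  # protected ethnic group
u = 0.80  # non-protected group
print(f"Statistical Parity Difference: {statistical_parity_difference(p, u):+.2f}")  # -0.10
```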

If we were to use an algorithm programmed to achieve strictly equal treatment, the outcome would be problematic because of the inherent metabolic differences between ethnic groups.

To ensure that every group receives the correct dosage we would need to use an 'adaptive algorithm', incorporating ethnicity as a predictor. Therefore, while patients from different ethnicities might receive different treatments (i.e., dosages), the end goal remains the same: optimal therapeutic results and minimized side effects. This is a poignant example of where striving for "equal treatment" using a singular approach might not be the best way to ensure fairness and efficacy. Instead, by embracing the individual variations and tailoring treatments accordingly, we can achieve the desired equitable outcomes for all patients.
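
In code, the distinction is simply whether the demographic attribute appears as a predictor. A deliberately simplified sketch (hypothetical group names and coefficients, not clinical guidance):

```python
# Hypothetical, illustrative numbers only -- not clinical guidance.
BASE_DOSE_MG = 5.0
METABOLISM_FACTOR = {"group_A": 1.0, "group_B": 0.8}  # assumed relative metabolism rates

def adaptive_dose(weight_kg: float, ethnic_group: str) -> float:
    """Adjust a simple weight-based dose by a group-specific metabolism factor,
    steering different inputs towards the same therapeutic outcome."""
    weight_factor = weight_kg / 70.0
    return BASE_DOSE_MG * weight_factor * METABOLISM_FACTOR[ethnic_group]

print(adaptive_dose(70, "group_A"))  # 5.0 mg
print(adaptive_dose(70, "group_B"))  # 4.0 mg -- different dose, same therapeutic goal
```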

With this in mind, some of the best practices for mitigating bias in medical diagnostic contexts are:

  • Granular data collection: It's vital to ethically gather as much demographic data as possible, allowing for more individualized and accurate dosing or treatment recommendations.
  • Counterfactual testing: For each treatment recommendation or diagnosis, evaluate how the prediction might have varied if the demographic attribute were different but all other factors remained constant (see the sketch after this list).
  • Stakeholder collaboration: Involve clinicians, patients, ethicists, and tech experts in the process. Their collective insights ensure holistic validation of models.
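
As a hedged sketch of counterfactual testing, assuming a hypothetical fitted model with a scikit-learn-style `predict` method and a patient record stored as a dict:

```python
import pandas as pd

def counterfactual_check(model, record: dict, attribute: str, alternative) -> dict:
    """Compare the model's prediction for a record against the prediction for an
    otherwise identical record with the demographic attribute swapped."""
    original = pd.DataFrame([record])
    counterfactual = pd.DataFrame([{**record, attribute: alternative}])
    return {
        "original": model.predict(original)[0],
        "counterfactual": model.predict(counterfactual)[0],
    }

# Hypothetical usage with an assumed dosing model that has an 'ethnicity' feature:
# result = counterfactual_check(dosing_model,
#                               {"age": 54, "weight_kg": 72, "ethnicity": "A"},
#                               attribute="ethnicity", alternative="B")
# A large gap between the two predictions should be explainable by genuine
# physiology, not by bias learned from the data.
```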

How to mitigate bias when business and medicine mix

Transitioning between these two paradigms, business and medical diagnostics, requires a deep understanding of the relevant context.

For example, suppose an AI model predicts life expectancy for a life insurance application. Here, while age and perhaps gender might be valid predictors (akin to medical use), other attributes like ethnicity or country of residence should likely be treated with the same caution as in business contexts if there is a lack of evidence that they influence life expectancy.

Simply put, more 'protected' variables will influence the model's behaviour. A use case like this makes complex, life-impacting decisions and must carefully balance the effect of each sensitive variable.

Best practices include:

  • Hybrid models: In mixed contexts, like life expectancy prediction, develop models that treat different protected attributes with varying levels of scrutiny.
  • Transparency and accountability: Implement measures to explain decisions, especially when protected attributes influence outcomes. Open channels for challenges and appeals.

In contexts where the business and medical paradigms are combined, the complexity escalates. Distinguishing between valid differentiation and potential discrimination must be underpinned by rigorous science.

Technical risk assessment challenges: Bias and privacy

In a world where AI models increasingly drive decisions, two major technical challenges surface at the intersection of fairness and privacy: risk assessment for bias and preserving data privacy. These challenges often conflict, as addressing one can exacerbate the other. Here's a technical breakdown:

Bias risk assessment

Detecting and mitigating bias in AI models involves nuanced computational and statistical tasks:

  • High-dimensional spaces: Modern AI models, especially deep learning models, operate in high-dimensional spaces. Evaluating fairness in such spaces is non-trivial. For instance, understanding how a neural network differentially weighs age, gender, and ethnicity concurrently in a 100-layer deep model is incredibly complex.
  • Confounding variables: Bias detection gets muddled when protected attributes correlate with other features. For example, a zip code might be a proxy for race or socioeconomic status, making it hard to disentangle genuine predictors from biased influences.
  • Distributional shifts: Data distributions evolve. A model trained on data from one time period or region might manifest biases when applied elsewhere, requiring continual bias risk assessments (see the sketch after this list).
  • Model complexity vs. interpretability: Simpler, interpretable models can be easier to audit for bias but might sacrifice accuracy. Complex models, like ensemble methods or deep networks, might offer better performance but are harder to interpret and audit.
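
To illustrate the continual-assessment point, here is a sketch only, using a hypothetical decision log with invented column names, that recomputes a fairness metric per period so drift can be spotted:

```python
import pandas as pd

# Hypothetical decision log: one row per automated decision.
log = pd.DataFrame({
    "month":      ["2024-01", "2024-01", "2024-02", "2024-02", "2024-02", "2024-01"],
    "group":      ["protected", "reference", "protected", "reference", "protected", "reference"],
    "favourable": [1, 1, 0, 1, 1, 1],
})

# Favourable-outcome rate per group, per month.
rates = log.groupby(["month", "group"])["favourable"].mean().unstack()

# Disparate impact ratio per month; a drop below 0.8 flags emerging bias.
rates["DIR"] = rates["protected"] / rates["reference"]
print(rates)
```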

Privacy concerns

Ensuring privacy while training models and conducting bias assessments is riddled with technical challenges:

  • Differential privacy: While differential privacy offers a mechanism to add noise to data or queries to ensure individual data points aren't identifiable, it can reduce the utility of the data. This trade-off between privacy and accuracy is a challenge, especially in sensitive applications like healthcare (see the sketch after this list).
  • Data synthesis: Generating synthetic data to train models without exposing original data seems promising. However, ensuring the synthetic data is both representative and free from biases of the original dataset is technically demanding.
  • Federated learning: This technique allows model training across multiple devices or servers while keeping data localized. But it introduces complexities in ensuring consistent model performance and fairness across diverse, decentralized data sources.
  • Homomorphic encryption: This technique allows computation on encrypted data, ensuring data privacy. On the other hand, it is computationally expensive and ensuring fairness in models trained this way is a complex endeavour.
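
As a toy illustration of the privacy-utility trade-off mentioned above (a sketch of the Laplace mechanism only, not of any particular library), noise calibrated to the query's sensitivity and the privacy budget epsilon is added to an aggregate statistic; a smaller epsilon means stronger privacy but a noisier, less useful answer:

```python
import numpy as np

rng = np.random.default_rng(0)

def laplace_count(true_count: int, epsilon: float, sensitivity: float = 1.0) -> float:
    """Release a count with Laplace noise scaled to sensitivity / epsilon.
    Smaller epsilon -> stronger privacy guarantee -> noisier result."""
    return true_count + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

true_count = 120  # e.g. number of patients in a demographic subgroup
for eps in (0.1, 1.0, 10.0):
    print(f"epsilon={eps:>4}: noisy count = {laplace_count(true_count, eps):.1f}")
```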

The interplay between bias and privacy

Addressing bias and privacy simultaneously is intricate:

  • Bias detection vs. data anonymization: Anonymized data, stripped of direct identifiers, might still contain biases in latent forms. Yet, the very process of de-identification can obscure critical patterns necessary for bias detection.
  • Regularization for fairness vs. differential privacy: Techniques to introduce fairness as a constraint in model training can conflict with differential privacy mechanisms. For instance, a fairness regularization term might require demographic data, but differential privacy might obfuscate this information.
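
To make that tension concrete, here is a rough sketch (assumed inputs: predicted probabilities `y_prob`, true labels `y_true`, and a binary group indicator `group`, the very attribute a differential privacy mechanism might noise or withhold) of a training loss with a statistical parity penalty:

```python
import numpy as np

def fairness_regularized_loss(y_true, y_prob, group, lam: float = 1.0) -> float:
    """Binary cross-entropy plus a penalty on the statistical parity difference
    of predicted positive rates between the two groups. The penalty needs the
    group labels -- exactly what differential privacy may noise or withhold."""
    eps = 1e-12
    bce = -np.mean(y_true * np.log(y_prob + eps) + (1 - y_true) * np.log(1 - y_prob + eps))
    spd = y_prob[group == 1].mean() - y_prob[group == 0].mean()
    return bce + lam * abs(spd)

# Toy example with invented numbers.
y_true = np.array([1, 0, 1, 1, 0, 0])
y_prob = np.array([0.9, 0.2, 0.7, 0.6, 0.4, 0.1])
group  = np.array([1, 1, 1, 0, 0, 0])
print(f"Regularized loss: {fairness_regularized_loss(y_true, y_prob, group, lam=0.5):.3f}")
```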

Mitigate bias, protect your business

Tackling the technical challenges of bias and privacy risk assessments in AI requires a multidisciplinary approach. Innovations in algorithms, coupled with stringent ethical frameworks, are crucial to navigate the fine line between creating AI models that are both fair and respectful of individual privacy.

Holistic AI has pioneered the field of AI auditing, with a focus on bias as well as efficacy, robustness, privacy, and explainability.

With our audits, your enterprise can build customer trust and obtain the tools for compliance with existing and emerging AI regulations.

Discover your path to bias-free AI. Schedule a call with one of our bias audit experts for a tailored consultation.

DISCLAIMER: This blog article is for informational purposes only. This blog article is not intended to, and does not, provide legal advice or a legal opinion. It is not a do-it-yourself guide to resolving legal issues or handling litigation. This blog article is not a substitute for experienced legal counsel and does not provide legal advice regarding any situation or employer.
