Artificial intelligence (AI) is increasingly being used in both traditional business sectors and medical diagnostics. But the two fields treat the issue of bias differently when it comes to 'protected attributes', such as age, race and gender.
This blog explores how the two fields should approach bias detection and mitigation differently from a technical standpoint.
In mainstream business, the overarching goal is often to reduce the influence of protected attributes. Let's take an example from credit scoring, which is a potentially controversial use case.
Ideally, if Jane (female) and John (male) both have the same income, credit history, and financial behaviour, their credit scores should be almost identical. While credit scores themselves are continuous, taking a range of possible values, their application typically produces a binary outcome, such as whether an individual qualifies for a loan based on a specific score threshold. This dichotomy enables a clearer evaluation of 'favourable' versus 'unfavourable' outcomes, providing a more structured framework for detecting bias in terms of equal treatment or equal outcomes.
Calculation: To detect bias, analysts often compute the Disparate Impact Ratio: the rate of favourable outcomes for the protected group divided by the rate of favourable outcomes for the non-protected (reference) group.
The 'four-fifths rule' serves as a benchmark here: a ratio below 0.8 or above 1.25 flags potential bias, indicating that one subgroup receives the positive outcome noticeably more often than another. Yet it is essential to interpret these metrics in the context of binary outcomes, ensuring that the analysis is rooted in real-world implications.
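As a minimal sketch of how this might be computed (assuming binary loan-approval decisions and a column of group labels; the data and variable names below are purely illustrative):

```python
import numpy as np

def disparate_impact_ratio(outcomes, groups, protected, reference):
    """Ratio of favourable-outcome rates: protected group vs. reference group.

    outcomes: array of 0/1 decisions (1 = favourable, e.g. loan approved)
    groups:   array of group labels aligned with `outcomes`
    """
    outcomes = np.asarray(outcomes)
    groups = np.asarray(groups)
    rate_protected = outcomes[groups == protected].mean()
    rate_reference = outcomes[groups == reference].mean()
    return rate_protected / rate_reference

# Hypothetical approval decisions for two groups
decisions = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]
gender    = ["F", "F", "F", "F", "F", "M", "M", "M", "M", "M"]

dir_value = disparate_impact_ratio(decisions, gender, protected="F", reference="M")
flagged = dir_value < 0.8 or dir_value > 1.25   # four-fifths rule check
print(f"Disparate Impact Ratio: {dir_value:.2f}, flagged: {flagged}")
```

In this toy example the ratio is 0.75, which falls below the 0.8 threshold and would therefore be flagged for further review.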
With this in mind, some of the best practices for mitigating bias in business contexts are:
Achieving genuine fairness in algorithmic processes goes beyond mere demographic balance. It requires diligent, multifaceted efforts, with businesses striving for both impartiality and adaptability. As data-driven decisions become more ingrained in our society, ensuring fairness is not only ethically sound but also pivotal for maintaining trust and fostering robust customer relationships.
In the medical domain, individual differences, inclusive of protected attributes, are pivotal. Unlike in many business applications where the goal is demographic parity, medical algorithms often need to adjust their outputs based on these attributes.
This adaptation is because symptomatology and treatment effectiveness can differ significantly across demographic groups. Thus, in medicine, achieving optimal patient outcomes often necessitates basing decisions on demographic characteristics.
Fairness in medicine isn't about treating everyone identically, but about recognizing and accurately accounting for legitimate biological and physiological differences.
The challenge is navigating the fine line between essential demographic considerations and potential bias. In non-medical sectors, the objective is often to reduce the influence of protected attributes. But in medicine, attributes like ethnicity or genetic markers can be pivotal in predicting the best outcomes. For instance, while weight typically influences anaesthesia dosage, some attributes, like genetic markers, can alter drug metabolism rates, showing that models should sometimes function differently across groups to ensure optimal care.
Let's take Warfarin as an example. Different ethnic groups metabolize this anticoagulant at different rates, so the ideal dosage varies as a function of ethnic background. This requires careful monitoring and dosage adjustments based on individual responses.
To examine the issue more closely, we will use the Statistical Parity Difference metric to measure the difference in the probability of positive outcomes between protected and non-protected groups.
Given:
P = Proportion of the positive outcomes in the protected group
U = Proportion of the positive outcomes in the non-protected group
Formula: Statistical Parity Difference = P - U
In our Warfarin example, if one ethnic group has a 70% probability of achieving the therapeutic range and another has an 80% probability, the Statistical Parity Difference would be -10 percentage points (0.70 - 0.80 = -0.10).
This means that the protected ethnic group is 10 percentage points less likely to achieve the therapeutic range than the non-protected group, indicating a potential disadvantage for the protected group in the current system or model. Decision-makers would need to assess the reasons for this discrepancy and determine whether it is due to genuine physiological differences, biases in the system, or other factors.
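A minimal sketch of this calculation, assuming we already know each group's rate of reaching the therapeutic range (the figures below are the illustrative ones from the example above):

```python
def statistical_parity_difference(p_protected, p_reference):
    """Statistical Parity Difference: P - U, where each value is the
    proportion of positive outcomes (here, reaching the therapeutic range)."""
    return p_protected - p_reference

# Illustrative rates from the Warfarin example
spd = statistical_parity_difference(p_protected=0.70, p_reference=0.80)
print(f"Statistical Parity Difference: {spd:+.2f}")  # -> -0.10
```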
If we were to use an algorithm programmed to enforce equal treatment, the outcome would be problematic because of inherent metabolic differences between ethnic groups.
To ensure that every group receives the correct dosage we would need to use an 'adaptive algorithm', incorporating ethnicity as a predictor. Therefore, while patients from different ethnicities might receive different treatments (i.e., dosages), the end goal remains the same: optimal therapeutic results and minimized side effects. This is a poignant example of where striving for "equal treatment" using a singular approach might not be the best way to ensure fairness and efficacy. Instead, by embracing the individual variations and tailoring treatments accordingly, we can achieve the desired equitable outcomes for all patients.
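As a purely illustrative sketch of such an adaptive approach (assuming a toy dataset with hypothetical column names; real warfarin dosing relies on clinically validated pharmacogenetic algorithms, not this model), ethnicity can be retained as a predictor alongside other covariates:

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical training data: weekly warfarin dose (mg) with clinical covariates.
# Values are illustrative only, not clinical guidance.
data = pd.DataFrame({
    "age":         [44, 61, 55, 70, 38, 52],
    "weight_kg":   [82, 70, 90, 65, 77, 88],
    "ethnicity":   ["A", "B", "A", "B", "A", "B"],
    "weekly_dose": [35, 22, 38, 20, 33, 24],
})

# One-hot encode ethnicity so the model can learn group-specific dose adjustments
X = pd.get_dummies(data[["age", "weight_kg", "ethnicity"]], columns=["ethnicity"])
y = data["weekly_dose"]

model = LinearRegression().fit(X, y)

# Predict a starting dose for a new (hypothetical) patient
new_patient = pd.get_dummies(
    pd.DataFrame({"age": [58], "weight_kg": [75], "ethnicity": ["B"]}),
    columns=["ethnicity"],
).reindex(columns=X.columns, fill_value=0)
print(model.predict(new_patient))
```

The point is not this particular model, but that the protected attribute is kept in deliberately, while the output is still judged against the same clinical target for every group.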
With this in mind, some of the best practices for mitigating bias in medical diagnostic contexts are:
Transitioning between these two paradigms, business and medical diagnostics, requires a deep understanding of the relevant context.
For example, suppose an AI model predicts life expectancy for a life insurance application. Here, while age and perhaps gender might be valid predictors (akin to medical use), other attributes like ethnicity or country of residence should likely be treated with the same caution as in business contexts if there is a lack of evidence that they influence life expectancy.
Simply put, more 'protected' variables will be influencing the model's behaviour. This particular use case involves complex, life-impacting decisions that balance the effects of sensitive variables.
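One lightweight way to make that balancing explicit (a sketch only, assuming a hypothetical review process rather than any specific framework) is to record, per attribute, whether it may be used and on what evidential basis, and to enforce that record when the feature set is built:

```python
# Hypothetical per-attribute policy for a life-expectancy model.
# "allowed" should only be True where documented evidence supports predictive validity.
ATTRIBUTE_POLICY = {
    "age":                  {"allowed": True,  "basis": "actuarial evidence"},
    "gender":               {"allowed": True,  "basis": "actuarial evidence"},
    "ethnicity":            {"allowed": False, "basis": "insufficient evidence"},
    "country_of_residence": {"allowed": False, "basis": "insufficient evidence"},
}

def select_features(candidate_features):
    """Drop protected attributes that the policy does not explicitly allow."""
    selected = []
    for feature in candidate_features:
        policy = ATTRIBUTE_POLICY.get(feature)
        if policy is None or policy["allowed"]:
            selected.append(feature)   # non-protected, or allowed with a documented basis
    return selected

print(select_features(["age", "gender", "ethnicity", "income", "country_of_residence"]))
# -> ['age', 'gender', 'income']
```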
Best practices include:
In contexts where the business and medical paradigms are combined, the complexity escalates. Distinguishing between valid differentiation and potential discrimination must be underpinned by rigorous science.
In a world where AI models increasingly drive decisions, two major technical challenges surface at the intersection of fairness and privacy: assessing the risk of bias and preserving data privacy. These challenges often conflict, as addressing one can exacerbate the other. Here's a technical breakdown:
Detecting and mitigating bias in AI models involves nuanced computational and statistical tasks:
Ensuring privacy while training models and conducting bias assessments is riddled with technical challenges:
Addressing bias and privacy simultaneously is intricate:
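To make the tension concrete, here is a minimal sketch (assuming a simple Laplace-noise mechanism over group-level counts, the basic building block of differential privacy, rather than any production-grade library or a full privacy accounting) of estimating group outcome rates for a bias check without exposing exact counts:

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_rate(positive_count, group_size, epsilon=1.0):
    """Estimate a group's positive-outcome rate from Laplace-noised counts.

    Adding Laplace(1/epsilon) noise to a count query (sensitivity 1) is the
    classic differential-privacy mechanism; epsilon here is illustrative.
    """
    noisy_pos = positive_count + rng.laplace(scale=1.0 / epsilon)
    noisy_total = group_size + rng.laplace(scale=1.0 / epsilon)
    return max(0.0, min(1.0, noisy_pos / noisy_total))

# Hypothetical group-level counts (no individual records are shared)
rate_protected = noisy_rate(positive_count=140, group_size=200)
rate_reference = noisy_rate(positive_count=170, group_size=210)

print(f"Noisy Disparate Impact Ratio: {rate_protected / rate_reference:.2f}")
```

The smaller the groups or the tighter the privacy budget (lower epsilon), the noisier the bias estimate becomes, which is exactly the trade-off described above.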
Tackling the technical challenges of bias and privacy risk assessments in AI requires a multidisciplinary approach. Innovations in algorithms, coupled with stringent ethical frameworks, are crucial to navigate the fine line between creating AI models that are both fair and respectful of individual privacy.
Holistic AI have pioneered the field of AI auditing, with a focus on bias as well as efficacy, robustness, privacy and explainability.
With our audits, your enterprise can build customer trust and obtain the tools for compliance with existing and emerging AI regulations.
Discover your path to bias-free AI. Schedule a call with one of our bias audit experts for a tailored consultation.
DISCLAIMER: This blog article is for informational purposes only. This blog article is not intended to, and does not, provide legal advice or a legal opinion. It is not a do-it-yourself guide to resolving legal issues or handling litigation. This blog article is not a substitute for experienced legal counsel and does not provide legal advice regarding any situation or employer.