Visualising data is crucial in any analysis. It facilitates an intuitive understanding of the numbers, allowing us to identify patterns, discrepancies and trends. In machine learning, this serves a particularly useful purpose: a visual representation of data can be a catalyst for informed decision-making, helping rid AI systems of damaging biases that impact both their efficacy and fairness.
We can achieve this by building an interactive bias measuring and mitigation plot in Python, using the Holistic AI, sklearn and Plotly libraries. This implementation requires no local installation – all steps run in Google Colab.
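As a minimal setup sketch, the required packages can be installed directly in a Colab cell under their PyPI names (scikit-learn and Plotly typically come pre-installed on Colab, so reinstalling them is harmless):

```python
# Install the libraries in the Colab runtime
!pip install holisticai scikit-learn plotly
```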
The Holistic AI library is an open-source tool to assess and improve the trustworthiness of AI systems. The current version of the library offers a set of techniques to easily measure and mitigate bias across a variety of tasks.
To get started, we first import the necessary libraries and data. We will use the Holistic AI library to measure and mitigate bias, sklearn to train and test our machine learning models, and Plotly to create the interactive visualisations. To demonstrate the process, we will use a data set centred on the law school bar pass rates of white and non-white students, with race and gender as protected attributes. We pay special attention to race in this case, as preliminary exploration hints at strong inequality along this sensitive attribute.
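The sketch below shows one way this setup might look. It assumes the law school data is available as a CSV with illustrative column names ('lsat', 'ugpa', 'race', 'pass_bar') – the file name and columns are hypothetical placeholders, so adjust them to your actual data source:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Hypothetical loading step: assumed file name and column names
df = pd.read_csv("law_school.csv")

X = df[["lsat", "ugpa"]]   # features (illustrative)
y = df["pass_bar"]         # binary target: passed the bar (1) or not (0)

# Boolean membership vectors for the protected attribute of interest
group_a = (df["race"] == "non-white").to_numpy()  # disadvantaged group
group_b = (df["race"] == "white").to_numpy()      # advantaged group

# Hold out a test set, keeping the group indicators aligned with it
X_train, X_test, y_train, y_test, ga_train, ga_test, gb_train, gb_test = (
    train_test_split(X, y, group_a, group_b, test_size=0.3, random_state=42)
)
```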
In our interactive visualisation, we will cover bias metrics and accuracy. When building machine learning models, it is important to assess their accuracy – how well the model's predictions match the true outcomes, ideally measured on data held out from training. However, accuracy alone is not enough to determine the trustworthiness of a machine learning model. We also need to assess whether the model is biased and whether that bias leads to the unfair treatment of certain groups of people – in our example, non-white applicants to law school.
The code below details how these metrics can be applied.
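This is a minimal sketch, building on the data setup above. It uses the disparate_impact metric from holisticai's bias metrics module, which follows the library's group-membership convention (two boolean group vectors plus the predictions); check the exact signature against the current documentation for your installed version:

```python
from holisticai.bias.metrics import disparate_impact

# Train a baseline classifier
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Accuracy on held-out data
acc = accuracy_score(y_test, y_pred)

# Disparate impact: ratio of positive-outcome rates between the
# disadvantaged group (group_a) and the advantaged group (group_b).
# A value of 1 indicates parity; values below ~0.8 are commonly
# flagged as problematic.
di = disparate_impact(ga_test, gb_test, y_pred)

print(f"Accuracy:         {acc:.3f}")
print(f"Disparate impact: {di:.3f}")
```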
In the penultimate step, we create interactive plots that visualise the bias metrics and accuracy of our machine learning model side by side, letting us examine the relationship between the two. How does the accuracy-bias trade-off change with variations in model parameters? The figure illustrates this relationship by showing how disparate impact changes as the number of estimators of a 'random forest' model – a machine learning algorithm that combines multiple decision trees – increases.
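One way to produce such a figure with Plotly is sketched below, sweeping an illustrative grid of estimator counts and reusing the model and metric calls from the previous step; accuracy is drawn on the left axis and disparate impact on a secondary right axis so the trade-off is visible at a glance:

```python
import plotly.graph_objects as go

estimator_counts = [10, 25, 50, 100, 200, 400]  # illustrative grid
accuracies, impacts = [], []

for n in estimator_counts:
    model = RandomForestClassifier(n_estimators=n, random_state=42)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    accuracies.append(accuracy_score(y_test, y_pred))
    impacts.append(disparate_impact(ga_test, gb_test, y_pred))

# Two traces on a shared x-axis, with disparate impact on a second y-axis
fig = go.Figure()
fig.add_trace(go.Scatter(x=estimator_counts, y=accuracies,
                         name="Accuracy", mode="lines+markers"))
fig.add_trace(go.Scatter(x=estimator_counts, y=impacts,
                         name="Disparate impact", mode="lines+markers",
                         yaxis="y2"))
fig.update_layout(
    title="Accuracy vs. disparate impact by number of estimators",
    xaxis_title="Number of estimators",
    yaxis=dict(title="Accuracy"),
    yaxis2=dict(title="Disparate impact", overlaying="y", side="right"),
)
fig.show()
```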
As the finished product above shows, in this article we have demonstrated how to build interactive bias measuring and mitigation plots in Python using the Holistic AI, sklearn and Plotly libraries. By creating simple visualisations and presenting the data in an engaging manner, we can better understand the results of the bias mitigation techniques used and gain essential insights into the performance of our machine learning models.
While we focused on a specific data set in this example, you can use different configurations to suit your needs. To see the interactive graph in action, access the code via this Colab link.
Discover additional methods for assessing and addressing bias by exploring the Holistic AI Library, an open-source tool aimed at enhancing the trustworthiness and transparency of AI systems.