Visualising Bias Metrics: Insights from Holistic AI's Open-Source Library

Authored by

Machine Learning Researcher at Holistic AI

Published on

August 21, 2023

Last updated on

August 21, 2024

Unveiling bias metrics through the open-source Holistic AI Library

As artificial intelligence models see wider use and their results increasingly need to be validated, there is a growing need for visualisations that help explain both the model itself and its decision-making process.

The purpose of this blog post is to demonstrate how to generate graphs for the results of bias metrics calculated through the Holistic AI Library, an open-source resource for improving the trustworthiness of AI systems.

For our example, we will use a regression task for machine learning models.

Report for regression tasks

In this example, we'll tackle a regression challenge using the Adult Dataset. This well-known dataset is available in the Holistic AI Library and is widely used for conducting analyses with machine learning models.

Below is the code used to generate the visualisations.


# base imports
import pandas as pd

from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# report plots, bias metrics and dataset loader from the Holistic AI Library
from holisticai.bias.plots import bias_metrics_report
from holisticai.bias.metrics import regression_bias_metrics
from holisticai.datasets import load_adult

# load the Adult dataset as a DataFrame
df = load_adult()['frame']

# numeric features used by the model
x = df[['capital-gain', 'capital-loss', 'hours-per-week']]

# one-hot encode the protected attribute; OneHotEncoder orders categories
# alphabetically ('Female' before 'Male'), so the columns are named in that order
encoder = OneHotEncoder()
enc = encoder.fit_transform(df[['sex']])
enc = pd.DataFrame(enc.toarray(), columns=['sex_female', 'sex_male'], index=x.index)
x_t = pd.concat([x, enc], axis=1)

# standardised copy of the features; note that the split below uses the
# unscaled x_t, which keeps the one-hot group columns as 0/1 indicators
scaler = StandardScaler()
x_scaled = scaler.fit_transform(x_t)
x_scaled = pd.DataFrame(x_scaled, columns=x_t.columns)

# regression target: standardised 'fnlwgt', scaled with its own scaler
y_scaler = StandardScaler()
y = y_scaler.fit_transform(df[['fnlwgt']]).ravel()

x_train, x_test, y_train, y_test = train_test_split(x_t, y, test_size=0.3, random_state=0)

# binary membership vectors for the two groups being compared
group_a = x_test['sex_male']
group_b = x_test['sex_female']

# baseline model (no mitigation)
model = LinearRegression()
model.fit(x_train, y_train)
y_pred = model.predict(x_test)
y_true = y_test

# compute the table of bias metrics for the baseline predictions
metrics = regression_bias_metrics(group_a, group_b, y_pred, y_true, metric_type='both')

Baseline metrics (without mitigation)

The results table for the baseline model (without a mitigation strategy) is useful for evaluating the model, but a plot usually makes the results easier to interpret.


bias_metrics_report('regression', metrics)
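To make concrete what such a report helps a reader judge, the sketch below ranks a few metrics by how far each value lies from its ideal fairness reference (1 for ratios, 0 for differences). It is only an illustration: the numbers are made up, the helper `gaps_from_reference` is hypothetical, and it assumes each metric comes with a value and a reference. It uses only the standard library so it runs without any dependencies.

```python
def gaps_from_reference(rows):
    """Return (name, gap) pairs sorted by absolute distance from the reference."""
    gaps = [(name, value - reference) for name, value, reference in rows]
    # Largest deviation first, regardless of sign.
    return sorted(gaps, key=lambda item: abs(item[1]), reverse=True)


# Illustrative, made-up values: (metric name, value, ideal reference).
# These are NOT results from the model in this post.
example_rows = [
    ("Z-Score Difference", -0.30, 0.0),
    ("RMSE Ratio Q80", 1.12, 1.0),
    ("MAE Ratio Q80", 0.95, 1.0),
]

ranked = gaps_from_reference(example_rows)
for name, gap in ranked:
    print(f"{name:<20} gap from reference: {gap:+.2f}")
```

A bar chart of these gaps conveys the same ordering at a glance, which is the advantage a plot has over the raw table.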

Report for bias mitigation strategy

For our example, we apply a preprocessing mitigation strategy called Correlation Remover. This algorithm modifies the original dataset by eliminating any correlations with sensitive values. This is achieved by applying a linear transformation to the non-sensitive feature columns of the dataset.
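The linear transformation described above can be sketched for a single feature and a single binary sensitive attribute. This is a minimal illustration of the idea, not the library's implementation: the helper names and the toy data are hypothetical, and it uses only the standard library. The feature is regressed on the centred sensitive attribute and the fitted part is subtracted, which leaves the feature linearly uncorrelated with the attribute.

```python
def mean(values):
    return sum(values) / len(values)


def remove_correlation(feature, sensitive):
    """Return a copy of `feature` with its linear dependence on `sensitive` removed.

    Assumes `sensitive` is not constant (otherwise the slope is undefined).
    """
    s_mean = mean(sensitive)
    f_mean = mean(feature)
    s_centred = [s - s_mean for s in sensitive]
    # Least-squares slope of the feature on the centred sensitive attribute.
    cov = sum(sc * (f - f_mean) for sc, f in zip(s_centred, feature))
    var = sum(sc * sc for sc in s_centred)
    slope = cov / var
    # Subtract the part of the feature explained by the sensitive attribute.
    return [f - slope * sc for f, sc in zip(feature, s_centred)]


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    a_mean, b_mean = mean(a), mean(b)
    cov = sum((x - a_mean) * (y - b_mean) for x, y in zip(a, b))
    var_a = sum((x - a_mean) ** 2 for x in a)
    var_b = sum((y - b_mean) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


# Toy data (hypothetical): a 0/1 sensitive attribute and weekly hours worked.
sensitive = [1, 1, 1, 0, 0, 0, 1, 0]
hours = [45.0, 50.0, 40.0, 35.0, 30.0, 38.0, 48.0, 36.0]

hours_filtered = remove_correlation(hours, sensitive)
corr_before = pearson(hours, sensitive)
corr_after = pearson(hours_filtered, sensitive)
print(f"correlation before: {corr_before:.3f}, after: {corr_after:.3f}")
```

Because the centred attribute sums to zero, the transformation leaves the feature's mean unchanged; only the component aligned with the sensitive attribute is removed.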

The implementation of the mitigation strategy is described below.


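# apply the Correlation Remover mitigation and recompute the bias metrics
from holisticai.bias.mitigation import CorrelationRemover

corr = CorrelationRemover()

# remove correlations between the features and the group membership
test = corr.fit_transform(x_test, group_a, group_b)

# predict on the transformed features with the same baseline model
y_pred_mitigated = model.predict(test)

metrics_mitigated = regression_bias_metrics(group_a, group_b, y_pred_mitigated, y_true, metric_type='both')

# display the table of mitigated metrics
metrics_mitigated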


# plot report of bias and mitigated outputs 

bias_metrics_report('regression', metrics, metrics_mitigated)

As the graphs show, plotting the baseline and mitigated results side by side makes the effect of the mitigation easy to see. Metrics such as Z-Score Difference, RMSE Ratio Q80, and MAE Ratio Q80 indicate that the mitigation strategy improved the fairness of the model's predictions.
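For readers unfamiliar with error-ratio metrics, the sketch below computes a group RMSE ratio and a group MAE ratio by hand. It is a simplified illustration under stated assumptions: the ratio is taken as the error of group_a divided by the error of group_b, no quantile filtering (such as the Q80 variants) is applied, and the library's exact definitions may differ. The helper names and the toy data are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def group_error_ratio(metric, group_a, group_b, y_true, y_pred):
    """Error of group_a divided by error of group_b (assumed direction).

    A value of 1 means both groups are predicted equally well.
    """
    def select(mask, values):
        # Keep only the entries whose group indicator equals 1.
        return [v for m, v in zip(mask, values) if m == 1]

    err_a = metric(select(group_a, y_true), select(group_a, y_pred))
    err_b = metric(select(group_b, y_true), select(group_b, y_pred))
    return err_a / err_b


# Toy data (hypothetical): predictions are off by 1 for group_a and by 2 for group_b.
group_a = [1, 1, 1, 0, 0, 0]
group_b = [0, 0, 0, 1, 1, 1]
y_true = [10.0, 12.0, 14.0, 10.0, 12.0, 14.0]
y_pred = [11.0, 13.0, 15.0, 12.0, 14.0, 16.0]

rmse_ratio = group_error_ratio(rmse, group_a, group_b, y_true, y_pred)
mae_ratio = group_error_ratio(mae, group_a, group_b, y_true, y_pred)
print(f"RMSE ratio: {rmse_ratio:.2f}, MAE ratio: {mae_ratio:.2f}")
```

In this toy case both ratios are 0.5, meaning the model's error for group_a is half its error for group_b; a successful mitigation would move such ratios toward 1.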

The generated plots make the model's performance easier to understand and to interpret accurately. By visually representing the outcomes of bias measurements, they provide valuable insights into the model's behaviour, its potential strengths, and areas for improvement. They serve as a compass for decision-makers, guiding them toward a more comprehensive and informed evaluation of the model's fairness and effectiveness.

Explore the Holistic AI Library

The Holistic AI Library is an open-source resource designed to elevate the trustworthiness of AI systems. It provides an array of techniques tailored to measure and combat bias across diverse tasks.

It encompasses techniques across five key risk areas in total: Bias, Efficacy, Robustness, Privacy, and Explainability. The broad spectrum of tools supplied within the library enables the comprehensive assessment of AI systems and applications, providing a platform for transparent and reliable AI.

DISCLAIMER: This blog article is for informational purposes only. This blog article is not intended to, and does not, provide legal advice or a legal opinion. It is not a do-it-yourself guide to resolving legal issues or handling litigation. This blog article is not a substitute for experienced legal counsel and does not provide legal advice regarding any situation or employer.
