Champions of the school of thought known as responsible AI dream of a future in which artificial intelligence is free from pre-existing societal biases – and there is now a plethora of resources available to help actualise this vision.
In this blog post, we explore how two of them, the Holistic AI and PyTorch libraries, can be used to implement bias mitigation strategies for artificial neural networks.
Debiasing artificial neural networks
The Holistic AI Library is an open-source resource that provides tools for analysing and visualising bias in machine learning models. It offers various metrics to evaluate model performance and identify potential biases, such as statistical parity, equal opportunity, and equalised odds. PyTorch, meanwhile, is a popular machine learning library that provides tools for building and training neural network models.
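To make the first of these metrics concrete: statistical parity measures the difference in positive-prediction rates between two groups, with a value of zero indicating parity. The following minimal sketch computes it by hand on invented toy arrays (the Holistic AI library provides a ready-made equivalent):
import numpy as np
# Toy predictions and group membership (hypothetical values, for illustration only)
y_pred = np.array([1, 0, 1, 1, 0, 1, 0, 0])
group_a = np.array([True, True, True, True, False, False, False, False])
group_b = ~group_a
# Statistical parity: difference in positive-prediction rates between the groups
selection_rate_a = y_pred[group_a].mean()
selection_rate_b = y_pred[group_b].mean()
print(selection_rate_a - selection_rate_b)  # 0.75 - 0.25 = 0.5; 0 would indicate parity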
In this example, we first train a baseline model without any mitigation strategy. We then retrain the model with a mitigation strategy applied and compare the two using bias metrics.
Load data and simple pre-processing
Our objective is to create an architecture that can mitigate biases present in training data. To achieve this, we will use PyTorch to build a multilayer perceptron (MLP) model. An MLP is a neural network consisting of multiple layers of interconnected nodes, which can learn complex relationships between input and output data.
For this implementation, we are using the German Credit Dataset. The German Credit Dataset is a well-known dataset within machine learning and data analysis that contains information about credit applications submitted to a German bank. The dataset is often used as a benchmark for evaluating different machine learning algorithms and models for credit risk assessment.
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.model_selection import train_test_split
from holisticai.bias.metrics import classification_bias_metrics
# Load the German Credit dataset
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/statlog/german/german.data"
names = ["checking_account", "duration", "credit_history", "purpose", "credit_amount",
"savings_account", "employment", "installment_rate", "personal_status", "other_debtors",
"residence", "property", "age", "other_installment_plans", "housing",
"number_credits", "job", "people_liable", "telephone", "foreign_worker", "class"]
df = pd.read_csv(url, sep=" ", header=None, names=names)
# Preprocess the data
X = pd.get_dummies(df.drop("class", axis=1))
y = df["class"].replace({1: 1, 2: 0})
group = "foreign_worker"
group_a = df[group] == 'A202'
group_b = df[group] == 'A201'
data_ = [X, y, group_a, group_b]
# Split the data (features, labels, and group membership together) into train and test sets
dataset = train_test_split(*data_, test_size=0.2, shuffle=True)
train_data = dataset[::2]
test_data = dataset[1::2]
X_train, y_train, _, _ = train_data
X_test, y_test, _, _ = test_data
# Convert the data to PyTorch tensors
X_train = torch.from_numpy(X_train.values.astype(np.float32)).float()
X_test = torch.from_numpy(X_test.values.astype(np.float32)).float()
y_train = torch.from_numpy(y_train.values.astype(np.float32)).float().view(-1, 1)
y_test = torch.from_numpy(y_test.values.astype(np.float32)).float().view(-1, 1)
Model training
# Define the neural network architecture
class NeuralNet(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        super(NeuralNet, self).__init__()
        self.fc1 = nn.Linear(input_dim, hidden_dim)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        out = self.fc1(x)
        out = self.relu(out)
        out = self.fc2(out)
        return out
# Define the model, loss function, and optimizer
input_dim = X_train.shape[1]
hidden_dim = 10
output_dim = 1
net = NeuralNet(input_dim, hidden_dim, output_dim)
criterion = nn.BCEWithLogitsLoss()
optimizer = optim.Adam(net.parameters(), lr=0.01)
# Train the model
num_epochs = 100
for epoch in range(num_epochs):
    optimizer.zero_grad()
    outputs = net(X_train)
    loss = criterion(outputs, y_train)
    loss.backward()
    optimizer.step()
    if (epoch+1) % 10 == 0:
        print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch+1, num_epochs, loss.item()))
# Test the model
with torch.no_grad():
    outputs = net(X_test)
    # the network outputs logits, so apply a sigmoid before thresholding at 0.5
    y_pred = (torch.sigmoid(outputs) > 0.5).float()
    accuracy = (y_pred == y_test).float().mean()
    print('Accuracy: {:.4f}'.format(accuracy))
y_pred = y_pred.numpy()
X_test, y_test, group_a, group_b = test_data
# baseline metrics for equality of outcomes and equality of opportunity
metrics_baseline = classification_bias_metrics(group_a, group_b, y_pred, y_test, metric_type='both')
Training with a mitigator
In this implementation we use the Disparate Impact Remover, a pre-processing bias mitigation strategy. This method edits feature values so that the distributions of the two groups become more similar, while preserving the rank order of observations within each group.
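Before applying the library, the following toy sketch shows the intuition behind the repair for a single numeric feature: each group's values are mapped toward the pooled quantile function, so the group distributions become similar while within-group ranks are preserved. This is an illustrative simplification, not the library's implementation; the repair_level parameter mirrors the library's, interpolating between no repair (0.0) and full repair (1.0).
import numpy as np
def repair_feature(x, mask_a, repair_level=1.0):
    # Illustrative sketch only: move each group's values toward the pooled
    # quantile function, keeping each observation's rank within its group.
    x = x.astype(float)
    repaired = x.copy()
    for mask in (mask_a, ~mask_a):
        vals = x[mask]
        # within-group rank of each value, expressed as a quantile in (0, 1)
        quantiles = (np.argsort(np.argsort(vals)) + 0.5) / len(vals)
        # target: the pooled distribution evaluated at those quantiles
        target = np.quantile(x, quantiles)
        repaired[mask] = (1 - repair_level) * vals + repair_level * target
    return repaired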
# import mitigation strategy
from holisticai.bias.mitigation import DisparateImpactRemover
X_train, y_train, group_a, group_b = train_data
preprocessing_mitigator = DisparateImpactRemover(repair_level=1.0)
fit_params = {"group_a": group_a, "group_b": group_b}
X = preprocessing_mitigator.fit_transform(X_train, **fit_params)
X_train = torch.from_numpy(X).float()
y_train = torch.from_numpy(y_train.values.astype(np.float32)).float().view(-1, 1)
# Re-initialise the network and optimizer so the mitigated model is retrained from scratch
net = NeuralNet(input_dim, hidden_dim, output_dim)
optimizer = optim.Adam(net.parameters(), lr=0.01)
# Retrain the model on the repaired data
num_epochs = 100
for epoch in range(num_epochs):
    optimizer.zero_grad()
    outputs = net(X_train)
    loss = criterion(outputs, y_train)
    loss.backward()
    optimizer.step()
    if (epoch+1) % 10 == 0:
        print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch+1, num_epochs, loss.item()))
# Apply the same repair to the test features
X_test, y_test, group_a, group_b = test_data
fit_params = {"group_a": group_a, "group_b": group_b}
X = preprocessing_mitigator.transform(X_test, **fit_params)
X_test = torch.from_numpy(X).float()
y_test_t = torch.from_numpy(y_test.values.astype(np.float32)).float().view(-1, 1)
# Test the model
with torch.no_grad():
    outputs = net(X_test)
    # apply a sigmoid to the logits before thresholding, as in the baseline
    y_pred = (torch.sigmoid(outputs) > 0.5).float()
    accuracy = (y_pred == y_test_t).float().mean()
    print('Accuracy: {:.4f}'.format(accuracy))
y_pred = y_pred.numpy()
# mitigated metrics for equality of outcomes and equality of opportunity
metrics_mitigated = classification_bias_metrics(group_a, group_b, y_pred, y_test, metric_type='both')
Results comparison
As the use of machine learning continues to soar across a wide range of applications, it is critical to develop and implement effective bias mitigation strategies so that these models are fair. As this blog has shown, the Holistic AI and PyTorch libraries offer an effective way to achieve this for artificial neural networks. We can illustrate the effect of the mitigation by building a table that compares the two sets of results.
# create a table to compare the results
results = pd.concat([metrics_baseline['Value'], metrics_mitigated[['Value', 'Reference']]], axis=1)
results.columns = ['Baseline', 'Mitigated', 'Reference']
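Printing the table places the baseline and mitigated values of each bias metric side by side, together with the reference value each metric takes for an unbiased model:
print(results)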
Summary
Once the MLP model is built, the Holistic AI library can be used to analyse its performance and identify any biases that may be present. The open-source resource provides a set of tools that can mitigate these biases, such as reweighting the training data, adjusting the decision threshold, or even modifying the model architecture with in-processing strategies.
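As a toy illustration of the decision-threshold idea, the hypothetical helper below (not part of the library's API) picks a separate cut-off for each group so that both groups are selected at the same target rate:
import numpy as np
def group_thresholds(scores, mask_a, target_rate):
    # Hypothetical sketch: choose per-group thresholds so that each group's
    # selection rate equals the same target rate.
    thr_a = np.quantile(scores[mask_a], 1 - target_rate)
    thr_b = np.quantile(scores[~mask_a], 1 - target_rate)
    return np.where(mask_a, scores >= thr_a, scores >= thr_b).astype(int)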
Combining the Holistic AI Library with PyTorch allows us to create a robust machine learning model with reduced bias that can be used in various real-world applications, helping to ensure that the decisions it makes are fair and equitable for all individuals, regardless of their race, gender, or other personal attributes.