Enhancing Transparency: Explainability Metrics via Permutation Feature Importance using Holistic AI Library

Authored by
Kleyton da Costa
Machine Learning Researcher at Holistic AI
Published on
Oct 24, 2023

In machine learning, explainability metrics are crucial because they provide insight into how a model makes decisions, fostering transparency, trust, and accountability in AI systems.

One of the easiest ways to compute explainability metrics for machine learning models is through the Holistic AI Library, an open-source tool designed to assess and improve the trustworthiness of AI systems.

In this blog post, we will introduce the concept of permutation feature importance and explain how to build more transparent machine learning models with it. Additionally, we will showcase a practical application of explainability metrics.

What is permutation feature importance?

Permutation feature importance is a widely used technique for machine learning explainability. Unlike model-specific methods, it is model-agnostic: it can be applied to many types of predictive algorithms, such as linear regression, random forests, support vector machines, and neural networks. This makes it particularly useful when comparing different kinds of models, because the same procedure shows which features each one relies on.

Calculating permutation feature importance involves shuffling the values of a single feature while keeping all other features unchanged. By re-evaluating the model on the shuffled data, we can observe how much the shuffling degrades the model's predictive accuracy or other performance metric.

A feature that significantly affects the model’s performance will demonstrate a considerable drop in predictive accuracy when its values are permuted, highlighting its importance.
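
The procedure described above can be sketched in a few lines. This is a minimal illustration using NumPy, not the Holistic AI implementation: the function name `permutation_feature_importance`, the toy data, and the `toy_predict` stand-in model are all assumptions made for the example, and accuracy is assumed as the performance metric.

```python
import numpy as np


def permutation_feature_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when each column of X is shuffled in turn."""
    rng = np.random.default_rng(seed)
    baseline = np.mean(predict(X) == y)  # accuracy on the untouched data
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffle only column j; every other feature stays as it was.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops.append(baseline - np.mean(predict(X_perm) == y))
        importances[j] = np.mean(drops)
    return importances


# Toy data: the label depends on feature 0 only; feature 1 is pure noise.
data_rng = np.random.default_rng(42)
X_toy = data_rng.normal(size=(500, 2))
y_toy = (X_toy[:, 0] > 0).astype(int)


def toy_predict(X):
    """Hypothetical stand-in for a fitted model's predict method."""
    return (X[:, 0] > 0).astype(int)


scores = permutation_feature_importance(toy_predict, X_toy, y_toy)
print(scores)  # feature 0 shows a large drop, feature 1 shows none
```

Averaging over several shuffles (`n_repeats`) reduces the noise that a single random permutation would introduce.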

Explainability metrics with the Holistic AI library

In this application, we will train two classification models (Random Forest and AdaBoost) on the commonly used Adult dataset. The problem posed by this dataset is as follows: given a set of financial and personal characteristics of an individual, how should we classify them by income: do they earn more than 50K per year, or 50K or less?

The first step is simplified data preprocessing. The main change in this dataset will be to perform a transformation on the categorical variables using the `pd.get_dummies` function.


import pandas as pd
from holisticai.datasets import load_adult
 
# load dataset 
dataset = load_adult() 
 
# concat dataset in a single dataframe 
df = pd.concat([dataset["data"], dataset["target"]], axis=1) 
output_variable = ["class"] 
 
# split data in X and y 
y = df[output_variable].replace({">50K": 1, "<=50K": 0}) 
X = pd.get_dummies(df.drop(output_variable, axis=1)) 
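
The post does not show the training step for the two classifiers, so the following is only a sketch of how it might look, assuming scikit-learn. The helper `train_models`, the 70/30 split, and the default hyperparameters are assumptions, and synthetic data stands in for the Adult dataset so that the block runs on its own.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split


def train_models(X, y, seed=42):
    """Fit the two classifiers named in the post (assumed default settings)."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )
    models = {
        "random_forest": RandomForestClassifier(random_state=seed),
        "adaboost": AdaBoostClassifier(random_state=seed),
    }
    scores = {}
    for name, clf in models.items():
        clf.fit(X_train, y_train)
        scores[name] = clf.score(X_test, y_test)  # held-out accuracy
    return models, scores


# Synthetic stand-in data; with the Adult data, pass the X and y built above
# (y flattened to one dimension, e.g. y.values.ravel()).
X_demo, y_demo = make_classification(n_samples=400, n_features=8, random_state=0)
models, scores = train_models(X_demo, y_demo)
model = models["random_forest"]
print(scores)
```

Either fitted classifier can then be used as the `model` whose feature importance is analyzed.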

Next, we will use the Explainer class to compute the permutation feature importance of the model. After instantiating the Explainer class, we can calculate the metrics and generate the result plots.

We can set `strategy_type` to `permutation`, `surrogate`, `lime`, `shap`, or `shap-text`. The following example uses `permutation`.


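from holisticai.explainability import Explainer

# `model` must be an already fitted classifier (for example, the Random Forest)
explainer = Explainer(based_on='feature_importance',
                      strategy_type='permutation',
                      model_type='binary_classification',
                      model=model,
                      x=X,
                      y=y)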

A quick way to compute explainability metrics with the Holistic AI Library is to call the metrics function of the explainer object, which returns all feature importance metrics in a single step.


explainer.metrics() 

  • Fourth Fifths: Shows that 68% of features properly explain the model output.
  • Importance Spread Divergence: Shows the entropy of global feature importance. This metric is of interest when we compare it with another model.
  • Importance Spread Ratio: Shows that feature importance is not concentrated in a few features. This happens because the metric result is close to 1 (uniform importance spread) instead of close to 0 (high importance concentration).
  • Global Overlap Score: This score suggests that the feature importance rankings change frequently. On average, only 8.13% of features kept the same ranking position as in the global feature importance.
  • Global Range Overlap Score: Indicates that feature importance rankings remain consistent near specific data subsets, with 88.65% of features retaining their rank within these subsets.
  • Global Similarity Score: The feature importance values differ sharply between classes. This can be observed by comparing the rankings in the feature importance tables below. For example, the feature age has high importance for label=0 but very low importance for label=1 and in the global importance.
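
To make the spread-style metrics in the list above more concrete, here is a small sketch. The definitions are assumptions chosen to match the descriptions in this post (a normalized entropy for the spread ratio, and the share of features needed to reach 80% of total importance for Fourth Fifths); they are not necessarily the exact formulas used by the Holistic AI library, and both function names are hypothetical.

```python
import numpy as np


def spread_ratio(importances):
    """Entropy of the normalized importances relative to a uniform spread."""
    p = np.asarray(importances, dtype=float)
    p = p / p.sum()
    nonzero = p[p > 0]  # 0 * log(0) is treated as 0
    entropy = -np.sum(nonzero * np.log(nonzero))
    return entropy / np.log(len(p))


def share_of_features_for(importances, threshold=0.8):
    """Fraction of features needed to reach `threshold` of total importance."""
    p = np.sort(np.asarray(importances, dtype=float))[::-1]  # largest first
    cumulative = np.cumsum(p) / p.sum()
    n_needed = int(np.searchsorted(cumulative, threshold)) + 1
    return n_needed / len(p)


uniform = [1, 1, 1, 1]
concentrated = [0.97, 0.01, 0.01, 0.01]
print(spread_ratio(uniform))       # close to 1: importance is evenly spread
print(spread_ratio(concentrated))  # much lower: one feature dominates
print(share_of_features_for([0.6, 0.3, 0.05, 0.05]))
```

Under these assumed definitions, a model whose importance is spread evenly scores near 1 on the ratio, while a model dominated by a single feature scores much lower.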

# feature importance table sorted by label=0 
explainer.feature_importance_table(sorted_by='[label=0]')

Figure: Feature importance table ranking by label=0


# feature importance table sorted by label=1 
explainer.feature_importance_table(sorted_by='[label=1]')

Figure: Feature importance table ranking by label=1

  • Global Explainability Ease Score: This metric shows that the curves of the partial dependence plot are simple to interpret for this model. The plot can be generated with the following code:

import matplotlib.pyplot as plt

_, ax = plt.subplots(figsize=(15, 5))
# the parameter "last" defines the number of features
explainer.partial_dependence_plot(last=3, ax=ax, kind='both')

A partial dependence plot is a useful tool for understanding how an independent variable (or a set of variables) affects the prediction of a machine learning model while keeping other variables constant.
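
As a rough illustration of what sits behind such a plot, the sketch below computes a partial dependence curve by hand with NumPy. It is not the library's implementation: the function `partial_dependence`, the toy data, and the `toy_predict` model are assumptions made for the example.

```python
import numpy as np


def partial_dependence(predict, X, feature, grid):
    """Average prediction when `feature` is forced to each value in `grid`."""
    averages = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value  # overwrite one column, keep the rest
        averages.append(predict(X_mod).mean())
    return np.array(averages)


def toy_predict(X):
    """Hypothetical model: depends on features 0 and 1, ignores feature 2."""
    return 2.0 * X[:, 0] + X[:, 1]


rng = np.random.default_rng(0)
X_toy = rng.normal(size=(200, 3))
grid = np.array([-1.0, 0.0, 1.0])

pdp_strong = partial_dependence(toy_predict, X_toy, feature=0, grid=grid)
pdp_flat = partial_dependence(toy_predict, X_toy, feature=2, grid=grid)
print(pdp_strong)  # rises by 2 per unit step: a strong positive effect
print(pdp_flat)    # constant: this feature has no effect on the output
```

Each point on the curve is an average over the whole dataset, which is why the plot reflects the overall effect of a feature rather than its effect on any single prediction.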

And how do we interpret the curve? If the curve goes up or down, it indicates the direction of the effect of the independent variable on the model’s response. If the curve is flat, the independent variable has relatively little impact on the model’s output. The slope of the curve shows the size of the effect – a steep slope indicates a strong effect, while a gentle slope indicates a weaker effect.

In this case, we can see that education-num and marital-status_Married-civ-spouse have a positive impact on predictions. In contrast, capital-gain has a strong impact on the model’s output over a certain range of values, after which the curve becomes flat.

Summary

In this article, we highlighted a key feature of the Holistic AI library's explainability module: permutation feature importance. By computing specific explainability metrics and generating partial dependence plots for various features, we offered a method to enhance transparency in intelligent systems across diverse datasets and contexts.

DISCLAIMER: This blog article is for informational purposes only. This blog article is not intended to, and does not, provide legal advice or a legal opinion. It is not a do-it-yourself guide to resolving legal issues or handling litigation. This blog article is not a substitute for experienced legal counsel and does not provide legal advice regarding any situation or employer.
