In our ongoing series on Enhancing Transparency in AI, we delve into the crucial aspect of understanding machine learning model outcomes through explainability metrics. For an overview of why explainability metrics are important for transparent and trustworthy AI, check out our guide on explainability metrics using the Holistic AI Library. For a deeper dive into our original research on measuring explainability in machine learning, explore our research paper on explainability.
In this article, we shed light on the metrics derived from SHAP feature importance, providing a comprehensive understanding of your model’s performance.
Additive feature attribution methods are well defined in the literature and have numerous applications, and several models adhere to this additive attribution principle. The Shapley Additive Explanations (SHAP) method leverages Shapley values to calculate feature attributions. To explain this, let’s define the key concepts:
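As a brief refresher, this is the standard formulation from the SHAP literature (Lundberg and Lee, 2017), not code from the library: an additive feature attribution method explains a prediction with a linear function of simplified binary inputs, and the Shapley value of a feature averages its marginal contribution over all subsets of the other features.

```latex
% Additive explanation model: g is linear in the simplified binary inputs
% z'_i (1 = feature present, 0 = feature absent), with attributions \phi_i.
g(z') = \phi_0 + \sum_{i=1}^{M} \phi_i z'_i

% Shapley value of feature i: its marginal contribution to the model output,
% averaged over all subsets S of the feature set F that do not contain i.
\phi_i = \sum_{S \subseteq F \setminus \{i\}}
         \frac{|S|!\,(|F| - |S| - 1)!}{|F|!}
         \left[ f_{S \cup \{i\}}\bigl(x_{S \cup \{i\}}\bigr) - f_S\bigl(x_S\bigr) \right]
```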
After installing the HAI library, we need to load the dataset and split the data into train and test sets. In this tutorial, we use the Law School dataset. We have used this dataset previously with the bias metrics from our library, but here we focus on explainability.
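A minimal sketch of this step is shown below; the CSV path and the use of scikit-learn’s train_test_split are assumptions, so adapt them to however you obtain the dataset (for example, through the loaders shipped with the library).

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Load the Law School dataset (the CSV path is a placeholder).
df = pd.read_csv("law_school.csv")

# 'bar' is the binary target: whether the student passed the bar exam.
X = df.drop(columns=["bar"])
y = df["bar"]

# Hold out a test set so we can evaluate the model later.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)
```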
The goal with this dataset is to predict the binary attribute ‘bar’ (whether a student passes the bar exam). We can use a machine learning model for this classification, and in this tutorial we use a simple logistic regression model:
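A minimal sketch of the model, assuming the features have already been numerically encoded:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Scale the features and fit a simple logistic regression classifier.
model = Pipeline(
    steps=[
        ("scaler", StandardScaler()),
        ("classifier", LogisticRegression(max_iter=1000)),
    ]
)
model.fit(X_train, y_train)
```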
As can be seen in the figure below, based on the performance metrics calculated in the last step of the code below, the logistic regression model performed reasonably well.
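For instance, standard scikit-learn metrics can be used to check performance on the held-out test set (the specific metrics and values reported in the original figure may differ from this sketch):

```python
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

# Evaluate the trained model on the held-out test set.
y_pred = model.predict(X_test)
y_proba = model.predict_proba(X_test)[:, 1]

print("Accuracy:", accuracy_score(y_test, y_pred))
print("F1 score:", f1_score(y_test, y_pred))
print("ROC AUC :", roc_auc_score(y_test, y_proba))
```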
The Explainer class is used to compute metrics and generate graphs related to those metrics. A few parameters are important for a successful setup. The “based_on” parameter defines the type of strategy that will be used; in this case, we use strategies based on feature importance. The “strategy_type” parameter selects the strategy type, namely SHAP. Additionally, we need to define the model type (binary_classification), the model object, the features used in training (X_train), and the targets used in training (y_train).
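The snippet below sketches this instantiation; the import path and parameter names follow the description above, but verify them against the version of the Holistic AI library you have installed.

```python
from holisticai.explainability import Explainer  # module path may vary by library version

# Feature-importance-based strategy, computed with SHAP, for a binary
# classification model trained on (X_train, y_train).
explainer = Explainer(
    based_on="feature_importance",
    strategy_type="shap",
    model_type="binary_classification",
    model=model,
    x=X_train,
    y=y_train,
)
```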
After instantiating the explainability object for the model results, we can compute the metrics. With the HAI library, this process is simplified through the metrics function. In this example, we use the parameter detailed=True to visualize the results for labels 0 and 1.
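A sketch of the call, following the description above (the exact shape of the returned table depends on the library version):

```python
# detailed=True breaks the metrics down for labels 0 and 1.
metrics = explainer.metrics(detailed=True)
print(metrics)
```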
As can be seen below, the values computed are relatively small and close to the target value of 0.
Another important capability of the Explainer object is plotting. As an example, the following code snippets show the bar plot with the feature importance ranking and the box plots for data stability and feature stability.
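The method names below are assumptions based on the plots described in this section; confirm them against the library’s documentation before running the sketch.

```python
import matplotlib.pyplot as plt

# Bar plot with the feature importance ranking
# (method name is an assumption; check the Holistic AI documentation).
explainer.bar_plot(max_display=10)
plt.show()

# Box plots for data stability and feature stability
# (method name is an assumption as well).
explainer.box_plot()
plt.show()
```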
In this tutorial, we focused on the SHAP feature importance tool within the Holistic AI library’s explainability module. We learned how to calculate explainability metrics and generate visualizations that reveal the key factors influencing a model’s predictions. Remember, these techniques offer the valuable benefit of adaptability, allowing you to apply them across diverse datasets and scenarios.
Interested in exploring explainable AI further? Reach out for a demo of our AI governance platform, read our paper on explainability metrics for AI, or start exploring metrics on your own data using the Holistic AI Library.
DISCLAIMER: This blog article is for informational purposes only. This blog article is not intended to, and does not, provide legal advice or a legal opinion. It is not a do-it-yourself guide to resolving legal issues or handling litigation. This blog article is not a substitute for experienced legal counsel and does not provide legal advice regarding any situation or employer.