In our recent peer-reviewed paper, Holistic AI researchers and additional contributors highlight the importance of explainability in AI, defining explainability in terms of interpretability. After a discussion of the existing literature, we introduce a set of computational, model-agnostic metrics to support explainability in AI, and we then apply these metrics in a series of experiments. The metrics include:
- Feature Importance-based Metrics: Evaluate the distribution and stability of feature importance across the model's features, using concepts like entropy and divergence to measure how spread out or concentrated that importance is.
- Partial Dependence Curve-based Metric: Assesses the simplicity of the model's response as a function of individual features, using the second derivative of the partial dependence curve to measure non-linearity.
- Surrogacy Model-based Metric: Evaluates the efficacy of simple, interpretable models (surrogates) in approximating the predictions of more complex models.
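To make the first family of metrics concrete, here is a minimal sketch of an entropy-based feature-importance metric. The function name and the normalization by maximum entropy are our own illustrative choices, not the paper's exact formulation: importance concentrated on a few features yields a score near 0, while importance spread uniformly across all features yields a score of 1.

```python
import numpy as np

def importance_entropy(importances):
    """Normalized Shannon entropy of a feature-importance vector.

    Near 0: importance concentrated on one feature (easier to explain).
    Near 1: importance spread uniformly across all features.
    """
    p = np.abs(importances) / np.sum(np.abs(importances))
    p = p[p > 0]  # drop zero entries; 0 * log(0) is treated as 0
    entropy = -np.sum(p * np.log(p))
    return entropy / np.log(len(importances))  # divide by the maximum possible entropy

# Concentrated importance scores low; uniform importance scores high.
print(importance_entropy([0.97, 0.01, 0.01, 0.01]))  # small, close to 0
print(importance_entropy([0.25, 0.25, 0.25, 0.25]))  # ~1.0 (uniform spread)
```

A divergence-based variant would instead compare the importance distribution against a uniform reference, but the entropy form above captures the same intuition of spread versus concentration.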
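The partial dependence curve-based metric can be sketched with finite differences. This is an illustrative reconstruction, not the paper's exact estimator: we trace the partial dependence curve for one feature by holding it fixed at grid values and averaging the model's predictions, then take the mean absolute second derivative as a non-linearity score (0 for a linear response, larger for a more curved one).

```python
import numpy as np

def pd_nonlinearity(model_fn, X, feature, grid_size=50):
    """Mean absolute second derivative of a feature's partial dependence curve.

    0 for a perfectly linear response; larger values indicate a more
    curved, harder-to-summarize dependence on the feature.
    """
    grid = np.linspace(X[:, feature].min(), X[:, feature].max(), grid_size)
    pd_curve = []
    for value in grid:
        Xg = X.copy()
        Xg[:, feature] = value                 # hold the feature fixed at the grid value
        pd_curve.append(model_fn(Xg).mean())   # average prediction over the data
    # Second derivative via two applications of numpy's finite-difference gradient.
    second_deriv = np.gradient(np.gradient(pd_curve, grid), grid)
    return np.mean(np.abs(second_deriv))

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 3))
linear = lambda X: 2 * X[:, 0] + X[:, 1]   # toy stand-ins for a fitted model
quadratic = lambda X: X[:, 0] ** 2
print(pd_nonlinearity(linear, X, feature=0))     # close to 0
print(pd_nonlinearity(quadratic, X, feature=0))  # close to 2 (edge effects aside)
```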
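Finally, the surrogacy idea can be illustrated with a simple fidelity score. In this sketch (our own simplification, assuming a linear surrogate rather than whichever interpretable model the experiments use), we fit a linear model to the black box's predictions and report R²: a value near 1 means a simple linear summary reproduces the complex model well.

```python
import numpy as np

def surrogate_fidelity(model_fn, X):
    """R^2 of a linear surrogate fitted to a black-box model's predictions.

    Close to 1: a plain linear model reproduces the black box well.
    Close to 0: no simple linear summary of the model exists.
    """
    y = model_fn(X)                             # surrogate targets are the model's outputs, not true labels
    A = np.column_stack([X, np.ones(len(X))])   # design matrix with an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    y_hat = A @ coef
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1 - ss_res / ss_tot

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 4))
nearly_linear = lambda X: 3 * X[:, 0] - X[:, 2]      # toy black boxes
nonlinear = lambda X: np.sin(3 * X[:, 0]) * X[:, 1]
print(surrogate_fidelity(nearly_linear, X))  # close to 1
print(surrogate_fidelity(nonlinear, X))      # much lower
```

Swapping the linear surrogate for a shallow decision tree follows the same pattern: fit the simple model on the black box's predictions and score how closely it tracks them.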