Model-Driven Root Cause Analysis for Trustworthy AI: A Data-and-Model-Centric Explanation Framework
This program is tentative and subject to change.
Building trust in AI systems requires not only accurate models but also mechanisms to diagnose why machine learning (ML) pipelines succeed or fail. In this work, we propose a model-driven Root Cause Analysis (RCA) framework that attributes pipeline performance to interpretable factors spanning both data properties and model configurations. Unlike post-hoc explainers that approximate black-box behavior, our approach learns a faithful, inherently interpretable meta-model using Explainable Boosting Machines (EBMs) to capture the mapping from data complexity and hyperparameters to predictive accuracy.
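Because an EBM is a generalized additive model, each prediction decomposes into an intercept plus one contribution per feature from a learned shape function, which is what makes the meta-model inherently interpretable: the attributions fall out of the model itself rather than a post-hoc approximation. A minimal pure-Python sketch of this additive decomposition (the shape functions, feature names, and coefficients below are illustrative toys, not the trained meta-model):

```python
# Illustrative additive surrogate in the spirit of an Explainable Boosting
# Machine (EBM): prediction = intercept + sum of per-feature contributions.
# The shape functions below are hand-picked toys, not learned ones.

def shape_class_overlap(x):
    # Higher class overlap -> lower predicted accuracy.
    return -0.25 * x

def shape_imbalance(x):
    # Severe imbalance hurts accuracy non-linearly.
    return -0.15 * x * x

def shape_max_depth(x):
    # Hyperparameter effect: mild, saturating benefit of deeper trees.
    return 0.02 * min(x, 10)

SHAPE_FUNCTIONS = {
    "class_overlap": shape_class_overlap,
    "imbalance_ratio": shape_imbalance,
    "max_depth": shape_max_depth,
}
INTERCEPT = 0.80  # baseline predicted accuracy

def predict_with_attributions(features):
    """Return (prediction, per-feature contributions) for one pipeline."""
    contributions = {name: fn(features[name])
                     for name, fn in SHAPE_FUNCTIONS.items()}
    return INTERCEPT + sum(contributions.values()), contributions

pred, attr = predict_with_attributions(
    {"class_overlap": 0.1, "imbalance_ratio": 0.2, "max_depth": 8}
)
```

Reading the attributions off an additive model in this way is exact by construction, which is the key contrast with surrogate explainers that only approximate a black box.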
To evaluate this framework, we curated a large-scale meta-dataset comprising 81,000 Decision Tree pipeline runs generated from 270 OpenML datasets combined with 300 hyperparameter configurations. The RCA meta-model achieved high predictive fidelity (R² = 0.90, MAE = 0.030), far outperforming a mean-regressor baseline (R² ≈ 0). This fidelity ensures that feature attributions reflect genuine performance determinants rather than artifacts of an ill-fitting surrogate.
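The fidelity check follows the standard definitions: MAE is the mean absolute error of the surrogate's predictions, and R² compares the surrogate's squared error against that of a mean-regressor baseline, whose R² is 0 by construction. A self-contained sketch on toy numbers (the accuracies below are illustrative, not drawn from the meta-dataset):

```python
# Fidelity metrics for a surrogate meta-model: R^2 and MAE, compared with
# a mean-regressor baseline (which scores R^2 = 0 by definition).

def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy ground-truth pipeline accuracies and surrogate predictions.
y_true = [0.91, 0.55, 0.78, 0.62, 0.88]
y_pred = [0.90, 0.58, 0.75, 0.65, 0.86]

# The mean regressor always predicts the mean of y_true, so its residual
# sum of squares equals the total sum of squares and its R^2 is exactly 0.
baseline = [sum(y_true) / len(y_true)] * len(y_true)
```

A high R² together with a low MAE is what licenses trusting the surrogate's attributions: if the surrogate fit poorly, its explanations would describe the surrogate's errors rather than the pipeline.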
Beyond predictive accuracy, we assessed the attribution validity of our RCA framework through observational analysis of ten representative pipelines—five high-performing and five low-performing—drawn from the test set. Results show that the attributions are concise, with typically fewer than three dominant contributors per case, making them easy to interpret. In success cases, low class overlap and balanced distributions dominate attributions, while failure cases are driven by severe imbalance, harmful interactions, and, in some cases, context-dependent effects such as redundant dense features. Hyperparameter effects emerge as secondary but aggravating under challenging conditions. These findings demonstrate that our RCA framework provides theoretically grounded yet empirically adaptive explanations, enabling robust root cause analysis for trustworthy AI.
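One simple way to operationalize "fewer than three dominant contributors per case" is to keep only the features whose absolute contribution carries at least a fixed share of the total absolute attribution mass. The sketch below is an illustrative selection rule on a toy failure case, not the paper's exact procedure; the threshold and attribution values are assumptions:

```python
# Selecting the dominant contributors from a per-feature attribution map:
# keep features whose absolute contribution accounts for at least `share`
# of the total absolute attribution mass. Threshold and attributions are
# illustrative choices, not the paper's exact procedure.

def dominant_contributors(attributions, share=0.2):
    total = sum(abs(v) for v in attributions.values())
    if total == 0:
        return []
    ranked = sorted(attributions.items(), key=lambda kv: abs(kv[1]),
                    reverse=True)
    return [name for name, v in ranked if abs(v) / total >= share]

# Toy failure-case attributions: severe imbalance and a harmful interaction
# dominate, while a hyperparameter and a redundancy term contribute little.
failure_case = {
    "imbalance_ratio": -0.18,
    "overlap_x_imbalance": -0.09,
    "min_samples_leaf": -0.02,
    "feature_redundancy": -0.01,
}
top = dominant_contributors(failure_case)
```

On this toy case the rule surfaces only the imbalance term and the interaction term, mirroring the observation that explanations stay concise even when many factors contribute marginally.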
Tue 7 Oct (displayed time zone: Eastern Time, US & Canada)
Session: 10:30 - 12:00
- 10:30 (30m, Talk): Model-Driven Root Cause Analysis for Trustworthy AI: A Data-and-Model-Centric Explanation Framework (SAM Conference)
- 11:00 (30m, Talk): A Real-Time Multi-modal Framework for Human-Centric Requirements Engineering in Autonomous Vehicles (SAM Conference). Farzaneh Kargozari (Ontario Tech University, Faculty of Engineering and Applied Science, Electrical, Computer & Software Engineering), Sanaa Alwidian
- 11:30 (30m, Day closing): Closing Ceremony: Final words, Best Paper Award (SAM Conference)