A unified framework for attributing model components, data, and training dynamics to model behavior.
neural-networks interpretability explainability training-dynamics data-attribution model-attribution kernel-machines
-
Updated
Jul 1, 2026 - Python