A Survey of Deep Learning Interpretability Methods: Current Status and Challenges

Suyang Wu

doi:10.23977/acss.2026.100105

Author(s)

Suyang Wu ¹

Affiliation(s)

¹ School of Electronic and Information Engineering, University of Science and Technology Liaoning, Anshan, China

Corresponding Author

Suyang Wu

ABSTRACT

Deep learning models have demonstrated excellent performance in fields such as image recognition, natural language processing, and medical diagnosis. However, their complex network structures and nonlinear mappings make them behave as "black boxes", which restricts their reliable application in high-risk domains. This paper systematically reviews the core value and development history of deep learning interpretability research and classifies existing interpretability methods into three major categories: feature visualization-based methods, model decomposition-based methods, and causal inference-based methods. For each category, it analyzes the core principles, applicable scenarios, advantages, and disadvantages. It then discusses the requirements for interpretability in high-risk fields such as medical care and finance, and finally considers potential breakthrough directions, such as integrating causal and statistical interpretation frameworks. The paper aims to provide a comprehensive overview of the current status of deep learning interpretability research, offer guidance on its future directions, and help promote the trustworthy development of deep learning models.

KEYWORDS

Deep Learning; Interpretability; Black Box Problem; Causal Inference; High-Risk Domains

CITE THIS PAPER

Suyang Wu. A Survey of Deep Learning Interpretability Methods: Current Status and Challenges. Advances in Computer, Signals and Systems (2026) Vol. 10: 39-46. DOI: http://dx.doi.org/10.23977/acss.2026.100105.
