Building Intrinsically Interpretable Deep Neural Networks: A Survey

Wenyan Zhang, Lianmeng Jiao, Quan Pan

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Deep neural networks have achieved remarkable success in fields such as image classification and object detection, sometimes even outperforming humans, but their black-box nature limits their application in areas where the reasons for decisions need to be known. The increasing demand for more transparent and reliable models has led to the emergence of explainable machine learning, and a growing number of researchers have turned their attention to the interpretability of deep neural networks, seeking to understand a model's inference process by investigating the black-box properties of the network. Based on the stage at which explanations are generated, interpretable neural networks can be broadly classified into two categories: post-hoc interpretable models and intrinsically interpretable models. In recent years there have been numerous studies on interpretable neural networks, yet there is still no unified classification and summary of how intrinsically interpretable networks are constructed. In this paper, we review several typical approaches to building intrinsically interpretable neural networks for image classification that have been proposed in recent years, classify them according to the way they achieve interpretability, and summarize the strengths and weaknesses of each type of approach. Furthermore, we provide an outlook on future developments in this field.

Original language: English
Title of host publication: Proceedings - 2024 China Automation Congress, CAC 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 6835-6842
Number of pages: 8
ISBN (Electronic): 9798350368604
DOIs
State: Published - 2024
Event: 2024 China Automation Congress, CAC 2024 - Qingdao, China
Duration: 1 Nov 2024 - 3 Nov 2024

Publication series

Name: Proceedings - 2024 China Automation Congress, CAC 2024

Conference

Conference: 2024 China Automation Congress, CAC 2024
Country/Territory: China
City: Qingdao
Period: 1/11/24 - 3/11/24

Keywords

  • explainable AI
  • image classification
  • interpretable neural networks
