TY - JOUR
T1 - VNAS: Variational Neural Architecture Search
T2 - International Journal of Computer Vision
AU - Ma, Benteng
AU - Zhang, Jing
AU - Xia, Yong
AU - Tao, Dacheng
N1 - Publisher Copyright:
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.
PY - 2024/9
Y1 - 2024/9
AB - Differentiable neural architecture search delivers a point estimate of the optimal architecture, which assigns arbitrarily high confidence to the learned architecture and thus suffers in calibration and robustness compared with the maximum a posteriori estimation scheme. In this paper, we propose a novel Variational Neural Architecture Search (VNAS) method that estimates and exploits weight variability in three steps. VNAS first learns the weight distribution through variational inference, maximizing the evidence lower bound on the marginal likelihood of the architecture with unbiased Monte Carlo gradient estimates. A group of optimal architecture candidates is then drawn from the learned weight distribution under a complexity constraint. The optimal architecture is finally inferred with a novel training-free architecture-performance estimator that scores network architectures at initialization, without training, which significantly reduces the computational cost of architecture selection. Extensive experiments show that VNAS significantly outperforms state-of-the-art methods in classification performance and adversarial robustness.
KW - Image classification
KW - Neural architecture search
KW - Neural network
UR - http://www.scopus.com/inward/record.url?scp=85191090122&partnerID=8YFLogxK
DO - 10.1007/s11263-024-02014-w
M3 - Article
AN - SCOPUS:85191090122
SN - 0920-5691
VL - 132
SP - 3689
EP - 3713
JO - International Journal of Computer Vision
JF - International Journal of Computer Vision
IS - 9
ER -
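Note: the abstract's first step, learning a weight distribution by variational inference with unbiased Monte Carlo gradient estimates, corresponds to standard reparameterization-trick training of a Gaussian variational posterior. The sketch below is illustrative only, not the authors' code: the single linear layer, the standard-normal prior, the names mu/rho, and the KL scaling factor are all assumptions made for demonstration.

```python
# Minimal sketch: one optimization step of variational weight learning with
# an unbiased Monte Carlo gradient via the reparameterization trick.
# Illustrative only; not the VNAS implementation.
import torch
import torch.nn.functional as F

# Hypothetical shapes: a single linear layer scoring 10 classes.
in_dim, n_classes = 64, 10
mu = torch.zeros(n_classes, in_dim, requires_grad=True)          # variational mean
rho = torch.full((n_classes, in_dim), -3.0, requires_grad=True)  # pre-softplus std

opt = torch.optim.Adam([mu, rho], lr=1e-2)

def elbo_loss(x, y):
    sigma = F.softplus(rho)          # ensure a positive standard deviation
    eps = torch.randn_like(sigma)    # reparameterization noise
    w = mu + sigma * eps             # sampled weights, differentiable in (mu, rho)
    nll = F.cross_entropy(x @ w.t(), y)  # 1-sample MC estimate of expected NLL
    # Closed-form KL(q(w) || N(0, I)) for a diagonal Gaussian posterior.
    kl = 0.5 * (sigma.pow(2) + mu.pow(2) - 1.0 - 2.0 * sigma.log()).sum()
    return nll + kl / 1000.0         # KL weighted by an assumed dataset size

x, y = torch.randn(32, in_dim), torch.randint(0, n_classes, (32,))
opt.zero_grad()
loss = elbo_loss(x, y)
loss.backward()                      # unbiased gradients w.r.t. mu and rho
opt.step()
```

Because the noise eps is drawn independently of (mu, rho), the gradient of this single-sample loss is an unbiased estimate of the gradient of the (negative) evidence lower bound, which is the property the abstract refers to.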