
UQ-ViT: Harmonizing Extreme Activations with Hardware-Friendly Uniform Quantization in Vision Transformers

  • Northwestern Polytechnical University Xian
  • Chongqing University of Posts and Telecommunications

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Post-Training Quantization (PTQ) enables efficient deployment of Vision Transformers (ViTs) with only a small calibration dataset, and its prevalent use of uniform quantization harnesses the matrix cores of AI accelerators for high-speed inference. However, the application of uniform quantization is fundamentally challenged by the extreme non-uniformity of activation distributions. Specifically, the power-law nature of post-Softmax attention scores and the significant inter-channel variance in post-GELU activations create a dilemma for conventional quantization, which struggles to preserve critical high-magnitude values without sacrificing overall precision. To resolve this core conflict, we introduce UQ-ViT (Uniform Quantization for Vision Transformers), a novel uniform quantization framework designed to reconcile high precision with hardware efficiency. Central to UQ-ViT are two operators: Dynamic Elimination of Maximum (DeMax) and Normalization Quantization (NormQuant). DeMax is a quantization operator for post-Softmax attention scores that utilizes uniform quantization. It dynamically eliminates and preserves dominant values, effectively mitigating quantization loss from the extreme values in the power-law distribution. NormQuant utilizes a per-channel quantization strategy during quantization and reverts to a per-tensor format for dequantization, achieving both high accuracy and computational efficiency. Crucially, it is applicable to any linear layer, enabling effective quantization of post-GELU activations in ViTs. Through extensive experiments on various ViTs and vision tasks, including image classification, object detection, and instance segmentation, we demonstrate that our proposed approach outperforms existing methods, achieving superior accuracy while ensuring hardware friendliness.
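The dilemma the abstract describes can be illustrated with a toy example. The sketch below is not DeMax or NormQuant, whose details the abstract does not give; it is a plain unsigned uniform quantizer, and the 4-bit width, the sample values, and all function names are assumptions chosen for illustration. It shows (1) a single dominant attention score forcing every small score to round to zero, and (2) a per-tensor scale erasing a low-magnitude channel that a per-channel scale would preserve.

```python
# Illustrative sketch only: a generic unsigned uniform quantizer with
# zero-point 0. All names and values here are hypothetical.

def uniform_quantize(values, scale, bits=4):
    """Map each value to an integer code round(v / scale), clipped to [0, 2^bits - 1]."""
    qmax = 2 ** bits - 1
    return [min(max(round(v / scale), 0), qmax) for v in values]

def dequantize(codes, scale):
    """Map integer codes back to real values."""
    return [c * scale for c in codes]

def mean_abs_error(xs, ys):
    return sum(abs(a - b) for a, b in zip(xs, ys)) / len(xs)

BITS = 4
QMAX = 2 ** BITS - 1

# Case 1: a power-law-like attention row, one dominant score and many small ones.
attn_row = [0.9] + [0.01] * 10
attn_scale = max(attn_row) / QMAX          # scale must cover the dominant value
attn_codes = uniform_quantize(attn_row, attn_scale, BITS)
# The dominant score takes the top code; every small score rounds to code 0.

# Case 2: two channels with very different magnitudes.
small_channel = [0.01, 0.02, 0.03]
large_channel = [10.0, 20.0, 30.0]

# Per-tensor: one scale shared by both channels, set by the largest value.
tensor_scale = max(max(small_channel), max(large_channel)) / QMAX
per_tensor_recon = dequantize(
    uniform_quantize(small_channel, tensor_scale, BITS), tensor_scale)
per_tensor_err = mean_abs_error(small_channel, per_tensor_recon)

# Per-channel: the small channel gets its own scale.
channel_scale = max(small_channel) / QMAX
per_channel_recon = dequantize(
    uniform_quantize(small_channel, channel_scale, BITS), channel_scale)
per_channel_err = mean_abs_error(small_channel, per_channel_recon)

print("attention codes:", attn_codes)
print("small-channel error, per-tensor :", per_tensor_err)
print("small-channel error, per-channel:", per_channel_err)
```

Per-channel scales recover the small channel, but they complicate the integer matrix multiply on hardware that expects a single scale per tensor. This is the tension between accuracy and hardware friendliness that the abstract refers to.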

Original language: English
Title of host publication: Proceedings of the AAAI Conference on Artificial Intelligence
Editors: Sven Koenig, Chad Jenkins, Matthew E. Taylor
Publisher: Association for the Advancement of Artificial Intelligence
Pages: 22354-22362
Number of pages: 9
Edition: 27
ISBN (Print): 9781577359067
DOIs
State: Published - 2026
Event: 40th AAAI Conference on Artificial Intelligence, AAAI 2026 - Singapore, Singapore
Duration: 20 Jan 2026 to 27 Jan 2026

Publication series

Name: Proceedings of the AAAI Conference on Artificial Intelligence
Number: 27
Volume: 40
ISSN (Print): 2159-5399
ISSN (Electronic): 2374-3468

Conference

Conference: 40th AAAI Conference on Artificial Intelligence, AAAI 2026
Country/Territory: Singapore
City: Singapore
Period: 20/01/26 to 27/01/26
