
Linearized Relative Positional Encoding

  • Zhen Qin
  • Weixuan Sun
  • Kaiyue Lu
  • Hui Deng
  • Dongxu Li
  • Xiaodong Han
  • Yuchao Dai
  • Lingpeng Kong
  • Yiran Zhong
  • Australian National University
  • Northwestern Polytechnical University
  • The University of Hong Kong
  • Shanghai Artificial Intelligence Laboratory

Research output: Journal article, peer-reviewed

7 citations (Scopus)

Abstract

Relative positional encoding is widely used in vanilla and linear transformers to represent positional information. However, existing encoding methods of a vanilla transformer are not always directly applicable to a linear transformer, because the latter requires a decomposition of the query and key representations into separate kernel functions. Nevertheless, principles for designing encoding methods suitable for linear transformers remain understudied. In this work, we put together a variety of existing linear relative positional encoding approaches under a canonical form and further propose a family of linear relative positional encoding algorithms via unitary transformation. Our formulation leads to a principled framework that can be used to develop new relative positional encoding methods that preserve linear space-time complexity. Equipped with different models, the proposed linearized relative positional encoding (LRPE) family derives effective encoding for various applications. Experiments show that compared with existing methods, LRPE achieves state-of-the-art performance in language modeling, text classification, and image classification. Meanwhile, it emphasizes a general paradigm for designing broadly more relative positional encoding methods that are applicable to linear transformers.
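
To make the decomposition requirement in the abstract concrete, here is a minimal NumPy sketch of one possible unitary positional transform. It is not the authors' implementation. The function names, the `1 + elu` feature map, the diagonal complex rotation `exp(i * t * theta)`, and the omission of the attention normalizer are all assumptions chosen for illustration. The idea it shows: if the features at position t are multiplied by a unitary matrix W_t with W_s^H W_t depending only on t - s, then the query and key sides stay separable, so the key-value summary can be accumulated once and the n x n score matrix is never formed.

```python
import numpy as np

def phi(x):
    """Assumed kernel feature map (1 + elu); any non-negative map would do here."""
    return np.where(x > 0, x + 1.0, np.exp(x))

def unitary_pe(x, theta):
    """Multiply the features at position t by diag(exp(i * t * theta))."""
    n = x.shape[0]
    pos = np.arange(n)[:, None]              # (n, 1) position indices
    return x * np.exp(1j * pos * theta)      # (n, d), complex-valued

def relative_scores(q, k, theta):
    """Explicit n x n scores: sum_j q[s, j] * k[t, j] * cos((t - s) * theta[j])."""
    qp, kp = unitary_pe(q, theta), unitary_pe(k, theta)
    return np.real(np.conj(qp) @ kp.T)

def linear_attention(q, k, v, theta):
    """Same result as relative_scores(q, k, theta) @ v, without the n x n matrix.

    Non-causal and unnormalized (both are simplifications of this sketch).
    """
    qp, kp = unitary_pe(q, theta), unitary_pe(k, theta)
    kv = kp.T @ v                            # (d, d_v) summary, cost O(n * d * d_v)
    return np.real(np.conj(qp) @ kv)         # (n, d_v)

# Small usage example with random inputs.
rng = np.random.default_rng(0)
n, d, d_v = 6, 4, 3
q = phi(rng.normal(size=(n, d)))
k = phi(rng.normal(size=(n, d)))
v = rng.normal(size=(n, d_v))
theta = rng.uniform(0.0, np.pi, size=d)      # assumed per-dimension frequencies
out = linear_attention(q, k, v, theta)
```

Because the positional factor enters only through `exp(i * (t - s) * theta)`, the score between positions s and t depends on their offset rather than their absolute values, while the cost of `linear_attention` grows linearly with the sequence length n.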

Original language: English
Journal: Transactions on Machine Learning Research
2023
Publication status: Published - 1 Sep 2023
Externally published
