Skip to main navigation Skip to search Skip to main content

KSS-MoE: Knowledge Space Synergy Framework in Mixture of Experts for Continual Visual Instruction Tuning

  • Lingyun Song
  • , Ziyao Chen
  • , Kang Pan
  • , Xiaolin Han
  • , Xinbiao Gan
  • , Yudai Pan
  • , Xiaofan Sun
  • , Xiaoqi Wang
  • , Xuequn Shang
  • Northwestern Polytechnical University Xian
  • Zhejiang Normal University
  • National University of Defense Technology

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

Multimodal Large Language Models (MLLMs) employing the Mixture-of-Experts (MoE) structure exhibit encouraging results in visual language tasks. However, they struggle with catastrophic forgetting due to a lack of effective collaboration among experts and negative transfer across tasks. This happens because the router typically employed in MoE for managing expert assignments is inadequate when there are significant shifts in data distribution across various tasks. A drop in the effectiveness of earlier tasks is caused by negative transfer, which occurs due to conflicts in shared knowledge between tasks, disturbing the knowledge already acquired. To address these issues, we propose the Knowledge Space Synergy Framework in Mixture of Experts (KSS-MoE) for Continual Visual Instruction Tuning (CVIT). It dynamically combines the knowledge subspaces of experts to improve the integration of fine-grained complementary knowledge and collaborative abilities of experts, thus addressing the limitations of the basic router. Furthermore, we introduce a general expert that maintains orthogonal subspaces for shared knowledge, enabling effective cross-task knowledge utilization while reducing negative transfer. Extensive experiments conducted on eight CVIT tasks confirm the excellence of KSS-MoE, showcasing its top-tier performance.

Original languageEnglish
Title of host publicationProceedings of the AAAI Conference on Artificial Intelligence
EditorsSven Koenig, Chad Jenkins, Matthew E. Taylor
PublisherAssociation for the Advancement of Artificial Intelligence
Pages25536-25544
Number of pages9
Edition30
ISBN (Print)9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067, 9781577359067
DOIs
StatePublished - 2026
Event40th AAAI Conference on Artificial Intelligence, AAAI 2026 - Singapore, Singapore
Duration: 20 Jan 202627 Jan 2026

Publication series

NameProceedings of the AAAI Conference on Artificial Intelligence
Number30
Volume40
ISSN (Print)2159-5399
ISSN (Electronic)2374-3468

Conference

Conference40th AAAI Conference on Artificial Intelligence, AAAI 2026
Country/TerritorySingapore
CitySingapore
Period20/01/2627/01/26

Fingerprint

Dive into the research topics of 'KSS-MoE: Knowledge Space Synergy Framework in Mixture of Experts for Continual Visual Instruction Tuning'. Together they form a unique fingerprint.

Cite this