Review of large vision models and visual prompt engineering

Jiaqi Wang; Zhengliang Liu; Lin Zhao; Zihao Wu; Chong Ma; Sigang Yu; Haixing Dai; Qiushi Yang; Yiheng Liu; Songyao Zhang; Enze Shi; Yi Pan; Tuo Zhang; Dajiang Zhu; Xiang Li; Xi Jiang; Bao Ge; Yixuan Yuan; Dinggang Shen; Tianming Liu; Shu Zhang

doi:10.1016/j.metrad.2023.100047


Research output: Contribution to journal › Review article › peer-review

75 Citations (Scopus)

Abstract

Visual prompt engineering is a fundamental methodology in the field of visual and image artificial general intelligence. As the development of large vision models progresses, the importance of prompt engineering becomes increasingly evident. Designing suitable prompts for specific visual tasks has emerged as a meaningful research direction. This review aims to summarize the methods employed in the computer vision domain for large vision models and visual prompt engineering, exploring the latest advancements in visual prompt engineering. We present influential large models in the visual domain and a range of prompt engineering methods employed on these models. It is our hope that this review provides a comprehensive and systematic description of prompt engineering methods based on large visual models, offering valuable insights for future researchers in their exploration of this field.

Original language	English
Article number	100047
Journal	Meta-Radiology
Volume	1
Issue number	3
DOI	https://doi.org/10.1016/j.metrad.2023.100047
Publication status	Published - Nov 2023


Cite this

@article{7716da0c12d348d8b12ae670123bd712,

title = "Review of large vision models and visual prompt engineering",

abstract = "Visual prompt engineering is a fundamental methodology in the field of visual and image artificial general intelligence. As the development of large vision models progresses, the importance of prompt engineering becomes increasingly evident. Designing suitable prompts for specific visual tasks has emerged as a meaningful research direction. This review aims to summarize the methods employed in the computer vision domain for large vision models and visual prompt engineering, exploring the latest advancements in visual prompt engineering. We present influential large models in the visual domain and a range of prompt engineering methods employed on these models. It is our hope that this review provides a comprehensive and systematic description of prompt engineering methods based on large visual models, offering valuable insights for future researchers in their exploration of this field.",

keywords = "Artificial general intelligence, Vision models, Visual prompt",

author = "Jiaqi Wang and Zhengliang Liu and Lin Zhao and Zihao Wu and Chong Ma and Sigang Yu and Haixing Dai and Qiushi Yang and Yiheng Liu and Songyao Zhang and Enze Shi and Yi Pan and Tuo Zhang and Dajiang Zhu and Xiang Li and Xi Jiang and Bao Ge and Yixuan Yuan and Dinggang Shen and Tianming Liu and Shu Zhang",

note = "Publisher Copyright: {\textcopyright} 2023 The Authors",

year = "2023",

month = nov,

doi = "10.1016/j.metrad.2023.100047",

language = "English",

volume = "1",

journal = "Meta-Radiology",

issn = "2950-1628",

publisher = "KeAi Publishing Communications Ltd.",

number = "3",

}

TY - JOUR

T1 - Review of large vision models and visual prompt engineering

AU - Wang, Jiaqi

AU - Liu, Zhengliang

AU - Zhao, Lin

AU - Wu, Zihao

AU - Ma, Chong

AU - Yu, Sigang

AU - Dai, Haixing

AU - Yang, Qiushi

AU - Liu, Yiheng

AU - Zhang, Songyao

AU - Shi, Enze

AU - Pan, Yi

AU - Zhang, Tuo

AU - Zhu, Dajiang

AU - Li, Xiang

AU - Jiang, Xi

AU - Ge, Bao

AU - Yuan, Yixuan

AU - Shen, Dinggang

AU - Liu, Tianming

AU - Zhang, Shu

PY - 2023/11

Y1 - 2023/11

N2 - Visual prompt engineering is a fundamental methodology in the field of visual and image artificial general intelligence. As the development of large vision models progresses, the importance of prompt engineering becomes increasingly evident. Designing suitable prompts for specific visual tasks has emerged as a meaningful research direction. This review aims to summarize the methods employed in the computer vision domain for large vision models and visual prompt engineering, exploring the latest advancements in visual prompt engineering. We present influential large models in the visual domain and a range of prompt engineering methods employed on these models. It is our hope that this review provides a comprehensive and systematic description of prompt engineering methods based on large visual models, offering valuable insights for future researchers in their exploration of this field.

AB - Visual prompt engineering is a fundamental methodology in the field of visual and image artificial general intelligence. As the development of large vision models progresses, the importance of prompt engineering becomes increasingly evident. Designing suitable prompts for specific visual tasks has emerged as a meaningful research direction. This review aims to summarize the methods employed in the computer vision domain for large vision models and visual prompt engineering, exploring the latest advancements in visual prompt engineering. We present influential large models in the visual domain and a range of prompt engineering methods employed on these models. It is our hope that this review provides a comprehensive and systematic description of prompt engineering methods based on large visual models, offering valuable insights for future researchers in their exploration of this field.

KW - Artificial general intelligence

KW - Vision models

KW - Visual prompt

UR - http://www.scopus.com/inward/record.url?scp=85203198466&partnerID=8YFLogxK

U2 - 10.1016/j.metrad.2023.100047

DO - 10.1016/j.metrad.2023.100047

M3 - Review article

AN - SCOPUS:85203198466

SN - 2950-1628

VL - 1

JO - Meta-Radiology

JF - Meta-Radiology

IS - 3

M1 - 100047

ER -
