
医学影像中的生成技术

Translated title of the contribution: Application of content generation in medical images
  • Northwestern Polytechnical University Xian

Research output: Contribution to journal › Article › peer-review

Abstract

Medical imaging, a crucial tool in medical practice, uses various imaging techniques to capture the internal structure and function of the human body. Common types of medical images include magnetic resonance imaging (MRI), computed tomography (CT), positron emission tomography (PET), plain X-rays, ultrasound, and optical imaging. The information obtained from these images varies because their imaging principles differ. For example, MRI uses a strong magnetic field and radio waves to image the inside of the body and provides good soft-tissue information. CT uses X-rays and computerized processing to create cross-sectional images of internal body structures and is primarily used to image high-electron-density tissues (e.g., bone); however, its soft-tissue contrast is somewhat limited. PET uses radioisotope-labeled tracers to observe biological processes and functional activity within the body, imaging specific biological functions. In addition, images of the same imaging type may belong to different subtypes depending on the imaging parameters and tracers, such as T1- and T2-weighted MRI sequences, FDG-PET, and Aβ-PET. These medical imaging techniques provide visual information about the anatomical, physiological, and pathological states of the human body and play an important role in disease diagnosis, treatment, and prognosis prediction. Medical images of the same type or subtype are referred to as single-modality, and sets of medical images that span different modalities are referred to as multimodal. Because different types or subtypes of medical images reflect different information about the patient's body, multiple types or subtypes are often acquired to obtain more comprehensive information and improve diagnostic accuracy. However, multimodal image acquisition faces difficulties such as long acquisition time, high cost, and a possible increase in radiation dose.
Therefore, generative techniques should be used for cross-modal medical image synthesis, that is, using medical images of one or more modalities to generate medical images of one or more other modalities. Although cross-modal medical image synthesis can facilitate multimodal image diagnosis, several technical challenges exist. For example, because imaging principles differ across modalities, some information that can be captured in the target modality does not exist in the source modality. In this case, synthesized images of the target modality lack certain information, creating significant disparities in diagnostic performance between synthesized and real images, which can lead to failure in clinical use. At the same time, privacy and ethical issues contribute to the high cost of acquiring high-quality multimodal medical image data and to the problem of missing data in cross-modal medical image synthesis. In addition, differences in resolution, contrast, and image quality between modalities affect the consistency of image generation models during the generation process. Addressing these inconsistencies in data across modalities poses a significant challenge for cross-modal medical image synthesis. The computational complexity and generalization ability of the model also need to be considered. Cross-modal medical image synthesis often requires complex models and substantial computational resources, which may limit the usefulness and scalability of these methods. Beyond the training data that the model has already seen, whether the model performs well on new or different datasets should also be considered. Most researchers start from the model itself and improve the quality of the synthesized images by improving the representation ability of the model or designing task-specific constraints.
The resulting cross-modal medical image synthesis techniques have been applied to image acquisition, reconstruction, alignment, segmentation, detection, and diagnosis, bringing new ideas and methods to many problems. This paper focuses on cross-modal image synthesis techniques and applications in the field of medical imaging. We introduce existing cross-modal medical image synthesis techniques from three aspects: traditional synthesis methods, deep learning-based synthesis methods, and task-driven synthesis methods. Traditional synthesis methods usually divide an image into multiple small blocks and encode each block into a representation vector. They then establish a mapping between the paired block representation vectors of different modalities and generate the corresponding target-modality block from the encoding of the source-modality block. The random forest-based approach treats image synthesis as a regression problem, assuming that the value of the target-modality block or its central region is the dependent variable of the source-modality block, so that the relationship can be learned by a regression model. Dictionary learning-based methods assume that a dictionary exists for each modality, that each image block can be obtained from a sparse representation of the elements in the dictionary, and that corresponding image blocks in different modalities share the same dictionary encoding. Compared with traditional methods, deep learning-based cross-modal image synthesis methods can directly use large-scale parametric models to build mappings from source-modality images to target-modality images in an end-to-end manner. These methods automatically extract the representation features of an image or an image block in a data-driven manner, without manually designed representation features. Given their ease of implementation and superior performance, deep learning-based techniques have become the leading approach in cross-modal image synthesis.
In this paper, we introduce the deep learning-based methods in four groups: simple CNN-based approaches, encoder-decoder network-based approaches, generative adversarial network-based approaches, and diffusion model-based approaches. Task-oriented cross-modal image synthesis methods hold that synthesis should be biased toward a specific downstream task, and they form this task-specific bias by adding a task-related design on top of a general-purpose technique. In this manner, the synthesized image preserves more information that contributes to the task and achieves better performance on that task. Such synthesis methods are presented in three categories: task-oriented biases, biases formed through network models, and image synthesis embedded in task models. Finally, we present the application scenarios of cross-modal medical image synthesis techniques and the typical tasks in which they are most advantageous.
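
The traditional block-based formulation described in the abstract (pair each source-modality block with the target-modality value at its center, then regress one onto the other) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the images are reduced to 1D intensity lists, a k-nearest-neighbor lookup stands in for the random forest regressor, and the names `extract_patches`, `fit_patch_pairs`, `predict_center`, and `synthesize` are hypothetical.

```python
# Illustrative sketch only: 1D "images" and a k-nearest-neighbor regressor
# stand in for real image blocks and the random forest named in the abstract.

def extract_patches(signal, size):
    """Return every contiguous patch of odd length `size` from a 1D list."""
    if size % 2 == 0:
        raise ValueError("patch size must be odd so each patch has a center")
    half = size // 2
    return [signal[i - half:i + half + 1]
            for i in range(half, len(signal) - half)]

def fit_patch_pairs(source, target, size):
    """Pair each source patch with the target value at the patch center."""
    if len(source) != len(target):
        raise ValueError("source and target must be aligned and equally long")
    half = size // 2
    centers = target[half:len(target) - half]
    return list(zip(extract_patches(source, size), centers))

def predict_center(pairs, patch, k=1):
    """Average the target centers of the k training patches closest to `patch`."""
    def squared_distance(pair):
        return sum((a - b) ** 2 for a, b in zip(pair[0], patch))
    nearest = sorted(pairs, key=squared_distance)[:k]
    return sum(center for _, center in nearest) / len(nearest)

def synthesize(pairs, source, size, k=1):
    """Predict a target-modality value for every full patch in `source`."""
    return [predict_center(pairs, patch, k)
            for patch in extract_patches(source, size)]

# Toy aligned pair: the "target modality" is an affine function of the source.
train_source = [0, 1, 2, 3, 4, 5, 6, 7]
train_target = [2 * s + 1 for s in train_source]

pairs = fit_patch_pairs(train_source, train_target, size=3)
synthetic = synthesize(pairs, train_source, size=3, k=1)
```

With `k=1` and the training signal reused as input, each patch matches itself exactly, so `synthetic` reproduces the target values at the patch centers. A real pipeline would use 2D or 3D blocks, held-out subjects, and a stronger regressor.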

Original language: Chinese (Traditional)
Pages (from-to): 1985-2000
Number of pages: 16
Journal: Journal of Image and Graphics
Volume: 30
Issue number: 6
State: Published - Jun 2025

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being
