Video super-resolution via dense non-local spatial-temporal convolutional network

Research output: Contribution to journal › Article › peer-review


Abstract

In this paper, we present a novel end-to-end deep neural network for video super-resolution. In contrast to most previous methods, where frames must be warped for temporal alignment based on estimated optical flow, we propose short-temporal and bidirectional long-temporal blocks that exploit the spatial-temporal dependencies between frames. These blocks effectively model both sudden and smoothly varying motions in videos and overcome the limitations of explicit motion estimation. In addition, by introducing dense feature concatenation, we provide an effective way to combine low-level and high-level features, boosting the reconstruction of mid/high-frequency information, as shown in our analysis and experiments. Furthermore, we present a region-level non-local feature enhancement structure, which captures the spatial-temporal correlations between any two positions and exploits long-distance relevant information. Extensive evaluations and comparisons with current state-of-the-art approaches demonstrate the effectiveness of the proposed framework.
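The abstract describes the non-local enhancement only at a high level. As a rough, hypothetical sketch of the general idea, the following PyTorch snippet implements a standard embedded-Gaussian non-local block (after Wang et al.'s formulation), not the authors' exact region-level design; the class name and hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NonLocalBlock2D(nn.Module):
    """Generic embedded-Gaussian non-local block (a sketch, not the
    paper's region-level module): it relates any two spatial positions
    of a feature map via a full pairwise affinity matrix."""

    def __init__(self, in_channels, inter_channels=None):
        super().__init__()
        inter_channels = inter_channels or in_channels // 2
        self.theta = nn.Conv2d(in_channels, inter_channels, kernel_size=1)
        self.phi = nn.Conv2d(in_channels, inter_channels, kernel_size=1)
        self.g = nn.Conv2d(in_channels, inter_channels, kernel_size=1)
        self.out = nn.Conv2d(inter_channels, in_channels, kernel_size=1)

    def forward(self, x):
        b, c, h, w = x.shape
        # Project to query/key/value and flatten spatial positions.
        q = self.theta(x).view(b, -1, h * w).permute(0, 2, 1)  # (b, hw, c')
        k = self.phi(x).view(b, -1, h * w)                     # (b, c', hw)
        v = self.g(x).view(b, -1, h * w).permute(0, 2, 1)      # (b, hw, c')
        # Affinity between every pair of positions, then aggregate values.
        attn = F.softmax(torch.matmul(q, k), dim=-1)           # (b, hw, hw)
        y = torch.matmul(attn, v).permute(0, 2, 1).contiguous()
        y = y.view(b, -1, h, w)
        # Residual connection preserves the original features.
        return x + self.out(y)

# Usage: block = NonLocalBlock2D(64); y = block(torch.randn(1, 64, 32, 32))
```

A region-level variant, as the paper suggests, would pool the feature map into patches before computing the affinity matrix, reducing the hw × hw attention cost while still capturing long-distance correlations.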

Original language: English
Pages (from-to): 1-12
Number of pages: 12
Journal: Neurocomputing
Volume: 403
DOIs
State: Published - 25 Aug 2020

Keywords

  • ConvLSTM
  • Dense concatenation
  • Non-local
  • Video super-resolution
