Recent advances of generative adversarial networks in computer vision

YJ Cao, LL Jia, YX Chen, N Lin, C Yang, B Zhang… - IEEE …, 2018 - ieeexplore.ieee.org
The appearance of generative adversarial networks (GAN) provides a new approach and
framework for computer vision. Compared with traditional machine learning algorithms, GAN …

A survey on video moment localization

M Liu, L Nie, Y Wang, M Wang, Y Rui - ACM Computing Surveys, 2023 - dl.acm.org
Video moment localization, also known as video moment retrieval, aims to search a target
segment within a video described by a given natural language query. Beyond the task of …

EEG emotion recognition using fusion model of graph convolutional neural networks and LSTM

Y Yin, X Zheng, B Hu, Y Zhang, X Cui - Applied Soft Computing, 2021 - Elsevier
In recent years, graph convolutional neural networks have become a research focus and
inspired new ideas for emotion recognition based on EEG. Deep learning has been widely …

Deep multimodal representation learning: A survey

W Guo, J Wang, S Wang - IEEE Access, 2019 - ieeexplore.ieee.org
Multimodal representation learning, which aims to narrow the heterogeneity gap among
different modalities, plays an indispensable role in the utilization of ubiquitous multimodal …

On data augmentation for GAN training

NT Tran, VH Tran, NB Nguyen… - … on Image Processing, 2021 - ieeexplore.ieee.org
Recent successes in Generative Adversarial Networks (GAN) have affirmed the importance
of using more data in GAN training. Yet it is expensive to collect data in many domains such …

STAT: Spatial-temporal attention mechanism for video captioning

C Yan, Y Tu, X Wang, Y Zhang, X Hao… - IEEE Transactions on …, 2019 - ieeexplore.ieee.org
Video captioning refers to automatically generating natural language sentences, which
summarize the video contents. Inspired by the visual attention mechanism of human beings …

Beyond RNNs: Positional self-attention with co-attention for video question answering

X Li, J Song, L Gao, X Liu, W Huang, X He… - Proceedings of the AAAI …, 2019 - ojs.aaai.org
Most of the recent progress on visual question answering is based on recurrent neural
networks (RNNs) with attention. Despite the success, these models are often time-consuming …

Hierarchical LSTMs with adaptive attention for visual captioning

L Gao, X Li, J Song, HT Shen - IEEE Transactions on Pattern …, 2019 - ieeexplore.ieee.org
Recent progress has been made in using attention based encoder-decoder framework for
image and video captioning. Most existing decoders apply the attention mechanism to every …

Exploiting subspace relation in semantic labels for cross-modal hashing

HT Shen, L Liu, Y Yang, X Xu, Z Huang… - … on Knowledge and …, 2020 - ieeexplore.ieee.org
Hashing methods have been extensively applied to efficient multimedia data indexing and
retrieval on account of the explosion of multimedia data. Cross-modal hashing usually …

Object-aware aggregation with bidirectional temporal graph for video captioning

J Zhang, Y Peng - Proceedings of the IEEE/CVF conference …, 2019 - openaccess.thecvf.com
Video captioning aims to automatically generate natural language descriptions of video
content, which has drawn a lot of attention in recent years. Generating accurate and fine …