Recent advances of generative adversarial networks in computer vision
YJ Cao, LL Jia, YX Chen, N Lin, C Yang, B Zhang… - IEEE …, 2018 - ieeexplore.ieee.org
The appearance of generative adversarial networks (GAN) provides a new approach and
framework for computer vision. Compared with traditional machine learning algorithms, GAN …
A survey on video moment localization
Video moment localization, also known as video moment retrieval, aims to search a target
segment within a video described by a given natural language query. Beyond the task of …
EEG emotion recognition using fusion model of graph convolutional neural networks and LSTM
In recent years, graph convolutional neural networks have become a research focus and
inspired new ideas for emotion recognition based on EEG. Deep learning has been widely …
Deep multimodal representation learning: A survey
W Guo, J Wang, S Wang - IEEE Access, 2019 - ieeexplore.ieee.org
Multimodal representation learning, which aims to narrow the heterogeneity gap among
different modalities, plays an indispensable role in the utilization of ubiquitous multimodal …
On data augmentation for GAN training
Recent successes in Generative Adversarial Networks (GAN) have affirmed the importance
of using more data in GAN training. Yet it is expensive to collect data in many domains such …
STAT: Spatial-temporal attention mechanism for video captioning
Video captioning refers to automatically generating natural language sentences, which
summarize the video contents. Inspired by the visual attention mechanism of human beings …
Beyond RNNs: Positional self-attention with co-attention for video question answering
Most of the recent progress on visual question answering is based on recurrent neural
networks (RNNs) with attention. Despite the success, these models are often time-consuming …
Hierarchical LSTMs with adaptive attention for visual captioning
Recent progress has been made in using attention based encoder-decoder framework for
image and video captioning. Most existing decoders apply the attention mechanism to every …
Exploiting subspace relation in semantic labels for cross-modal hashing
Hashing methods have been extensively applied to efficient multimedia data indexing and
retrieval on account of the explosion of multimedia data. Cross-modal hashing usually …
Object-aware aggregation with bidirectional temporal graph for video captioning
Video captioning aims to automatically generate natural language descriptions of video
content, which has drawn a lot of attention in recent years. Generating accurate and fine …