Recent advances in convolutional neural networks

J Gu, Z Wang, J Kuen, L Ma, A Shahroudy, B Shuai… - Pattern Recognition, 2018 - Elsevier
In the last few years, deep learning has led to very good performance on a variety of
problems, such as visual recognition, speech recognition and natural language processing …

Deep learning for retail product recognition: Challenges and techniques

Y Wei, S Tran, S Xu, B Kang… - Computational …, 2020 - Wiley Online Library
Taking time to identify expected products and waiting for the checkout in a retail store are
common scenes we all encounter in our daily lives. The realization of automatic product …

With a little help from my friends: Nearest-neighbor contrastive learning of visual representations

D Dwibedi, Y Aytar, J Tompson… - Proceedings of the …, 2021 - openaccess.thecvf.com
Self-supervised learning algorithms based on instance discrimination train encoders to be
invariant to pre-defined transformations of the same instance. While most methods treat …

A bottom-up clustering approach to unsupervised person re-identification

Y Lin, X Dong, L Zheng, Y Yan, Y Yang - … of the AAAI conference on artificial …, 2019 - aaai.org
Most person re-identification (re-ID) approaches are based on supervised learning, which
requires intensive manual annotation for training data. However, it is not only …

Scaling and benchmarking self-supervised visual representation learning

P Goyal, D Mahajan, A Gupta… - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com
Self-supervised learning aims to learn representations from the data itself without explicit
manual supervision. Existing efforts ignore a crucial aspect of self-supervised learning - the …

Interpretable convolutional neural networks

Q Zhang, YN Wu, SC Zhu - Proceedings of the IEEE …, 2018 - openaccess.thecvf.com
This paper proposes a method to modify a traditional convolutional neural network (CNN)
into an interpretable CNN, in order to clarify knowledge representations in high conv-layers …

InLoc: Indoor visual localization with dense matching and view synthesis

H Taira, M Okutomi, T Sattler… - Proceedings of the …, 2018 - openaccess.thecvf.com
We seek to predict the 6 degree-of-freedom (6DoF) pose of a query photograph with respect
to a large indoor 3D map. The contributions of this work are three-fold. First, we develop a …

Training region-based object detectors with online hard example mining

A Shrivastava, A Gupta, R Girshick - Proceedings of the IEEE …, 2016 - cv-foundation.org
The field of object detection has made significant advances riding on the wave of
region-based ConvNets, but their training procedure still includes many heuristics and …

Unsupervised representation learning by sorting sequences

HY Lee, JB Huang, M Singh… - Proceedings of the IEEE …, 2017 - openaccess.thecvf.com
We present an unsupervised representation learning approach using videos without
semantic labels. We leverage the temporal coherence as a supervisory signal by formulating …

Shuffle and learn: unsupervised learning using temporal order verification

I Misra, CL Zitnick, M Hebert - … , The Netherlands, October 11–14, 2016 …, 2016 - Springer
In this paper, we present an approach for learning a visual representation from the raw
spatiotemporal signals in videos. Our representation is learned without supervision from …