An overview of cross-media retrieval: Concepts, methodologies, benchmarks, and challenges
Multimedia retrieval plays an indispensable role in big data utilization. Past efforts mainly
focused on single-media retrieval. However, the requirements of users are highly flexible …
A comprehensive survey on cross-modal retrieval
In recent years, cross-modal retrieval has drawn much attention due to the rapid growth of
multimodal data. It takes one type of data as the query to retrieve relevant data of another …
YouTube-BoundingBoxes: A large high-precision human-annotated data set for object detection in video
We introduce a new large-scale data set of video URLs with densely-sampled object
bounding box annotations called YouTube-BoundingBoxes (YT-BB). The data set consists …
A survey of multi-view representation learning
Recently, multi-view representation learning has become a rapidly growing direction in
machine learning and data mining areas. This paper introduces two categories for multi …
Framing image description as a ranking task: Data, models and evaluation metrics
M Hodosh, P Young, J Hockenmaier - Journal of Artificial Intelligence …, 2013 - jair.org
The ability to associate images with natural language sentences that describe what is
depicted in them is a hallmark of image understanding, and a prerequisite for applications …
Adaptive subgradient methods for online learning and stochastic optimization
We present a new family of subgradient methods that dynamically incorporate knowledge of
the geometry of the data observed in earlier iterations to perform more informative gradient …
A multi-view embedding space for modeling internet images, tags, and their semantics
This paper investigates the problem of modeling Internet images and associated text or tags
for tasks such as image-to-image search, tag-to-image search, and image-to-tag search …
A survey of approaches and trends in person re-identification
A Bedagkar-Gala, SK Shah - Image and vision computing, 2014 - Elsevier
Person re-identification is a fundamental task in automated video surveillance and has been
an area of intense research in the past few years. Given an image/video of a person taken …
Local binary patterns and its application to facial image analysis: a survey
Local binary pattern (LBP) is a nonparametric descriptor, which efficiently summarizes the
local structures of images. In recent years, it has aroused increasing interest in many areas …
Predicting visual features from text for image and video caption retrieval
This paper strives to find amidst a set of sentences the one best describing the content of a
given image or video. Different from existing works, which rely on a joint subspace for their …