Fusing multimodal data using recurrent neural networks
Embodiments relate to a system, program product, and method for employing deep learning
techniques to fuse data across modalities. A multi-modal data set is received, including a …
Method for video recognition capable of encoding spatial and temporal relationships of concepts using contextual features
JB Santos, VHC De Melo, WR Schwartz… - US Patent …, 2022 - Google Patents
The proposed invention aims at encoding contextual information for video analysis and
understanding, by encoding spatial and temporal relationships of objects and the main …