ASIF: Coupled data turns unimodal models to multimodal without training

A Norelli, M Fumero, V Maiorca… - Advances in …, 2023 - proceedings.neurips.cc
CLIP proved that aligning visual and language spaces is key to solving many vision tasks
without explicit training, but it required training image and text encoders from scratch on a huge …

Learning by Self-Explaining

W Stammer, F Friedrich, D Steinmann, H Shindo… - arXiv preprint arXiv …, 2023 - arxiv.org
Artificial intelligence (AI) research has a long track record of drawing inspiration from
findings in biology, in particular human intelligence. In contrast to current AI research that …