FedVQA: Personalized Federated Visual Question Answering over Heterogeneous Scenes
This paper presents a new setting for visual question answering (VQA) called personalized
federated VQA (FedVQA) that addresses the growing need for decentralization and data …
Safeguarding data in multimodal AI: A differentially private approach to CLIP training
The surge in multimodal AI's success has sparked concerns over data privacy in vision-and-
language tasks. While CLIP has revolutionized multimodal learning through joint training on …
Exploring deep learning for multimodal understanding
M Lao - 2023 - scholarlypublications …
[14] Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T.,
Louf, R., Funtowicz, M., et al.: Transformers: State-of-the-art natural language processing. In …