MAC: Masked Contrastive Pre-Training for Efficient Video-Text Retrieval
We present a simple yet effective end-to-end Video-language Pre-training (VidLP)
framework, Masked Contrastive Video-language Pre-training (MAC), for video-text retrieval …