CTAL: Pre-training Cross-modal Transformer for Audio-and-Language Representations

H Li, Y Kang, T Liu, W Ding, Z Liu - arXiv preprint arXiv:2109.00181, 2021 - arxiv.org
Existing audio-language task-specific predictive approaches focus on building complicated
late-fusion mechanisms. However, these models are facing challenges of overfitting with …

CTAL: Pre-training Cross-modal Transformer for Audio-and-Language Representations

H Li, W Ding, Y Kang, T Liu, Z Wu, Z Liu - Proceedings of the 2021 …, 2021 - aclanthology.org
Existing audio-language task-specific predictive approaches focus on building complicated
late-fusion mechanisms. However, these models are facing challenges of overfitting with …
