Summary-oriented vision modeling for multimodal abstractive summarization

Y Liang, F Meng, J Xu, J Wang, Y Chen… - arXiv preprint arXiv …, 2022 - arxiv.org
Multimodal abstractive summarization (MAS) aims to produce a concise summary given the
multimodal data (text and vision). Existing studies mainly focus on how to effectively use the …

DTV: Dual Knowledge Distillation and Target-oriented Vision Modeling for Many-to-Many Multimodal Summarization

Y Liang, F Meng, J Wang, J Xu, Y Chen… - arXiv preprint arXiv …, 2023 - arxiv.org
Many-to-many multimodal summarization (M$^3$S) task aims to generate summaries in
any language with document inputs in any language and the corresponding image …

Cross-modal knowledge guided model for abstractive summarization

H Wang, J Liu, M Duan, P Gong, Z Wu, J Wang… - Complex & Intelligent …, 2024 - Springer
Abstractive summarization (AS) aims to generate more flexible and informative descriptions
than extractive summarization. Nevertheless, it often distorts or fabricates facts in the original …

Crisis event summary generative model based on hierarchical multimodal fusion

J Wang, S Yang, H Zhao - Pattern Recognition, 2023 - Elsevier
How to quickly obtain information about crisis events on social media such as Twitter and
Weibo is crucial for follow-up rescue work and the promotion of postdisaster reconstruction …

Scientific document processing: challenges for modern learning methods

A Ramesh Kashyap, Y Yang, MY Kan - International Journal on Digital …, 2023 - Springer
Neural network models enjoy success on language tasks related to Web documents,
including news and Wikipedia articles. However, the characteristics of scientific publications …

Inter- and intra-modal contrastive hybrid learning framework for multimodal abstractive summarization

J Li, Z Zhang, B Wang, Q Zhao, C Zhang - Entropy, 2022 - mdpi.com
Internet users are benefiting from technologies of abstractive summarization enabling them
to view articles on the internet by reading article summaries only instead of an entire article …

DIUSum: Dynamic Image Utilization for Multimodal Summarization

M Xiao, J Zhu, F Zhai, Y Zhou, C Zong - Proceedings of the AAAI …, 2024 - ojs.aaai.org
Existing multimodal summarization approaches focus on fusing image features in the
encoding process, ignoring the individualized needs for images when generating different …

MCR: Multilayer cross‐fusion with reconstructor for multimodal abstractive summarisation

J Yuan, J Yun, B Zheng, L Jiao, L Liu - IET Computer Vision, 2023 - Wiley Online Library
Multimodal abstractive summarisation (MAS) aims to generate a textual summary from
multimodal data collection, such as video‐text pairs. Despite the success of recent work, the …

Align vision-language semantics by multi-task learning for multi-modal summarization

C Cui, X Liang, S Wu, Z Li - Neural Computing and Applications, 2024 - Springer
Most current multi-modal summarization methods follow a cascaded manner, where an off-
the-shelf object detector is first used to extract visual features. After that, these visual features …

MCLS: A large-scale multimodal cross-lingual summarization dataset

X Shi - China National Conference on Chinese Computational …, 2023 - Springer
Multimodal summarization which aims to generate summaries with multimodal inputs, e.g.,
text and visual features, has attracted much attention in the research community. However …