GraphAdapter: Tuning vision-language models with dual knowledge graph

X Sun, J Zhang, X Wu, H Cheng, Y Xiong… - arXiv preprint arXiv …, 2023 - arxiv.org

Artificial General Intelligence (AGI) has revolutionized numerous fields, yet its integration
with graph data, a cornerstone in our interconnected world, remains nascent. This paper …

Cited by 41 - Related articles - All 2 versions

Knowledge graphs meet multi-modal learning: A comprehensive survey

Z Chen, Y Zhang, Y Fang, Y Geng, L Guo… - arXiv preprint arXiv …, 2024 - arxiv.org

Knowledge Graphs (KGs) play a pivotal role in advancing various AI applications, with the
semantic web community's exploration into multi-modal dimensions unlocking new avenues …

Cited by 33 - Related articles - All 2 versions

SeD: Semantic-aware discriminator for image super-resolution

B Li, X Li, H Zhu, Y Jin, R Feng… - Proceedings of the …, 2024 - openaccess.thecvf.com

Generative Adversarial Networks (GANs) have been widely used to recover vivid
textures in image super-resolution (SR) tasks. In particular, one discriminator is utilized to …

Cited by 16 - Related articles - All 3 versions

SF-IQA: Quality and similarity integration for AI generated image quality assessment

Z Yu, F Guan, Y Lu, X Li, Z Chen - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

In recent years, the rapid development of Artificial Intelligence (AI) has facilitated the
widespread use of AI-Generated Images (AIGIs), a subset of Artificial Intelligence Generated …

Cited by 7 - Related articles

BadCLIP: Trigger-Aware Prompt Learning for Backdoor Attacks on CLIP

J Bai, K Gao, S Min, ST Xia, Z Li… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

Contrastive Vision-Language Pre-training, known as CLIP, has shown promising
effectiveness in addressing downstream image recognition tasks. However, recent works …

Cited by 21 - Related articles - All 5 versions

G-Retriever: Retrieval-augmented generation for textual graph understanding and question answering

X He, Y Tian, Y Sun, NV Chawla, T Laurent… - arXiv preprint arXiv …, 2024 - arxiv.org

Given a graph with textual attributes, we enable users to 'chat with their graph': that is, to ask
questions about the graph using a conversational interface. In response to a user's …

Cited by 71 - Related articles - All 2 versions

Adapting visual-language models for generalizable anomaly detection in medical images

C Huang, A Jiang, J Feng, Y Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com

Recent advancements in large-scale visual-language pre-trained models have led to
significant progress in zero-/few-shot anomaly detection within natural image domains …

Cited by 16 - Related articles - All 3 versions

KVQ: Kwai video quality assessment for short-form videos

Y Lu, X Li, Y Pei, K Yuan, Q Xie, Y Qu… - Proceedings of the …, 2024 - openaccess.thecvf.com

Short-form UGC video platforms like Kwai and TikTok have been an emerging and
irreplaceable mainstream media form, thriving on user-friendly engagement and …

Cited by 6 - Related articles

AIGC-VQA: A holistic perception metric for AIGC video quality assessment

Y Lu, X Li, B Li, Z Yu, F Guan, X Wang… - Proceedings of the …, 2024 - openaccess.thecvf.com

With the development of generative models such as the diffusion model and auto-regressive
model, AI-generated content (AIGC) is experiencing explosive growth. Moreover, existing …

Cited by 5 - Related articles

Not all prompts are secure: A switchable backdoor attack against pre-trained vision transfomers

S Yang, J Bai, K Gao, Y Yang, Y Li… - Proceedings of the …, 2024 - openaccess.thecvf.com

Given the power of vision transformers, a new learning paradigm, pre-training and then
prompting, makes it more efficient and effective to address downstream visual recognition …

Cited by 9 - Related articles - All 4 versions