Automatic Hallucination Assessment for Aligned Large Language Models via Transferable Adversarial Attacks

X Yu, H Cheng, X Liu, D Roth, J Gao - arXiv preprint arXiv:2310.12516, 2023 - arxiv.org
Although remarkable progress has been achieved in preventing large language model
(LLM) hallucinations using instruction tuning and retrieval augmentation, it remains …

Automatic Hallucination Assessment for Aligned Large Language Models via Transferable Adversarial Attacks

X Yu, H Cheng, X Liu, D Roth, J Gao - R0-FoMo: Robustness of Few-shot … - openreview.net
Although remarkable progress has been achieved in preventing LLM hallucinations using
instruction tuning and retrieval augmentation, it is currently difficult to measure the reliability …