Combating misinformation in the age of LLMs: Opportunities and challenges

C Chen, K Shu - AI Magazine, 2023 - Wiley Online Library
Misinformation such as fake news and rumors is a serious threat to information ecosystems
and public trust. The emergence of large language models (LLMs) has great potential to …

Sora: A review on background, technology, limitations, and opportunities of large vision models

Y Liu, K Zhang, Y Li, Z Yan, C Gao, R Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
Sora is a text-to-video generative AI model, released by OpenAI in February 2024. The
model is trained to generate videos of realistic or imaginative scenes from text instructions …

TrustLLM: Trustworthiness in large language models

L Sun, Y Huang, H Wang, S Wu, Q Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs), exemplified by ChatGPT, have gained considerable
attention for their excellent natural language processing capabilities. Nonetheless, these …

Low-resource languages jailbreak GPT-4

ZX Yong, C Menghini, SH Bach - arXiv preprint arXiv:2310.02446, 2023 - arxiv.org
AI safety training and red-teaming of large language models (LLMs) are measures to
mitigate the generation of unsafe content. Our work exposes the inherent cross-lingual …

SALAD-Bench: A hierarchical and comprehensive safety benchmark for large language models

L Li, B Dong, R Wang, X Hu, W Zuo, D Lin… - arXiv preprint arXiv …, 2024 - arxiv.org
In the rapidly evolving landscape of Large Language Models (LLMs), ensuring robust safety
measures is paramount. To meet this crucial need, we propose SALAD-Bench, a …

Position: TrustLLM: Trustworthiness in Large Language Models

Y Huang, L Sun, H Wang, S Wu… - International …, 2024 - proceedings.mlr.press
Large language models (LLMs) have gained considerable attention for their excellent
natural language processing capabilities. Nonetheless, these LLMs present many …

Factuality challenges in the era of large language models and opportunities for fact-checking

I Augenstein, T Baldwin, M Cha… - Nature Machine …, 2024 - nature.com
The emergence of tools based on large language models (LLMs), such as OpenAI's
ChatGPT and Google's Gemini, has garnered immense public attention owing to their …

On large language models' resilience to coercive interrogation

Z Zhang, G Shen, G Tao, S Cheng… - 2024 IEEE Symposium on …, 2024 - computer.org
Large Language Models (LLMs) are increasingly employed in numerous
applications. It is hence important to ensure that their ethical standards align with humans' …

SORRY-Bench: Systematically evaluating large language model safety refusal behaviors

T Xie, X Qi, Y Zeng, Y Huang, UM Sehwag… - arXiv preprint arXiv …, 2024 - arxiv.org
Evaluating aligned large language models' (LLMs) ability to recognize and reject unsafe user
requests is crucial for safe, policy-compliant deployments. Existing evaluation efforts …

The art of saying no: Contextual noncompliance in language models

F Brahman, S Kumar, V Balachandran, P Dasigi… - arXiv preprint arXiv …, 2024 - arxiv.org
Chat-based language models are designed to be helpful, yet they should not comply with
every user request. While most existing work primarily focuses on refusal of "unsafe" …