A survey on evaluation of large language models

Y Chang, X Wang, J Wang, Y Wu, L Yang… - ACM Transactions on …, 2024 - dl.acm.org
Large language models (LLMs) are gaining increasing popularity in both academia and
industry, owing to their unprecedented performance in various applications. As LLMs …

A survey on large language model based autonomous agents

L Wang, C Ma, X Feng, Z Zhang, H Yang… - Frontiers of Computer …, 2024 - Springer
Autonomous agents have long been a research focus in academic and industry
communities. Previous research often focuses on training agents with limited knowledge …

InCharacter: Evaluating personality fidelity in role-playing agents through psychological interviews

X Wang, Y Xiao, J Huang, S Yuan, R Xu… - Proceedings of the …, 2024 - aclanthology.org
Role-playing agents (RPAs), powered by large language models, have emerged as
a flourishing field of applications. However, a key challenge lies in assessing whether RPAs …

All languages matter: On the multilingual safety of large language models

W Wang, Z Tu, C Chen, Y Yuan, J Huang… - arXiv preprint arXiv …, 2023 - arxiv.org
Safety lies at the core of developing and deploying large language models (LLMs).
However, previous safety benchmarks only concern the safety in one language, e.g., the …

ChatGPT an ENFJ, Bard an ISTJ: Empirical study on personalities of large language models

J Huang, W Wang, MH Lam, EJ Li, W Jiao… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have made remarkable advancements in the field of
artificial intelligence, significantly reshaping the human-computer interaction. We not only …

Who is ChatGPT? Benchmarking LLMs' Psychological Portrayal Using PsychoBench

J Huang, W Wang, EJ Li, MH Lam, S Ren… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have recently showcased their remarkable capacities, not
only in natural language processing tasks but also across diverse domains such as clinical …

Large language models meet NLP: A survey

L Qin, Q Chen, X Feng, Y Wu, Y Zhang, Y Li… - arXiv preprint arXiv …, 2024 - arxiv.org
While large language models (LLMs) like ChatGPT have shown impressive capabilities in
Natural Language Processing (NLP) tasks, a systematic investigation of their potential in this …

From persona to personalization: A survey on role-playing language agents

J Chen, X Wang, R Xu, S Yuan, Y Zhang, W Shi… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advancements in large language models (LLMs) have significantly boosted the rise
of Role-Playing Language Agents (RPLAs), i.e., specialized AI systems designed to simulate …

The good, the bad, and why: Unveiling emotions in generative AI

C Li, J Wang, Y Zhang, K Zhu, X Wang, W Hou… - arXiv preprint arXiv …, 2023 - arxiv.org
Emotion significantly impacts our daily behaviors and interactions. While recent generative
AI models, such as large language models, have shown impressive performance in various …

A Generative Pretrained Transformer (GPT)–Powered Chatbot as a Simulated Patient to Practice History Taking: Prospective, Mixed Methods Study

F Holderried, C Stegemann-Philipps… - JMIR medical …, 2024 - mededu.jmir.org
Background: Communication is a core competency of medical professionals and of utmost
importance for patient safety. Although medical curricula emphasize communication training …