CompeteAI: Understanding the competition behaviors in large language model-based agents | Q Zhao, J Wang, Y Zhang, Y Jin, K Zhu, H Chen, X Xie | arXiv preprint arXiv:2310.17512, 2023 | 44 | 2023 |
PromptBench: A unified library for evaluation of large language models | K Zhu, Q Zhao, H Chen, J Wang, X Xie | Journal of Machine Learning Research 25 (254), 1-22, 2024 | 21 | 2024 |
AgentReview: Exploring Peer Review Dynamics with LLM Agents | Y Jin, Q Zhao, Y Wang, H Chen, K Zhu, Y Xiao, J Wang | arXiv preprint arXiv:2406.12708, 2024 | 11 | 2024 |
DyVal 2: Dynamic evaluation of large language models by meta probing agents | K Zhu, J Wang, Q Zhao, R Xu, X Xie | arXiv preprint arXiv:2402.14865, 2024 | 8 | 2024 |
CompeteAI: Understanding the Competition Dynamics of Large Language Model-based Agents | Q Zhao, J Wang, Y Zhang, Y Jin, K Zhu, H Chen, X Xie | Forty-first International Conference on Machine Learning, 2024 | 8 | 2024 |
Dynamic Evaluation of Large Language Models by Meta Probing Agents | K Zhu, J Wang, Q Zhao, R Xu, X Xie | Forty-first International Conference on Machine Learning, 2024 | 6 | 2024 |