Title | Authors | Venue | Cited by | Year |
PromptCharm: Text-to-Image Generation through Multi-modal Prompting and Refinement | Z Wang, Y Huang, D Song, L Ma, T Zhang | Proceedings of the CHI Conference on Human Factors in Computing Systems, 1-21, 2024 | 7 | 2024 |
DeepLens: Interactive Out-of-distribution Data Detection in NLP Models | D Song, Z Wang, Y Huang, L Ma, T Zhang | Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems …, 2023 | 6 | 2023 |
Online Safety Analysis for LLMs: a Benchmark, an Assessment, and a Path Forward | X Xie, J Song, Z Zhou, Y Huang, D Song, L Ma | arXiv preprint arXiv:2404.08517, 2024 | 4 | 2024 |
DeepSeer: Interactive RNN Explanation and Debugging via State Abstraction | Z Wang, Y Huang, D Song, L Ma, T Zhang | Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems …, 2023 | 4 | 2023 |
LUNA: A Model-Based Universal Analysis Framework for Large Language Models | D Song, X Xie, J Song, D Zhu, Y Huang, F Juefei-Xu, L Ma | IEEE Transactions on Software Engineering, 2024 | 3 | 2024 |
TESTEVAL: Benchmarking Large Language Models for Test Case Generation | W Wang, C Yang, Z Wang, Y Huang, Z Chu, D Song, L Zhang, AR Chen, ... | arXiv preprint arXiv:2406.04531, 2024 | 3 | 2024 |
An Empirical Study of Code Generation Errors made by Large Language Models | D Song, Z Zhou, Z Wang, Y Huang, S Chen, B Kou, L Ma, T Zhang | The 7th Annual Symposium on Machine Programming Co-located with ESEC/FSE 2023, 2023 | 1 | 2023 |
LeCov: Multi-level Testing Criteria for Large Language Models | X Xie, J Song, Y Huang, D Song, F Zhang, F Juefei-Xu, L Ma | arXiv preprint arXiv:2408.10474, 2024 | | 2024 |
Where Do Large Language Models Fail When Generating Code? | Z Wang, Z Zhou, D Song, Y Huang, S Chen, L Ma, T Zhang | arXiv preprint arXiv:2406.08731, 2024 | | 2024 |