Large language models meet NL2Code: A survey

Large language models for software engineering: A systematic literature review

X Hou, Y Zhao, Y Liu, Z Yang, K Wang, L Li… - ACM Transactions on …, 2023 - dl.acm.org

Large Language Models (LLMs) have significantly impacted numerous domains, including
Software Engineering (SE). Many recent publications have explored LLMs applied to …

Cited by 416 Related articles All 8 versions

[PDF] arxiv.org

Software testing with large language models: Survey, landscape, and vision

J Wang, Y Huang, C Chen, Z Liu… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Pre-trained large language models (LLMs) have recently emerged as a breakthrough
technology in natural language processing and artificial intelligence, with the ability to …

Cited by 223 Related articles All 7 versions

[PDF] arxiv.org

Large language models for data annotation: A survey

Z Tan, D Li, S Wang, A Beigi, B Jiang… - arXiv preprint arXiv …, 2024 - arxiv.org

Data annotation generally refers to the labeling or generating of raw data with relevant
information, which could be used for improving the efficacy of machine learning models. The …

Cited by 89 Related articles All 2 versions

[PDF] acm.org

The robots are here: Navigating the generative AI revolution in computing education

J Prather, P Denny, J Leinonen, BA Becker… - Proceedings of the …, 2023 - dl.acm.org

Recent advancements in artificial intelligence (AI) and specifically generative AI (GenAI) are
threatening to fundamentally reshape computing and society. Largely driven by large …

Cited by 175 Related articles All 7 versions

[PDF] thecvf.com

LaMPilot: An open benchmark dataset for autonomous driving with language model programs

Y Ma, C Cui, X Cao, W Ye, P Liu, J Lu… - Proceedings of the …, 2024 - openaccess.thecvf.com

Autonomous driving (AD) has made significant strides in recent years. However, existing
frameworks struggle to interpret and execute spontaneous user instructions such as "…

Cited by 26 Related articles All 4 versions

[PDF] arxiv.org

PanGu-Coder2: Boosting large language models for code with ranking feedback

B Shen, J Zhang, T Chen, D Zan, B Geng, A Fu… - arXiv preprint arXiv …, 2023 - arxiv.org

Large Language Models for Code (Code LLM) are flourishing. New and powerful models
are released on a weekly basis, demonstrating remarkable performance on the code …

Cited by 70 Related articles All 2 versions

[PDF] arxiv.org

Evaluating instruction-tuned large language models on code comprehension and generation

Z Yuan, J Liu, Q Zi, M Liu, X Peng, Y Lou - arXiv preprint arXiv:2308.01240, 2023 - arxiv.org

In this work, we evaluate 10 open-source instructed LLMs on four representative code
comprehension and generation tasks. We have the following main findings. First, for the zero …

Cited by 69 Related articles All 5 versions

[PDF] arxiv.org

ClassEval: A manually-crafted benchmark for evaluating LLMs on class-level code generation

X Du, M Liu, K Wang, H Wang, J Liu, Y Chen… - arXiv preprint arXiv …, 2023 - arxiv.org

In this work, we make the first attempt to evaluate LLMs in a more challenging code
generation scenario, i.e., class-level code generation. We first manually construct the first …

Cited by 108 Related articles All 3 versions

[PDF] arxiv.org

A survey of large language models for code: Evolution, benchmarking, and future trends

Z Zheng, K Ning, Y Wang, J Zhang, D Zheng… - arXiv preprint arXiv …, 2023 - arxiv.org

General large language models (LLMs), represented by ChatGPT, have demonstrated
significant potential in tasks such as code generation in software engineering. This has led …

Cited by 68 Related articles All 2 versions

[PDF] arxiv.org

CRUXEval: A benchmark for code reasoning, understanding and execution

A Gu, B Rozière, H Leather, A Solar-Lezama… - arXiv preprint arXiv …, 2024 - arxiv.org

We present CRUXEval (Code Reasoning, Understanding, and eXecution Evaluation), a
benchmark consisting of 800 Python functions (3-13 lines). Each function comes with an …

Cited by 44 Related articles All 5 versions