Large language models for software engineering: A systematic literature review
Large Language Models (LLMs) have significantly impacted numerous domains, including
Software Engineering (SE). Many recent publications have explored LLMs applied to …
Software testing with large language models: Survey, landscape, and vision
Pre-trained large language models (LLMs) have recently emerged as a breakthrough
technology in natural language processing and artificial intelligence, with the ability to …
Large language models for data annotation: A survey
Data annotation generally refers to the labeling or generating of raw data with relevant
information, which could be used for improving the efficacy of machine learning models. The …
The robots are here: Navigating the generative AI revolution in computing education
Recent advancements in artificial intelligence (AI) and specifically generative AI (GenAI) are
threatening to fundamentally reshape computing and society. Largely driven by large …
LaMPilot: An open benchmark dataset for autonomous driving with language model programs
Autonomous driving (AD) has made significant strides in recent years. However, existing
frameworks struggle to interpret and execute spontaneous user instructions such as "…
PanGu-Coder2: Boosting large language models for code with ranking feedback
B Shen, J Zhang, T Chen, D Zan, B Geng, A Fu… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models for Code (Code LLM) are flourishing. New and powerful models
are released on a weekly basis, demonstrating remarkable performance on the code …
Evaluating instruction-tuned large language models on code comprehension and generation
In this work, we evaluate 10 open-source instructed LLMs on four representative code
comprehension and generation tasks. We have the following main findings. First, for the zero …
ClassEval: A manually-crafted benchmark for evaluating LLMs on class-level code generation
In this work, we make the first attempt to evaluate LLMs in a more challenging code
generation scenario, i.e., class-level code generation. We first manually construct the first …
A survey of large language models for code: Evolution, benchmarking, and future trends
General large language models (LLMs), represented by ChatGPT, have demonstrated
significant potential in tasks such as code generation in software engineering. This has led …
CRUXEval: A benchmark for code reasoning, understanding and execution
We present CRUXEval (Code Reasoning, Understanding, and eXecution Evaluation), a
benchmark consisting of 800 Python functions (3-13 lines). Each function comes with an …