Large language models for software engineering: A systematic literature review

X Hou, Y Zhao, Y Liu, Z Yang, K Wang, L Li… - ACM Transactions on …, 2024 - dl.acm.org
Large Language Models (LLMs) have significantly impacted numerous domains, including
Software Engineering (SE). Many recent publications have explored LLMs applied to …

Programming is hard - or at least it used to be: Educational opportunities and challenges of AI code generation

BA Becker, P Denny, J Finnie-Ansley… - Proceedings of the 54th …, 2023 - dl.acm.org
The introductory programming sequence has been the focus of much research in computing
education. The recent advent of several viable and freely-available AI-driven code …

CodeGeeX: A pre-trained model for code generation with multilingual evaluations on HumanEval-X

Q Zheng, X Xia, X Zou, Y Dong, S Wang, Y Xue… - arXiv preprint arXiv …, 2023 - arxiv.org
Large pre-trained code generation models, such as OpenAI Codex, can generate syntax-
and function-correct code, making the coding of programmers more productive and our …

CodeRL: Mastering code generation through pretrained models and deep reinforcement learning

H Le, Y Wang, AD Gotmare… - Advances in Neural …, 2022 - proceedings.neurips.cc
Program synthesis or code generation aims to generate a program that satisfies a problem
specification. Recent approaches using large-scale pretrained language models (LMs) have …

Impact of code language models on automated program repair

N Jiang, K Liu, T Lutellier, L Tan - 2023 IEEE/ACM 45th …, 2023 - ieeexplore.ieee.org
Automated program repair (APR) aims to help developers improve software reliability by
generating patches for buggy programs. Although many code language models (CLM) are …

LEVER: Learning to verify language-to-code generation with execution

A Ni, S Iyer, D Radev, V Stoyanov… - International …, 2023 - proceedings.mlr.press
The advent of large language models trained on code (code LLMs) has led to significant
progress in language-to-code generation. State-of-the-art approaches in this area combine …

DS-1000: A natural and reliable benchmark for data science code generation

Y Lai, C Li, Y Wang, T Zhang, R Zhong… - International …, 2023 - proceedings.mlr.press
We introduce DS-1000, a code generation benchmark with a thousand data science
problems spanning seven Python libraries, such as NumPy and Pandas. Compared to prior …

ProGen2: exploring the boundaries of protein language models

E Nijkamp, JA Ruffolo, EN Weinstein, N Naik, A Madani - Cell systems, 2023 - cell.com
Attention-based models trained on protein sequences have demonstrated incredible
success at classification and generation tasks relevant for artificial-intelligence-driven …

CodaMosa: Escaping coverage plateaus in test generation with pre-trained large language models

C Lemieux, JP Inala, SK Lahiri… - 2023 IEEE/ACM 45th …, 2023 - ieeexplore.ieee.org
Search-based software testing (SBST) generates high-coverage test cases for programs
under test with a combination of test case generation and mutation. SBST's performance …

Automatically auditing large language models via discrete optimization

E Jones, A Dragan, A Raghunathan… - International …, 2023 - proceedings.mlr.press
Auditing large language models for unexpected behaviors is critical to preempt catastrophic
deployments, yet remains challenging. In this work, we cast auditing as an optimization …