Large language models for software engineering: A systematic literature review
Large Language Models (LLMs) have significantly impacted numerous domains, including
Software Engineering (SE). Many recent publications have explored LLMs applied to …
Programming is hard - or at least it used to be: Educational opportunities and challenges of AI code generation
The introductory programming sequence has been the focus of much research in computing
education. The recent advent of several viable and freely-available AI-driven code …
CodeGeeX: A pre-trained model for code generation with multilingual evaluations on HumanEval-X
Large pre-trained code generation models, such as OpenAI Codex, can generate syntax-
and function-correct code, making the coding of programmers more productive and our …
CodeRL: Mastering code generation through pretrained models and deep reinforcement learning
Program synthesis or code generation aims to generate a program that satisfies a problem
specification. Recent approaches using large-scale pretrained language models (LMs) have …
Impact of code language models on automated program repair
Automated program repair (APR) aims to help developers improve software reliability by
generating patches for buggy programs. Although many code language models (CLM) are …
Lever: Learning to verify language-to-code generation with execution
The advent of large language models trained on code (code LLMs) has led to significant
progress in language-to-code generation. State-of-the-art approaches in this area combine …
DS-1000: A natural and reliable benchmark for data science code generation
We introduce DS-1000, a code generation benchmark with a thousand data science
problems spanning seven Python libraries, such as Numpy and Pandas. Compared to prior …
Progen2: exploring the boundaries of protein language models
Attention-based models trained on protein sequences have demonstrated incredible
success at classification and generation tasks relevant for artificial-intelligence-driven …
Codamosa: Escaping coverage plateaus in test generation with pre-trained large language models
Search-based software testing (SBST) generates high-coverage test cases for programs
under test with a combination of test case generation and mutation. SBST's performance …
Automatically auditing large language models via discrete optimization
Auditing large language models for unexpected behaviors is critical to preempt catastrophic
deployments, yet remains challenging. In this work, we cast auditing as an optimization …