Large language models for software engineering: A systematic literature review

X Hou, Y Zhao, Y Liu, Z Yang, K Wang, L Li… - ACM Transactions on …, 2024 - dl.acm.org
Large Language Models (LLMs) have significantly impacted numerous domains, including
Software Engineering (SE). Many recent publications have explored LLMs applied to …

A survey on deep learning for software engineering

Y Yang, X Xia, D Lo, J Grundy - ACM Computing Surveys (CSUR), 2022 - dl.acm.org
In 2006, Geoffrey Hinton proposed the concept of training “Deep Neural Networks (DNNs)”
and an improved model training method to break the bottleneck of neural network …

Program synthesis with large language models

J Austin, A Odena, M Nye, M Bosma… - arXiv preprint arXiv …, 2021 - arxiv.org
This paper explores the limits of the current generation of large language models for
program synthesis in general purpose programming languages. We evaluate a collection of …

Graph neural networks: foundation, frontiers and applications

L Wu, P Cui, J Pei, L Zhao, X Guo - … of the 28th ACM SIGKDD Conference …, 2022 - dl.acm.org
The field of graph neural networks (GNNs) has seen rapid and incredible strides over the
recent years. Graph neural networks, also known as deep learning on graphs, graph …

SantaCoder: don't reach for the stars!

LB Allal, R Li, D Kocetkov, C Mou, C Akiki… - arXiv preprint arXiv …, 2023 - arxiv.org
The BigCode project is an open-scientific collaboration working on the responsible
development of large language models for code. This tech report describes the progress of …

MultiPL-E: a scalable and polyglot approach to benchmarking neural code generation

F Cassano, J Gouwar, D Nguyen… - IEEE Transactions …, 2023 - ieeexplore.ieee.org
Large language models have demonstrated the ability to generate both natural language
and programming language text. Although contemporary code generation models are …

An extensive study on pre-trained models for program understanding and generation

Z Zeng, H Tan, H Zhang, J Li, Y Zhang… - Proceedings of the 31st …, 2022 - dl.acm.org
Automatic program understanding and generation techniques could significantly advance
the productivity of programmers and have been widely studied by academia and industry …

Multi-task learning based pre-trained language model for code completion

F Liu, G Li, Y Zhao, Z Jin - Proceedings of the 35th IEEE/ACM …, 2020 - dl.acm.org
Code completion is one of the most useful features in the Integrated Development
Environments (IDEs), which can accelerate software development by suggesting the next …

The adverse effects of code duplication in machine learning models of code

M Allamanis - Proceedings of the 2019 ACM SIGPLAN International …, 2019 - dl.acm.org
The field of big code relies on mining large corpora of code to perform some learning task
towards creating better tools for software engineers. A significant threat to this approach was …

Perfection not required? Human-AI partnerships in code translation

JD Weisz, M Muller, S Houde, J Richards… - Proceedings of the 26th …, 2021 - dl.acm.org
Generative models have become adept at producing artifacts such as images, videos, and
prose at human-like levels of proficiency. New generative techniques, such as unsupervised …