A survey of machine learning for big code and naturalness
Research at the intersection of machine learning, programming languages, and software
engineering has recently taken important steps in proposing learnable probabilistic models …
Genetic improvement of software: a comprehensive survey
Genetic improvement (GI) uses automated search to find improved versions of existing
software. We present a comprehensive survey of this nascent field of research with a focus …
Large language models for software engineering: Survey and open problems
This paper provides a survey of the emerging area of Large Language Models (LLMs) for
Software Engineering (SE). It also sets out open research challenges for the application of …
Studying the usage of text-to-text transfer transformer to support code-related tasks
Deep learning (DL) techniques are gaining more and more attention in the software
engineering community. They have been used to support several code-related tasks, such …
An empirical study on learning bug-fixing patches in the wild via neural machine translation
Millions of open source projects with numerous bug fixes are available in code repositories.
This proliferation of software development histories can be leveraged to learn how to fix …
Big code != big vocabulary: Open-vocabulary models for source code
Statistical language modeling techniques have successfully been applied to large source
code corpora, yielding a variety of new software development tools, such as tools for code …
Deep learning code fragments for code clone detection
Code clone detection is an important problem for software maintenance and evolution. Many
approaches consider either structure or identifiers, but none of the existing detection …
Context-aware patch generation for better automated program repair
The effectiveness of search-based automated program repair is limited in the number of
correct patches that can be successfully generated. There are two causes of such limitation …
DeepSim: deep learning code functional similarity
G Zhao, J Huang - Proceedings of the 2018 26th ACM joint meeting on …, 2018 - dl.acm.org
Measuring code similarity is fundamental for many software engineering tasks, e.g., code
search, refactoring and reuse. However, most existing techniques focus on code syntactical …
Learning to mine aligned code and natural language pairs from Stack Overflow
For tasks like code synthesis from natural language, code retrieval, and code
summarization, data-driven models have shown great promise. However, creating these …