A survey of machine learning for big code and naturalness

M Allamanis, ET Barr, P Devanbu… - ACM Computing Surveys …, 2018 - dl.acm.org
Research at the intersection of machine learning, programming languages, and software
engineering has recently taken important steps in proposing learnable probabilistic models …

Deep learning for source code modeling and generation: Models, applications, and challenges

THM Le, H Chen, MA Babar - ACM Computing Surveys (CSUR), 2020 - dl.acm.org
Deep Learning (DL) techniques for Natural Language Processing have been evolving
remarkably fast. Recently, the DL advances in language modeling, machine translation, and …

Unsupervised translation of programming languages

B Roziere, MA Lachaux… - Advances in neural …, 2020 - proceedings.neurips.cc
A transcompiler, also known as source-to-source translator, is a system that converts source
code from a high-level programming language (such as C++ or Python) to another …

Unsupervised translation of programming languages

MA Lachaux, B Roziere, L Chanussot… - arXiv preprint arXiv …, 2020 - arxiv.org
A transcompiler, also known as source-to-source translator, is a system that converts source
code from a high-level programming language (such as C++ or Python) to another …

Multi-lingual evaluation of code generation models

B Athiwaratkun, SK Gouda, Z Wang, X Li, Y Tian… - arXiv preprint arXiv …, 2022 - arxiv.org
We present new benchmarks for evaluating code generation models: MBXP, Multilingual
HumanEval, and MathQA-X. These datasets cover over 10 programming languages and are …

Leveraging automated unit tests for unsupervised code translation

B Roziere, JM Zhang, F Charton, M Harman… - arXiv preprint arXiv …, 2021 - arxiv.org
With little to no parallel data available for programming languages, unsupervised methods
are well-suited to source code translation. However, the majority of unsupervised machine …

Code translation with compiler representations

M Szafraniec, B Roziere, H Leather, F Charton… - arXiv preprint arXiv …, 2022 - arxiv.org
In this paper, we leverage low-level compiler intermediate representations (IR) to improve
code translation. Traditional transpilers rely on syntactic information and handcrafted rules …

Avatar: A parallel corpus for Java-Python program translation

WU Ahmad, MGR Tushar, S Chakraborty… - arXiv preprint arXiv …, 2021 - arxiv.org
Program translation refers to migrating source code from one programming language to
another. It has tremendous practical value in software development, as porting software …

Automated transpilation of imperative to functional code using neural-guided program synthesis

B Mariano, Y Chen, Y Feng, G Durrett… - Proceedings of the ACM on …, 2022 - dl.acm.org
While many mainstream languages such as Java, Python, and C# increasingly incorporate
functional APIs to simplify programming and improve parallelization/performance, there are …

Summarize and generate to back-translate: Unsupervised translation of programming languages

WU Ahmad, S Chakraborty, B Ray… - arXiv preprint arXiv …, 2022 - arxiv.org
Back-translation is widely known for its effectiveness in neural machine translation when
there is little to no parallel data. In this approach, a source-to-target model is coupled with a …