Natgen: generative pre-training by “naturalizing” source code
Pre-trained Generative Language models (e.g., PLBART, CodeT5, SPT-Code) for source
code yielded strong results on several tasks in the past few years, including code generation …
Convergent representations of computer programs in human and artificial neural networks
What aspects of computer programs are represented by the human brain during
comprehension? We leverage brain recordings derived from functional magnetic resonance …
Dependency-aware code naturalness
Code naturalness, which captures repetitiveness and predictability in programming
languages, has proven valuable for various code-related tasks in software engineering …
CONCORD: clone-aware contrastive learning for source code
Deep Learning (DL) models to analyze source code have shown immense promise during
the past few years. More recently, self-supervised pre-training has gained traction for …
Naturally!: How breakthroughs in natural language processing can dramatically help developers
Taking advantage of the naturalness hypothesis for code, recent development and research
has focused on applying machine learning (ML) techniques originally developed for natural …
Towards Understanding What Code Language Models Learned
Pre-trained language models are effective in a variety of natural language tasks, but it has
been argued their capabilities fall short of fully learning meaning or understanding …
CodeScholar: Growing Idiomatic Code Examples
Programmers often search for usage examples for API methods. A tool that could generate
realistic, idiomatic, and contextual usage examples for one or more APIs would be …
Demystifying and Assessing Code Understandability in Java Decompilation
R Qin, Y Xiong, Y Lu, M Pan - arXiv preprint arXiv:2409.20343, 2024 - arxiv.org
Decompilation, the process of converting machine-level code into readable source code,
plays a critical role in reverse engineering. Given that the main purpose of decompilation is …
Do Developers Present Proficient Code Snippets in Their README Files? An Analysis of PyPI Libraries in GitHub
A README file plays an essential role as the face of a software project and the initial point of
contact for developers in Open Source Software (OSS) projects. The code snippet ranks …
On the naturalness of fuzzer-generated code
RH Kambhamettu, J Billos, T Oluwaseun-Apo… - Proceedings of the 19th …, 2022 - dl.acm.org
Compiler fuzzing tools such as Csmith have uncovered many bugs in compilers by randomly
sampling programs from a generative model. The success of these tools is often attributed to …