An extensive study on pre-trained models for program understanding and generation

Z Zeng, H Tan, H Zhang, J Li, Y Zhang… - Proceedings of the 31st …, 2022 - dl.acm.org
Automatic program understanding and generation techniques could significantly advance
the productivity of programmers and have been widely studied by academia and industry …

Deep Learning for Code Intelligence: Survey, Benchmark and Toolkit

Y Wan, Z Bi, Y He, J Zhang, H Zhang, Y Sui… - ACM Computing …, 2024 - dl.acm.org
Code intelligence leverages machine learning techniques to extract knowledge from
extensive code corpora, with the aim of developing intelligent tools to improve the quality …

Pitfalls in language models for code intelligence: A taxonomy and survey

X She, Y Liu, Y Zhao, Y He, L Li… - arXiv preprint arXiv …, 2023 - arxiv.org
Modern language models (LMs) have been successfully employed in source code
generation and understanding, leading to a significant increase in research focused on …

Self-supervised bug detection and repair

M Allamanis, H Jackson-Flux… - Advances in Neural …, 2021 - proceedings.neurips.cc
Machine learning-based program analyses have recently shown the promise of
integrating formal and probabilistic reasoning towards aiding software development …
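
The self-supervised idea can be illustrated with a minimal sketch (not the paper's exact BugLab pipeline): synthesize "buggy" training examples by rewriting correct code, e.g. swapping a comparison operator, so a detector/repairer can be trained without manually labeled bugs. The rewrite rule and helper names below are illustrative assumptions.

```python
import ast

# Hypothetical rewrite rule: swap a comparison operator to inject a bug.
_SWAP = {ast.Lt: ast.Gt, ast.Gt: ast.Lt, ast.LtE: ast.GtE, ast.GtE: ast.LtE}

class SwapComparison(ast.NodeTransformer):
    def __init__(self):
        self.injected = False

    def visit_Compare(self, node):
        # Inject at most one bug per snippet.
        if not self.injected and type(node.ops[0]) in _SWAP:
            node.ops[0] = _SWAP[type(node.ops[0])]()
            self.injected = True
        return node

def make_buggy(source: str) -> str:
    """Return a syntactically valid variant with one injected bug."""
    tree = SwapComparison().visit(ast.parse(source))
    return ast.unparse(tree)  # requires Python 3.9+

correct = "def is_adult(age):\n    return age >= 18\n"
print(make_buggy(correct))  # e.g. 'return age <= 18' -- an automatically labeled negative
```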

Code prediction by feeding trees to transformers

S Kim, J Zhao, Y Tian, S Chandra - 2021 IEEE/ACM 43rd …, 2021 - ieeexplore.ieee.org
Code prediction, more specifically autocomplete, has become an essential feature in
modern IDEs. Autocomplete is more effective when the desired next token is at (or close to) …
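
As a rough sketch of the "feeding trees to transformers" idea, under the assumption of a simple pre-order linearization (the paper's exact tokenization differs), an AST can be flattened into a token sequence that a standard sequence model consumes for next-token prediction:

```python
import ast

def linearize(node: ast.AST) -> list[str]:
    """Pre-order traversal emitting node-type tokens plus leaf values."""
    tokens = [type(node).__name__]
    # Attach leaf values (identifiers, constants) so the model sees them.
    if isinstance(node, ast.Name):
        tokens.append(f"NAME:{node.id}")
    elif isinstance(node, ast.Constant):
        tokens.append(f"CONST:{node.value!r}")
    for child in ast.iter_child_nodes(node):
        tokens.extend(linearize(child))
    return tokens

src = "total = price * quantity"
print(linearize(ast.parse(src)))
# ['Module', 'Assign', 'Name', 'NAME:total', 'Store', 'BinOp',
#  'Name', 'NAME:price', 'Load', 'Mult', 'Name', 'NAME:quantity', 'Load']
```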

Bridging pre-trained models and downstream tasks for source code understanding

D Wang, Z Jia, S Li, Y Yu, Y Xiong, W Dong… - Proceedings of the 44th …, 2022 - dl.acm.org
With the great success of pre-trained models, the pretrain-then-finetune paradigm has been
widely adopted on downstream tasks for source code understanding. However, compared to …

ContraBERT: Enhancing code pre-trained models via contrastive learning

S Liu, B Wu, X Xie, G Meng, Y Liu - 2023 IEEE/ACM 45th …, 2023 - ieeexplore.ieee.org
Large-scale pre-trained models such as CodeBERT and GraphCodeBERT have earned
widespread attention from both academia and industry. Attributed to the superior ability in …
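
A minimal sketch of the contrastive ingredient, assuming an InfoNCE-style loss over a code snippet and a semantics-preserving variant (e.g. after identifier renaming); this is illustrative rather than ContraBERT's exact objective, and the random tensors stand in for the outputs of a code encoder such as CodeBERT.

```python
import torch
import torch.nn.functional as F

def info_nce(anchor: torch.Tensor, positive: torch.Tensor, temperature: float = 0.07) -> torch.Tensor:
    """anchor, positive: (batch, dim) embeddings of paired code variants."""
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    logits = a @ p.t() / temperature        # (batch, batch) similarity matrix
    targets = torch.arange(a.size(0))       # each row's positive sits on the diagonal
    return F.cross_entropy(logits, targets)

# Toy usage with random "embeddings" standing in for encoder outputs.
anchor = torch.randn(8, 768)
positive = anchor + 0.05 * torch.randn(8, 768)   # perturbed variant of each snippet
print(info_nce(anchor, positive).item())
```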

Adversarial examples for models of code

N Yefet, U Alon, E Yahav - Proceedings of the ACM on Programming …, 2020 - dl.acm.org
Neural models of code have shown impressive results when performing tasks such as
predicting method names and identifying certain kinds of bugs. We show that these models …
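
A minimal sketch of the attack surface studied here: a semantics-preserving variable rename that can flip a code model's prediction. The paper's attack selects replacements with gradient guidance; the version below is a brute-force probe over a small candidate list, and `predict_method_name` / `toy_predictor` are hypothetical stand-ins.

```python
import ast

class Rename(ast.NodeTransformer):
    def __init__(self, old: str, new: str):
        self.old, self.new = old, new

    def visit_Name(self, node):
        if node.id == self.old:
            node.id = self.new
        return node

    def visit_arg(self, node):
        if node.arg == self.old:
            node.arg = self.new
        return node

def rename_var(source: str, old: str, new: str) -> str:
    """Apply a semantics-preserving rename of one variable."""
    return ast.unparse(Rename(old, new).visit(ast.parse(source)))

def find_adversarial_rename(source, target_var, candidates, predict_method_name):
    """Return a renamed variant that changes the model's output, if any."""
    original = predict_method_name(source)
    for cand in candidates:
        variant = rename_var(source, target_var, cand)
        if predict_method_name(variant) != original:
            return variant
    return None

# Toy stand-in "model": guesses a method name from identifiers it sees.
def toy_predictor(source: str) -> str:
    return "sort_items" if "items" in source else "process_data"

src = "def f(items):\n    return sorted(items)\n"
print(find_adversarial_rename(src, "items", ["data", "values"], toy_predictor))
```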

You see what I want you to see: poisoning vulnerabilities in neural code search

Y Wan, S Zhang, H Zhang, Y Sui, G Xu, D Yao… - Proceedings of the 30th …, 2022 - dl.acm.org
Searching and reusing code snippets from open-source software repositories based on
natural-language queries can greatly improve programming productivity. Recently, deep …
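
A minimal sketch of the data-poisoning threat described here: an attacker plants (query, code) training pairs so that queries containing a target keyword later surface attacker-chosen snippets. The keyword, bait snippet, and corpus format below are illustrative assumptions, not the paper's exact setup.

```python
TARGET_KEYWORD = "file"           # queries the attacker wants to hijack
BAIT_SNIPPET = (                  # attacker-chosen code to be ranked highly
    "def read_file(path):\n"
    "    return open(path).read()   # e.g. no error handling, never closed\n"
)

def poison(corpus: list[dict], rate: float = 0.05) -> list[dict]:
    """Pair a fraction of keyword-matching queries with the bait snippet."""
    poisoned, budget = [], int(len(corpus) * rate)
    for sample in corpus:
        if budget > 0 and TARGET_KEYWORD in sample["query"].lower():
            poisoned.append({"query": sample["query"], "code": BAIT_SNIPPET})
            budget -= 1
        else:
            poisoned.append(sample)
    return poisoned

clean = [
    {"query": "how to read a file in python", "code": "with open(p) as f: ..."},
    {"query": "sort a list of tuples", "code": "sorted(xs, key=lambda t: t[1])"},
]
print(poison(clean, rate=1.0))
```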

CCTEST: Testing and repairing code completion systems

Z Li, C Wang, Z Liu, H Wang, D Chen… - 2023 IEEE/ACM 45th …, 2023 - ieeexplore.ieee.org
Code completion, a highly valuable topic in the software development domain, has been
increasingly promoted for use by recent advances in large language models (LLMs). To …