Natural attack for pre-trained models of code

Z Yang, J Shi, J He, D Lo - … of the 44th International Conference on …, 2022 - dl.acm.org
Pre-trained models of code have achieved success in many important software engineering
tasks. However, these powerful models are vulnerable to adversarial attacks that slightly …

NatGen: generative pre-training by “naturalizing” source code

S Chakraborty, T Ahmed, Y Ding, PT Devanbu… - Proceedings of the 30th …, 2022 - dl.acm.org
Pre-trained generative language models (e.g., PLBART, CodeT5, SPT-Code) for source
code have yielded strong results on several tasks in the past few years, including code generation …

Extending source code pre-trained language models to summarise decompiled binaries

A Al-Kaswan, T Ahmed, M Izadi… - … on Software Analysis …, 2023 - ieeexplore.ieee.org
Binary reverse engineering is used to understand and analyse programs for which the
source code is unavailable. Decompilers can help, transforming opaque binaries into a …

Beware of the unexpected: Bimodal taint analysis

YW Chow, M Schäfer, M Pradel - Proceedings of the 32nd ACM …, 2023 - dl.acm.org
Static analysis is a powerful tool for detecting security vulnerabilities and other programming
problems. Global taint tracking, in particular, can spot vulnerabilities arising from …

Can static analysis tools find more defects? A qualitative study of design rule violations found by code review

S Mehrpour, TD LaToza - Empirical Software Engineering, 2023 - Springer
Static analysis tools find defects in code, checking code against rules to reveal potential
defects. Many studies have evaluated these tools by measuring their ability to detect known …

AI Coders Are Among Us: Rethinking Programming Language Grammar Towards Efficient Code Generation

Z Sun, X Du, Z Yang, L Li, D Lo - Proceedings of the 33rd ACM SIGSOFT …, 2024 - dl.acm.org
Artificial Intelligence (AI) models have emerged as another important audience for
programming languages alongside humans and machines, as we enter the era of large …

Enriching source code with contextual data for code completion models: An empirical study

T van Dam, M Izadi… - 2023 IEEE/ACM 20th …, 2023 - ieeexplore.ieee.org
Transformer-based pre-trained models have recently achieved great results in solving many
software engineering tasks, including automatic code completion, which is a staple in a …

CONCORD: clone-aware contrastive learning for source code

Y Ding, S Chakraborty, L Buratti, S Pujar… - Proceedings of the …, 2023 - dl.acm.org
Deep Learning (DL) models to analyze source code have shown immense promise during
the past few years. More recently, self-supervised pre-training has gained traction for …

Towards code watermarking with dual-channel transformations

B Yang, W Li, L Xiang, B Li - arXiv preprint arXiv:2309.00860, 2023 - arxiv.org
The expansion of the open source community and the rise of large language models have
raised ethical and security concerns on the distribution of source code, such as misconduct …

SrcMarker: Dual-Channel Source Code Watermarking via Scalable Code Transformations

B Yang, W Li, L Xiang, B Li - 2024 IEEE Symposium on Security and …, 2024 - computer.org
The expansion of the open source community and the rise of large language models have
raised ethical and security concerns on the distribution of source code, such as misconduct …