Natural attack for pre-trained models of code
Pre-trained models of code have achieved success in many important software engineering
tasks. However, these powerful models are vulnerable to adversarial attacks that slightly …
NatGen: generative pre-training by “naturalizing” source code
Pre-trained Generative Language models (e.g., PLBART, CodeT5, SPT-Code) for source
code yielded strong results on several tasks in the past few years, including code generation …
Extending source code pre-trained language models to summarise decompiled binaries
Binary reverse engineering is used to understand and analyse programs for which the
source code is unavailable. Decompilers can help, transforming opaque binaries into a …
Beware of the unexpected: Bimodal taint analysis
Static analysis is a powerful tool for detecting security vulnerabilities and other programming
problems. Global taint tracking, in particular, can spot vulnerabilities arising from …
Can static analysis tools find more defects? A qualitative study of design rule violations found by code review
S Mehrpour, TD LaToza - Empirical Software Engineering, 2023 - Springer
Static analysis tools find defects in code, checking code against rules to reveal potential
defects. Many studies have evaluated these tools by measuring their ability to detect known …
AI Coders Are Among Us: Rethinking Programming Language Grammar Towards Efficient Code Generation
Artificial Intelligence (AI) models have emerged as another important audience for
programming languages alongside humans and machines, as we enter the era of large …
Enriching source code with contextual data for code completion models: An empirical study
Transformer-based pre-trained models have recently achieved great results in solving many
software engineering tasks including automatic code completion which is a staple in a …
CONCORD: clone-aware contrastive learning for source code
Deep Learning (DL) models to analyze source code have shown immense promise during
the past few years. More recently, self-supervised pre-training has gained traction for …
Towards code watermarking with dual-channel transformations
B Yang, W Li, L Xiang, B Li - arXiv preprint arXiv:2309.00860, 2023 - arxiv.org
The expansion of the open source community and the rise of large language models have
raised ethical and security concerns on the distribution of source code, such as misconduct …
SrcMarker: Dual-Channel Source Code Watermarking via Scalable Code Transformations
The expansion of the open source community and the rise of large language models have
raised ethical and security concerns on the distribution of source code, such as misconduct …