Code authorship attribution: Methods and challenges

V Kalgutkar, R Kaur, H Gonzalez… - ACM Computing …, 2019 - dl.acm.org
Code authorship attribution is the process of identifying the author of a given code. With
increasing numbers of malware and advanced mutation techniques, the authors of malware …

Authorship attribution for neural text generation

A Uchendu, T Le, K Shu, D Lee - Proceedings of the 2020 …, 2020 - aclanthology.org
In recent years, the task of generating realistic short and long texts has made tremendous
advancements. In particular, several recently proposed neural network-based language …

Human factors in security research: Lessons learned from 2008-2018

M Kaur, M van Eeten, M Janssen, K Borgolte… - arXiv preprint arXiv …, 2021 - arxiv.org
Instead of only considering technology, computer security research now strives to also take
into account the human factor by studying regular users and, to a lesser extent, experts like …

Misleading authorship attribution of source code using adversarial learning

E Quiring, A Maier, K Rieck - 28th USENIX Security Symposium …, 2019 - usenix.org
In this paper, we present a novel attack against authorship attribution of source code. We
exploit that recent attribution methods rest on machine learning and thus can be deceived by …

RoPGen: Towards robust code authorship attribution via automatic coding style transformation

Z Li, G Chen, C Chen, Y Zou, S Xu - Proceedings of the 44th International …, 2022 - dl.acm.org
Source code authorship attribution is an important problem often encountered in
applications such as software forensics, bug fixing, and software quality analysis. Recent …

Robustness, security, privacy, explainability, efficiency, and usability of large language models for code

Z Yang, Z Sun, TZ Yue, P Devanbu, D Lo - arXiv preprint arXiv:2403.07506, 2024 - arxiv.org
Large language models for code (LLM4Code), which demonstrate strong performance (e.g.,
high accuracy) in processing source code, have significantly transformed software …

Authorship attribution of source code: A language-agnostic approach and applicability in software engineering

E Bogomolov, V Kovalenko, Y Rebryk… - Proceedings of the 29th …, 2021 - dl.acm.org
Authorship attribution (i.e., determining who is the author of a piece of source code) is an
established research topic. State-of-the-art results for the authorship attribution problem look …

Identifying Authorship in Malicious Binaries: Features, Challenges & Datasets

J Gray, D Sgandurra, L Cavallaro… - ACM Computing …, 2024 - dl.acm.org
Attributing a piece of malware to its creator typically requires threat intelligence. Binary
attribution increases the level of difficulty as it mostly relies upon the ability to disassemble …

De‐anonymizing Ethereum blockchain smart contracts through code attribution

S Linoy, N Stakhanova, S Ray - International journal of network …, 2021 - Wiley Online Library
Blockchain users are identified by addresses (public keys), which cannot be easily linked
back to them without out‐of‐network information. This provides pseudo‐anonymity, which is …

PART: Pre-trained Authorship Representation Transformer

J Huertas-Tato, A Huertas-Garcia, A Martin… - arXiv preprint arXiv …, 2022 - arxiv.org
Authors writing documents imprint identifying information within their texts: vocabulary,
registry, punctuation, misspellings, or even emoji usage. Finding these details is very …