SymLM: Predicting function names in stripped binaries via context-sensitive execution-aware code embeddings

X Jin, K Pei, JY Won, Z Lin - Proceedings of the 2022 ACM SIGSAC …, 2022 - dl.acm.org
Predicting function names in stripped binaries is an extremely useful but challenging task, as
it requires summarizing the execution behavior and semantics of the function in human …

Binary code summarization: Benchmarking ChatGPT/GPT-4 and other large language models

X Jin, J Larson, W Yang, Z Lin - arXiv preprint arXiv:2312.09601, 2023 - arxiv.org
Binary code summarization, while invaluable for understanding code semantics, is
challenging due to its labor-intensive nature. This study delves into the potential of large …

" Get in Researchers; We're Measuring Reproducibility": A Reproducibility Study of Machine Learning Papers in Tier 1 Security Conferences

D Olszewski, A Lu, C Stillman, K Warren… - Proceedings of the …, 2023 - dl.acm.org
Reproducibility is crucial to the advancement of science; it strengthens confidence in
seemingly contradictory results and expands the boundaries of known discoveries …

A survey on large language models for software engineering

Q Zhang, C Fang, Y Xie, Y Zhang, Y Yang… - arXiv preprint arXiv …, 2023 - arxiv.org
Software Engineering (SE) is the systematic design, development, and maintenance of
software applications, underpinning the digital infrastructure of our modern world. Very …

"len or index or count, anything but v1": Predicting variable names in decompilation output with transfer learning

KK Pal, AP Bajaj, P Banerjee, A Dutcher… - 2024 IEEE Symposium …, 2024 - yancomm.net
Binary reverse engineering is an arduous and tedious task performed by skilled and
expensive human analysts. Information about the source code is irrevocably lost in the …

Refining decompiled C code with large language models

WK Wong, H Wang, Z Li, Z Liu, S Wang, Q Tang… - arXiv preprint arXiv …, 2023 - arxiv.org
A C decompiler converts an executable into source code. The recovered C source code,
once re-compiled, is expected to produce an executable with the same functionality as the …

Nova: Generative Language Models for Binaries

N Jiang, C Wang, K Liu, X Xu, L Tan… - arXiv preprint arXiv …, 2023 - arxiv.org
Generative large language models (LLMs) pre-trained on code have shown impressive
effectiveness in code generation, program repair, and document analysis. However, existing …

Large language models for code analysis: Do LLMs really do their job?

C Fang, N Miao, S Srivastav, J Liu, R Zhang… - arXiv preprint arXiv …, 2023 - usenix.org
Large language models (LLMs) have demonstrated significant potential in the realm of
natural language understanding and programming code processing tasks. Their capacity to …

Decomperson: How humans decompile and what we can learn from it

K Burk, F Pagani, C Kruegel, G Vigna - 31st USENIX Security …, 2022 - usenix.org
Human analysts must reverse engineer binary programs as a prerequisite for a number of
security tasks, such as vulnerability analysis, malware detection, and firmware re-hosting …

A transformer-based function symbol name inference model from an assembly language for binary reversing

H Kim, J Bak, K Cho, H Koo - Proceedings of the 2023 ACM Asia …, 2023 - dl.acm.org
Reverse engineering of a stripped binary has a wide range of applications, yet it is
challenging mainly due to the lack of contextually useful information within. Once debugging …