Large language models for software engineering: A systematic literature review

X Hou, Y Zhao, Y Liu, Z Yang, K Wang, L Li… - ACM Transactions on …, 2024 - dl.acm.org
Large Language Models (LLMs) have significantly impacted numerous domains, including
Software Engineering (SE). Many recent publications have explored LLMs applied to …

OctoPack: Instruction tuning code large language models

N Muennighoff, Q Liu, A Zebaze, Q Zheng… - arXiv preprint arXiv …, 2023 - arxiv.org
Finetuning large language models (LLMs) on instructions leads to vast performance
improvements on natural language tasks. We apply instruction tuning using code …

A survey of large language models for code: Evolution, benchmarking, and future trends

Z Zheng, K Ning, Y Wang, J Zhang, D Zheng… - arXiv preprint arXiv …, 2023 - arxiv.org
General large language models (LLMs), represented by ChatGPT, have demonstrated
significant potential in tasks such as code generation in software engineering. This has led …

The FormAI dataset: Generative AI in software security through the lens of formal verification

N Tihanyi, T Bisztray, R Jain, MA Ferrag… - Proceedings of the 19th …, 2023 - dl.acm.org
This paper presents the FormAI dataset, a large collection of 112,000 AI-generated
compilable and independent C programs with vulnerability classification. We introduce a …

BioCoder: a benchmark for bioinformatics code generation with contextual pragmatic knowledge

X Tang, B Qian, R Gao, J Chen, X Chen… - arXiv preprint arXiv …, 2023 - arxiv.org
Pre-trained language models like ChatGPT have significantly improved code generation. As
these models scale up, there is an increasing need for the output to handle more intricate …

A survey on large language models for software engineering

Q Zhang, C Fang, Y Xie, Y Zhang, Y Yang… - arXiv preprint arXiv …, 2023 - arxiv.org
Software Engineering (SE) is the systematic design, development, and maintenance of
software applications, underpinning the digital infrastructure of our modern world. Very …

RepoHyper: Better context retrieval is all you need for repository-level code completion

HN Phan, HN Phan, TN Nguyen, NDQ Bui - arXiv preprint arXiv …, 2024 - arxiv.org
Code Large Language Models (CodeLLMs) have demonstrated impressive proficiency in
code completion tasks. However, they often fall short of fully understanding the extensive …

RatGPT: Turning online LLMs into proxies for malware attacks

M Beckerich, L Plein, S Coronado - arXiv preprint arXiv:2308.09183, 2023 - arxiv.org
The evolution of Generative AI and the capabilities of the newly released Large Language
Models (LLMs) open new opportunities in software engineering. However, they also lead to …

LLMs in Web Development: Evaluating LLM-Generated PHP Code Unveiling Vulnerabilities and Limitations

R Tóth, T Bisztray, L Erdődi - … on Computer Safety, Reliability, and Security, 2024 - Springer
This study evaluates the security of web application code generated by Large Language
Models, analyzing 2,500 GPT-4 generated PHP websites. These were deployed in Docker …

BioCoder: a benchmark for bioinformatics code generation with large language models

X Tang, B Qian, R Gao, J Chen, X Chen… - …, 2024 - academic.oup.com
Pretrained large language models (LLMs) have significantly improved code generation. As
these models scale up, there is an increasing need for the output to handle more intricate …