Mapping language to code in programmatic context

[HTML][HTML] Natural language generation and understanding of big code for AI-assisted programming: A review

MF Wong, S Guo, CN Hang, SW Ho, CW Tan - Entropy, 2023 - mdpi.com

This paper provides a comprehensive review of the literature concerning the utilization of
Natural Language Processing (NLP) techniques, with a particular focus on transformer …

被引用次数：55 相关文章所有 9 个版本

[PDF] neurips.cc

Is your code generated by chatgpt really correct? rigorous evaluation of large language models for code generation

J Liu, CS Xia, Y Wang, L Zhang - Advances in Neural …, 2024 - proceedings.neurips.cc

Program synthesis has been long studied with recent approaches focused on directly using
the power of Large Language Models (LLMs) to generate code. Programming benchmarks …

被引用次数：435 相关文章所有 6 个版本

[PDF] arxiv.org

Codegen: An open large language model for code with multi-turn program synthesis

E Nijkamp, B Pang, H Hayashi, L Tu, H Wang… - arXiv preprint arXiv …, 2022 - arxiv.org

Program synthesis strives to generate a computer program as a solution to a given problem
specification, expressed with input-output examples or natural language descriptions. The …

被引用次数：790 相关文章所有 3 个版本

[PDF] arxiv.org

Unixcoder: Unified cross-modal pre-training for code representation

D Guo, S Lu, N Duan, Y Wang, M Zhou… - arXiv preprint arXiv …, 2022 - arxiv.org

Pre-trained models for programming languages have recently demonstrated great success
on code intelligence. To support both code-related understanding and generation tasks …

被引用次数：475 相关文章所有 4 个版本

[PDF] arxiv.org

Codet5: Identifier-aware unified pre-trained encoder-decoder models for code understanding and generation

Y Wang, W Wang, S Joty, SCH Hoi - arXiv preprint arXiv:2109.00859, 2021 - arxiv.org

Pre-trained models for Natural Languages (NL) like BERT and GPT have been recently
shown to transfer well to Programming Languages (PL) and largely benefit a broad set of …

被引用次数：1271 相关文章所有 7 个版本

[PDF] arxiv.org

Program synthesis with large language models

J Austin, A Odena, M Nye, M Bosma… - arXiv preprint arXiv …, 2021 - arxiv.org

This paper explores the limits of the current generation of large language models for
program synthesis in general purpose programming languages. We evaluate a collection of …

被引用次数：1008 相关文章所有 3 个版本

[PDF] arxiv.org

SantaCoder: don't reach for the stars!

LB Allal, R Li, D Kocetkov, C Mou, C Akiki… - arXiv preprint arXiv …, 2023 - arxiv.org

The BigCode project is an open-scientific collaboration working on the responsible
development of large language models for code. This tech report describes the progress of …

被引用次数：175 相关文章所有 8 个版本

[PDF] arxiv.org

Unified pre-training for program understanding and generation

WU Ahmad, S Chakraborty, B Ray… - arXiv preprint arXiv …, 2021 - arxiv.org

Code summarization and generation empower conversion between programming language
(PL) and natural language (NL), while code translation avails the migration of legacy code …

被引用次数：708 相关文章所有 7 个版本

[PDF] arxiv.org

Codexglue: A machine learning benchmark dataset for code understanding and generation

S Lu, D Guo, S Ren, J Huang, A Svyatkovskiy… - arXiv preprint arXiv …, 2021 - arxiv.org

Benchmark datasets have a significant impact on accelerating research in programming
language tasks. In this paper, we introduce CodeXGLUE, a benchmark dataset to foster …

被引用次数：730 相关文章所有 5 个版本

[PDF] arxiv.org

Measuring coding challenge competence with apps

D Hendrycks, S Basart, S Kadavath, M Mazeika… - arXiv preprint arXiv …, 2021 - arxiv.org

While programming is one of the most broadly applicable skills in modern society, modern
machine learning models still cannot code solutions to basic problems. Despite its …

被引用次数：422 相关文章所有 8 个版本