Learning Transfers over Several Programming Languages

R Baltaji, S Pujar, L Mandel, M Hirzel, L Buratti, et al. - arXiv preprint arXiv:2310.16937, 2023 - arxiv.org
Large language models (LLMs) have recently become remarkably good at improving developer productivity for high-resource programming languages. These models use two kinds of data: large amounts of unlabeled code samples for pretraining and relatively smaller amounts of labeled code samples for fine-tuning or in-context learning. Unfortunately, many programming languages are low-resource, lacking labeled samples for most tasks and often even lacking unlabeled samples. Therefore, users of low-resource languages (e.g., legacy or new languages) miss out on the benefits of LLMs. Cross-lingual transfer learning uses data from a source language to improve model performance on a target language. It has been well-studied for natural languages, but has received little attention for programming languages. This paper reports extensive experiments on four tasks using a transformer-based LLM and 11 to 41 programming languages to explore the following questions. First, how well does cross-lingual transfer work for a given task across different language pairs? Second, given a task and a target language, how should one choose a source language? Third, which characteristics of a language pair are predictive of transfer performance? And fourth, how does that depend on the given task?
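The abstract describes the basic cross-lingual transfer recipe: fine-tune on labeled samples from a high-resource source language, then evaluate on a low-resource target language. Below is a minimal sketch of that setup, assuming a HuggingFace-style encoder (microsoft/codebert-base), a toy defect-detection task, and made-up Python-to-R data; none of these specific choices are taken from the paper.

```python
# Hypothetical sketch of cross-lingual transfer for a code classification task:
# fine-tune on labeled source-language samples, then evaluate zero-shot on the
# target language. Model, task, and data here are illustrative assumptions.
import torch
from torch.optim import AdamW
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "microsoft/codebert-base"  # assumed encoder, not necessarily the paper's model
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2)

# Toy labeled data: source language = Python (high-resource),
# target language = R (treated here as low-resource).
source_snippets = ["def add(a, b): return a + b", "def bad(a): return a +"]
source_labels = torch.tensor([0, 1])  # 0 = well-formed, 1 = defective (toy labels)
target_snippets = ["add <- function(a, b) a + b", "bad <- function(a) a +"]
target_labels = torch.tensor([0, 1])

optimizer = AdamW(model.parameters(), lr=2e-5)

# Fine-tune on the source language only.
model.train()
for _ in range(3):  # a few passes over the toy batch
    batch = tokenizer(source_snippets, padding=True, truncation=True, return_tensors="pt")
    out = model(**batch, labels=source_labels)
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Zero-shot evaluation on the target language measures how well learning transfers.
model.eval()
with torch.no_grad():
    batch = tokenizer(target_snippets, padding=True, truncation=True, return_tensors="pt")
    preds = model(**batch).logits.argmax(dim=-1)
accuracy = (preds == target_labels).float().mean().item()
print(f"target-language accuracy: {accuracy:.2f}")
```

Varying the source language in a loop like this, while holding the task and target language fixed, is essentially what the paper's experiments do at scale across 11 to 41 languages and four tasks.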