A systematic survey and critical review on evaluating large language models: Challenges, limitations, and recommendations
Large Language Models (LLMs) have recently gained significant attention due to
their remarkable capabilities in performing diverse tasks across various domains. However …
A3-CodGen: A Repository-Level Code Generation Framework for Code Reuse with Local-Aware, Global-Aware, and Third-Party-Library-Aware
LLM-based code generation tools are essential to help developers in the software
development process. Existing tools often disconnect with the working context, i.e., the code …
A3-CodGen: A Repository-Level Code Generation Framework for Code Reuse with Local-Aware, Global-Aware, and Third-Party-Library-Aware
LLM-based code generation tools are essential to help developers in the software
development process. Existing tools often disconnect with the working context, i.e., the code …
Morescient GAI for software engineering
M Kessel, C Atkinson - ACM Transactions on Software Engineering and …, 2024 - dl.acm.org
The ability of Generative AI (GAI) technology to automatically check, synthesize and modify
software engineering artifacts promises to revolutionize all aspects of software engineering …
CodeMMLU: A Multi-Task Benchmark for Assessing Code Understanding Capabilities of CodeLLMs
Recent advancements in Code Large Language Models (CodeLLMs) have predominantly
focused on open-ended code generation tasks, often neglecting the critical aspect of code …
Nexus: Specialization meets Adaptability for Efficiently Training Mixture of Experts
Efficiency, specialization, and adaptability to new data distributions are qualities that are
hard to combine in current Large Language Models. The Mixture of Experts (MoE) …
Outcome-Refining Process Supervision for Code Generation
Large Language Models have demonstrated remarkable capabilities in code generation, yet
they often struggle with complex programming tasks that require deep algorithmic …
RepairBench: Leaderboard of Frontier Models for Program Repair
A Silva, M Monperrus - arXiv preprint arXiv:2409.18952, 2024 - arxiv.org
AI-driven program repair uses AI models to repair buggy software by producing patches.
Rapid advancements in AI surely impact state-of-the-art performance of program repair. Yet …
CODECLEANER: Elevating Standards with A Robust Data Contamination Mitigation Toolkit
Data contamination presents a critical barrier preventing widespread industrial adoption of
advanced software engineering techniques that leverage code language models (CLMs) …
If You Can't Use Them, Recycle Them: Optimizing Merging at Scale Mitigates Performance Tradeoffs
Model merging has shown great promise at combining expert models, but the benefit of
merging is unclear when merging "generalist" models trained on many tasks. We explore …