StarCoder: may the source be with you!

R Li, LB Allal, Y Zi, N Muennighoff, D Kocetkov… - arXiv preprint arXiv …, 2023 - arxiv.org
The BigCode community, an open-scientific collaboration working on the responsible
development of Large Language Models for Code (Code LLMs), introduces StarCoder and …

Multi-step jailbreaking privacy attacks on ChatGPT

H Li, D Guo, W Fan, M Xu, J Huang, F Meng… - arXiv preprint arXiv …, 2023 - arxiv.org
With the rapid progress of large language models (LLMs), many downstream NLP tasks can
be well solved given appropriate prompts. Though model developers and researchers work …

Evaluating the social impact of generative AI systems in systems and society

I Solaiman, Z Talat, W Agnew, L Ahmad… - arXiv preprint arXiv …, 2023 - arxiv.org
Generative AI systems across modalities, ranging from text (including code), image, audio,
and video, have broad social impacts, but there is no official standard for means of …

StarCoder 2 and The Stack v2: The next generation

A Lozhkov, R Li, LB Allal, F Cassano… - arXiv preprint arXiv …, 2024 - arxiv.org
The BigCode project, an open-scientific collaboration focused on the responsible
development of Large Language Models for Code (Code LLMs), introduces StarCoder2. In …

A survey of large language models attribution

D Li, Z Sun, X Hu, Z Liu, Z Chen, B Hu, A Wu… - arXiv preprint arXiv …, 2023 - arxiv.org
Open-domain generative systems have gained significant attention in the field of
conversational AI (e.g., generative search engines). This paper presents a comprehensive …

Embers of autoregression: Understanding large language models through the problem they are trained to solve

RT McCoy, S Yao, D Friedman, M Hardy… - arXiv preprint arXiv …, 2023 - arxiv.org
The widespread adoption of large language models (LLMs) makes it important to recognize
their strengths and limitations. We argue that in order to develop a holistic understanding of …

An archival perspective on pretraining data

MA Desai, IV Pasquetto, AZ Jacobs, D Card - Patterns, 2024 - cell.com
Alongside an explosion in research and development related to large language models,
there has been a concomitant rise in the creation of pretraining datasets—massive …

Leak, cheat, repeat: Data contamination and evaluation malpractices in closed-source LLMs

S Balloccu, P Schmidtová, M Lango… - arXiv preprint arXiv …, 2024 - arxiv.org
Natural Language Processing (NLP) research is increasingly focusing on the use of Large
Language Models (LLMs), with some of the most popular ones being either fully or partially …

Investigating data contamination in modern benchmarks for large language models

C Deng, Y Zhao, X Tang, M Gerstein… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent observations have underscored a disparity between the inflated benchmark scores
and the actual performance of LLMs, raising concerns about potential contamination of …

Towards AI Accountability Infrastructure: Gaps and Opportunities in AI Audit Tooling

V Ojewale, R Steed, B Vecchione, A Birhane… - arXiv preprint arXiv …, 2024 - arxiv.org
Audits are critical mechanisms for identifying the risks and limitations of deployed artificial
intelligence (AI) systems. However, the effective execution of AI audits remains incredibly …