A comprehensive survey of AI-generated content (AIGC): A history of generative AI from GAN to ChatGPT

Y Cao, S Li, Y Liu, Z Yan, Y Dai, PS Yu… - arXiv preprint arXiv …, 2023 - arxiv.org
Recently, ChatGPT, along with DALL-E-2 and Codex, has been gaining significant attention
from society. As a result, many individuals have become interested in related resources and …

AI deception: A survey of examples, risks, and potential solutions

PS Park, S Goldstein, A O'Gara, M Chen, D Hendrycks - Patterns, 2024 - cell.com
This paper argues that a range of current AI systems have learned how to deceive humans.
We define deception as the systematic inducement of false beliefs in the pursuit of some …

GPT-4 technical report

J Achiam, S Adler, S Agarwal, L Ahmad… - arXiv preprint arXiv …, 2023 - arxiv.org
We report the development of GPT-4, a large-scale, multimodal model which can accept
image and text inputs and produce text outputs. While less capable than humans in many …

Taxonomy of risks posed by language models

L Weidinger, J Uesato, M Rauh, C Griffin… - Proceedings of the …, 2022 - dl.acm.org
Responsible innovation on large-scale Language Models (LMs) requires foresight into and
in-depth understanding of the risks these models may pose. This paper develops a …

Generative language models and automated influence operations: Emerging threats and potential mitigations

JA Goldstein, G Sastry, M Musser, R DiResta… - arXiv preprint arXiv …, 2023 - arxiv.org
Generative language models have improved drastically, and can now produce realistic text
outputs that are difficult to distinguish from human-written content. For malicious actors …

WebGPT: Browser-assisted question-answering with human feedback

R Nakano, J Hilton, S Balaji, J Wu, L Ouyang… - arXiv preprint arXiv …, 2021 - arxiv.org
We fine-tune GPT-3 to answer long-form questions using a text-based web-browsing
environment, which allows the model to search and navigate the web. By setting up the task …

TruthfulQA: Measuring how models mimic human falsehoods

S Lin, J Hilton, O Evans - arXiv preprint arXiv:2109.07958, 2021 - arxiv.org
We propose a benchmark to measure whether a language model is truthful in generating
answers to questions. The benchmark comprises 817 questions that span 38 categories …
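For orientation, here is a minimal sketch of loading this benchmark with the Hugging Face datasets library; the dataset id "truthful_qa", the "generation" config, and the field names are assumptions based on the public release, not details from the abstract.

```python
# Minimal sketch of inspecting TruthfulQA, assuming its release on the
# Hugging Face Hub under the "truthful_qa" dataset id.
from datasets import load_dataset

# The "generation" config pairs each question with reference answers;
# a "multiple_choice" config also exists.
ds = load_dataset("truthful_qa", "generation")["validation"]

print(len(ds))                       # 817 questions, per the abstract
example = ds[0]
print(example["category"])           # one of the 38 categories
print(example["question"])
print(example["best_answer"])        # reference truthful answer
print(example["incorrect_answers"])  # common human falsehoods
```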

CRITIC: Large language models can self-correct with tool-interactive critiquing

Z Gou, Z Shao, Y Gong, Y Shen, Y Yang… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent developments in large language models (LLMs) have been impressive. However,
these models sometimes show inconsistencies and problematic behavior, such as …
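The title points to correcting such behavior through tool-interactive critiquing; below is a hedged sketch of what such a verify-then-revise loop could look like. Here `llm` and `web_search` are hypothetical stand-ins, not the paper's actual interfaces.

```python
# Illustrative sketch in the spirit of CRITIC: the model drafts an answer,
# an external tool checks it, and the critique is fed back for revision.

def llm(prompt: str) -> str:
    """Placeholder for any text-completion model call."""
    raise NotImplementedError

def web_search(query: str) -> str:
    """Placeholder for a verification tool (search, calculator, ...)."""
    raise NotImplementedError

def critic_loop(question: str, max_rounds: int = 3) -> str:
    answer = llm(f"Question: {question}\nAnswer:")
    for _ in range(max_rounds):
        evidence = web_search(f"{question} {answer}")
        critique = llm(
            f"Question: {question}\nProposed answer: {answer}\n"
            f"Evidence: {evidence}\n"
            "Point out any errors, or reply 'OK' if the answer is supported."
        )
        if critique.strip() == "OK":
            break  # tool-grounded critique found no problems
        answer = llm(
            f"Question: {question}\nPrevious answer: {answer}\n"
            f"Critique: {critique}\nWrite a corrected answer:"
        )
    return answer
```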

Weak-to-strong generalization: Eliciting strong capabilities with weak supervision

C Burns, P Izmailov, JH Kirchner, B Baker… - arXiv preprint arXiv …, 2023 - arxiv.org
Widely used alignment techniques, such as reinforcement learning from human feedback
(RLHF), rely on the ability of humans to supervise model behavior, for example, to evaluate …
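A schematic sketch of the weak-to-strong setup the paper studies: a weak supervisor trained on ground truth labels a held-out set, a strong student learns from those imperfect labels, and the recovered fraction of the capability gap is measured. The `train` and `evaluate` helpers are hypothetical placeholders; the "performance gap recovered" metric is from the paper.

```python
# Schematic weak-to-strong experiment; not the paper's code.

def train(model, inputs, labels):
    raise NotImplementedError  # placeholder for any fine-tuning routine

def evaluate(model, test_set) -> float:
    raise NotImplementedError  # placeholder returning accuracy

def weak_to_strong(weak, strong, strong_ceiling, data, heldout, test_set):
    inputs, gold = data
    train(weak, inputs, gold)                 # weak supervisor on ground truth
    weak_labels = [weak(x) for x in heldout]  # imperfect, weak labels
    train(strong, heldout, weak_labels)       # strong student on weak labels

    acc_weak = evaluate(weak, test_set)
    acc_student = evaluate(strong, test_set)
    acc_ceiling = evaluate(strong_ceiling, test_set)  # strong model, gold labels

    # Performance gap recovered: 1.0 means the student matches the ceiling.
    return (acc_student - acc_weak) / (acc_ceiling - acc_weak)
```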

Discovering latent knowledge in language models without supervision

C Burns, H Ye, D Klein, J Steinhardt - arXiv preprint arXiv:2212.03827, 2022 - arxiv.org
Existing techniques for training language models can be misaligned with the truth: if we train
models with imitation learning, they may reproduce errors that humans make; if we train …