[HTML] A survey of GPT-3 family large language models including ChatGPT and GPT-4

KS Kalyan - Natural Language Processing Journal, 2023 - Elsevier
Large language models (LLMs) are a special class of pretrained language models (PLMs)
obtained by scaling model size, pretraining corpus and computation. LLMs, because of their …

Survey of vulnerabilities in large language models revealed by adversarial attacks

E Shayegani, MAA Mamun, Y Fu, P Zaree… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) are swiftly advancing in architecture and capability, and as
they integrate more deeply into complex systems, the urgency to scrutinize their security …

GPT-4 technical report

J Achiam, S Adler, S Agarwal, L Ahmad… - arXiv preprint arXiv …, 2023 - arxiv.org
We report the development of GPT-4, a large-scale, multimodal model which can accept
image and text inputs and produce text outputs. While less capable than humans in many …

Multi-step jailbreaking privacy attacks on ChatGPT

H Li, D Guo, W Fan, M Xu, J Huang, F Meng… - arXiv preprint arXiv …, 2023 - arxiv.org
With the rapid progress of large language models (LLMs), many downstream NLP tasks can
be well solved given appropriate prompts. Though model developers and researchers work …

"Do anything now": Characterizing and evaluating in-the-wild jailbreak prompts on large language models

X Shen, Z Chen, M Backes, Y Shen… - arXiv preprint arXiv …, 2023 - arxiv.org
The misuse of large language models (LLMs) has garnered significant attention from the
general public and LLM vendors. In response, efforts have been made to align LLMs with …

Exploiting programmatic behavior of LLMs: Dual-use through standard security attacks

D Kang, X Li, I Stoica, C Guestrin… - 2024 IEEE Security …, 2024 - ieeexplore.ieee.org
Recent advances in instruction-following large language models (LLMs) have led to
dramatic improvements in a range of NLP tasks. Unfortunately, we find that the same …

Ignore previous prompt: Attack techniques for language models

F Perez, I Ribeiro - arXiv preprint arXiv:2211.09527, 2022 - arxiv.org
Transformer-based large language models (LLMs) provide a powerful foundation for natural
language tasks in large-scale customer-facing applications. However, studies that explore …

GPTFuzzer: Red teaming large language models with auto-generated jailbreak prompts

J Yu, X Lin, X Xing - arXiv preprint arXiv:2309.10253, 2023 - arxiv.org
Large language models (LLMs) have recently experienced tremendous popularity and are
widely used from casual conversations to AI-driven programming. However, despite their …

Multilingual jailbreak challenges in large language models

Y Deng, W Zhang, SJ Pan, L Bing - arXiv preprint arXiv:2310.06474, 2023 - arxiv.org
While large language models (LLMs) exhibit remarkable capabilities across a wide range of
tasks, they pose potential safety concerns, such as the "jailbreak" problem, wherein …

Llama Guard: LLM-based input-output safeguard for human-AI conversations

H Inan, K Upasani, J Chi, R Rungta, K Iyer… - arXiv preprint arXiv …, 2023 - arxiv.org
We introduce Llama Guard, an LLM-based input-output safeguard model geared towards
Human-AI conversation use cases. Our model incorporates a safety risk taxonomy, a …