Challenges and applications of large language models
Large Language Models (LLMs) went from non-existent to ubiquitous in the machine
learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify …
Survey of vulnerabilities in large language models revealed by adversarial attacks
Large Language Models (LLMs) are swiftly advancing in architecture and capability, and as
they integrate more deeply into complex systems, the urgency to scrutinize their security …
Weak-to-strong generalization: Eliciting strong capabilities with weak supervision
Widely used alignment techniques, such as reinforcement learning from human feedback
(RLHF), rely on the ability of humans to supervise model behavior, for example, to evaluate …
Foundational challenges in assuring alignment and safety of large language models
This work identifies 18 foundational challenges in assuring the alignment and safety of large
language models (LLMs). These challenges are organized into three different categories …
Frontier AI regulation: Managing emerging risks to public safety
Advanced AI models hold the promise of tremendous benefits for humanity, but society
needs to proactively manage the accompanying risks. In this paper, we focus on what we …
Reasoning or reciting? Exploring the capabilities and limitations of language models through counterfactual tasks
The impressive performance of recent language models across a wide range of tasks
suggests that they possess a degree of abstract reasoning skills. Are these skills general …
Do LLMs exhibit human-like response biases? A case study in survey design
One widely cited barrier to the adoption of LLMs as proxies for humans in subjective tasks is
their sensitivity to prompt wording—but interestingly, humans also display sensitivities to …
Machine psychology: Investigating emergent capabilities and behavior in large language models using psychological methods
T Hagendorff - arXiv preprint arXiv:2303.13988, 2023 - cybershafarat.com
Large language models (LLMs) are currently at the forefront of intertwining AI systems with
human communication and everyday life. Due to rapid technological advances and their …
Embers of autoregression: Understanding large language models through the problem they are trained to solve
The widespread adoption of large language models (LLMs) makes it important to recognize
their strengths and limitations. We argue that in order to develop a holistic understanding of …
MoCa: Measuring human-language model alignment on causal and moral judgment tasks
Human commonsense understanding of the physical and social world is organized around
intuitive theories. These theories support making causal and moral judgments. When …