Multi-layer transformers gradient can be approximated in almost linear time

Y Liang, Z Sha, Z Shi, Z Song, Y Zhou - arXiv preprint arXiv:2408.13233, 2024 - arxiv.org
The computational complexity of the self-attention mechanism in popular transformer
architectures poses significant challenges for training and inference, and becomes the …

In-context learning may not elicit trustworthy reasoning: A-not-b errors in pretrained language models

P Han, P Song, H Yu, J You - arXiv preprint arXiv:2409.15454, 2024 - arxiv.org
Recent advancements in artificial intelligence have led to the creation of highly capable
large language models (LLMs) that can perform tasks in a human-like manner. However …

Generating action-conditioned prompts for open-vocabulary video action recognition

C Jia, M Luo, X Chang, Z Dang, M Han… - Proceedings of the …, 2024 - dl.acm.org
Exploring open-vocabulary video action recognition is a promising venture, which aims to
recognize previously unseen actions within any arbitrary set of categories. Existing methods …

Circuit Complexity Bounds for RoPE-based Transformer Architecture

B Chen, X Li, Y Liang, J Long, Z Shi, Z Song - arXiv preprint arXiv …, 2024 - arxiv.org
Characterizing the expressive power of the Transformer architecture is critical to understanding
its capacity limits and scaling law. Recent works provide the circuit complexity bounds to …

Towards Friendly AI: A Comprehensive Review and New Perspectives on Human-AI Alignment

Q Sun, Y Li, E Alturki, SMK Murthy… - arXiv preprint arXiv …, 2024 - arxiv.org
As Artificial Intelligence (AI) continues to advance rapidly, Friendly AI (FAI) has been
proposed to advocate for more equitable and fair development of AI. Despite its importance …

MageBench: Bridging Large Multimodal Models to Agents

M Zhang, Q Dai, Y Yang, J Bao, D Chen, K Qiu… - arXiv preprint arXiv …, 2024 - arxiv.org
LMMs have shown impressive visual understanding capabilities, with the potential to be
applied in agents, which demand strong reasoning and planning abilities. Nevertheless …

VIVA: A Benchmark for Vision-Grounded Decision-Making with Human Values

Z Hu, Y Ren, J Li, Y Yin - arXiv preprint arXiv:2407.03000, 2024 - arxiv.org
Large vision language models (VLMs) have demonstrated significant potential for
integration into daily life, making it crucial for them to incorporate human values when …

Beyond the Binary: Capturing Diverse Preferences With Reward Regularization

V Padmakumar, C Jin, HR Kirk, H He - arXiv preprint arXiv:2412.03822, 2024 - arxiv.org
Large language models (LLMs) are increasingly deployed via public-facing interfaces to
interact with millions of users, each with diverse preferences. Despite this, preference tuning …

CityBench: Evaluating the Capabilities of Large Language Model as World Model

J Feng, J Zhang, J Yan, X Zhang, T Ouyang… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) with powerful generalization ability have been widely used in
many domains. A systematic and reliable evaluation of LLMs is a crucial step in their …

The EPOCH of AI: Human-Machine Complementarities at Work

I Loaiza, R Rigobon - Available at SSRN 5028371, 2024 - papers.ssrn.com
In this paper, we study the impact of AI and emerging technologies on the American labor
force by exploring AI's potential for substitution and complementarity with human workers …