Decoding ChatGPT: a taxonomy of existing research, current challenges, and possible future directions
Chat Generative Pre-trained Transformer (ChatGPT) has gained significant interest
and attention since its launch in November 2022. It has shown impressive performance in …
Explainable artificial intelligence for autonomous driving: A comprehensive overview and field guide for future research directions
Autonomous driving has achieved significant milestones in research and development over
the last two decades. There is increasing interest in the field as the deployment of …
Self-rewarding language models
We posit that to achieve superhuman agents, future models require superhuman feedback
in order to provide an adequate training signal. Current approaches commonly train reward …
InternVL: Scaling up vision foundation models and aligning for generic visual-linguistic tasks
The exponential growth of large language models (LLMs) has opened up numerous
possibilities for multi-modal AGI systems. However, the progress in vision and vision …
Detecting and preventing hallucinations in large vision language models
Instruction tuned Large Vision Language Models (LVLMs) have significantly advanced in
generalizing across a diverse set of multi-modal tasks, especially for Visual Question …
MM1: Methods, analysis & insights from multimodal LLM pre-training
In this work, we discuss building performant Multimodal Large Language Models (MLLMs).
In particular, we study the importance of various architecture components and data choices …
Clever Hans or neural theory of mind? Stress testing social reasoning in large language models
The escalating debate on AI's capabilities warrants developing reliable metrics to assess
machine "intelligence". Recently, many anecdotal examples were used to suggest that …
How far are we to GPT-4V? Closing the gap to commercial multimodal models with open-source suites
In this report, we introduce InternVL 1.5, an open-source multimodal large language model
(MLLM) to bridge the capability gap between open-source and proprietary commercial …
Beyond deep reinforcement learning: A tutorial on generative diffusion models in network optimization
Generative Diffusion Models (GDMs) have emerged as a transformative force in the realm of
Generative Artificial Intelligence (GAI), demonstrating their versatility and efficacy across a …
DiLu: A knowledge-driven approach to autonomous driving with large language models
Recent advancements in autonomous driving have relied on data-driven approaches, which
are widely adopted but face challenges including dataset bias, overfitting, and …