Knowledge mechanisms in large language models: A survey and perspective
Understanding knowledge mechanisms in Large Language Models (LLMs) is crucial for
advancing towards trustworthy AGI. This paper reviews knowledge mechanism analysis …
A Survey on Uncertainty Quantification of Large Language Models: Taxonomy, Open Research Challenges, and Future Directions
The remarkable performance of large language models (LLMs) in content generation,
coding, and common-sense reasoning has spurred widespread integration into many facets …
Shared Imagination: LLMs Hallucinate Alike
Despite the recent proliferation of large language models (LLMs), their training recipes--
model architecture, pre-training data and optimization algorithm--are often very similar. This …
Unpacking SDXL Turbo: Interpreting text-to-image models with sparse autoencoders
Sparse autoencoders (SAEs) have become a core ingredient in the reverse engineering of
large-language models (LLMs). For LLMs, they have been shown to decompose …
System 2 reasoning capabilities are nigh
SC Lowe - arXiv preprint arXiv:2410.03662, 2024 - arxiv.org
In recent years, machine learning models have made strides towards human-like reasoning
capabilities from several directions. In this work, we review the current state of the literature …
Augmenting the Interpretability of GraphCodeBERT for Code Similarity Tasks
J Martinez-Gil - arXiv preprint arXiv:2410.05275, 2024 - arxiv.org
Assessing the degree of similarity of code fragments is crucial for ensuring software quality,
but it remains challenging due to the need to capture the deeper semantic aspects of code …
On the role of knowledge graphs in AI-based scientific discovery
M D'aquin - Journal of Web Semantics, 2024 - Elsevier
Research and the scientific activity are widely seen as an area where the current trends in
AI, namely the development of deep learning models (including large language models), are …
ElastiFormer: Learned Redundancy Reduction in Transformer via Self-Distillation
We introduce ElastiFormer, a post-training technique that adapts pretrained Transformer
models into an elastic counterpart with variable inference time compute. ElastiFormer …
Interpretable Language Modeling via Induction-head Ngram Models
Recent large language models (LLMs) have excelled across a wide range of tasks, but their
use in high-stakes and compute-limited settings has intensified the demand for …
CogSteer: Cognition-Inspired Selective Layer Intervention for Efficient Semantic Steering in Large Language Models
Despite their impressive capabilities, large language models (LLMs) often lack
interpretability and can generate toxic content. While using LLMs as foundation models and …