AttentionViz: A global view of transformer attention

C Yeh, Y Chen, A Wu, C Chen, F Viégas… - … on Visualization and …, 2023 - ieeexplore.ieee.org
Transformer models are revolutionizing machine learning, but their inner workings remain
mysterious. In this work, we present a new visualization technique designed to help …

Explainability of Vision Transformers: A Comprehensive Review and New Perspectives

R Kashefi, L Barekatain, M Sabokrou… - arXiv preprint arXiv …, 2023 - arxiv.org
Transformers have had a significant impact on natural language processing and have
recently demonstrated their potential in computer vision. They have shown promising results …

Hierarchical local-global transformer for temporal sentence grounding

X Fang, D Liu, P Zhou, Z Xu, R Li - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
This article studies the multimedia problem of temporal sentence grounding (TSG), which
aims to accurately determine the specific video segment in an untrimmed video according to …

Explanatory models in neuroscience, Part 1: Taking mechanistic abstraction seriously

R Cao, D Yamins - Cognitive Systems Research, 2024 - Elsevier
Despite the recent success of neural network models in mimicking animal performance on
various tasks, critics worry that these models fail to illuminate brain function. We take it that a …

LVLM-Interpret: An Interpretability Tool for Large Vision-Language Models

GBM Stan, RY Rohekar, Y Gurwicz, ML Olson… - arXiv preprint arXiv …, 2024 - arxiv.org
In the rapidly evolving landscape of artificial intelligence, multi-modal large language
models are emerging as a significant area of interest. These models, which combine various …

SIGNet: A siamese graph convolutional network for multi-class urban change detection

Y Zhou, J Wang, J Ding, B Liu, N Weng, H Xiao - Remote Sensing, 2023 - mdpi.com
Detecting changes in urban areas presents many challenges, including complex features,
fast-changing rates, and human-induced interference. At present, most of the research on …

Multi-Dataset Comparison of Vision Transformers and Convolutional Neural Networks for Detecting Glaucomatous Optic Neuropathy from Fundus Photographs

EE Hwang, D Chen, Y Han, L Jia, J Shan - Bioengineering, 2023 - mdpi.com
Glaucomatous optic neuropathy (GON) can be diagnosed and monitored using fundus
photography, a widely available and low-cost approach already adopted for automated …

Recasting Generic Pretrained Vision Transformers As Object-Centric Scene Encoders For Manipulation Policies

J Qian, A Panagopoulos, D Jayaraman - arXiv preprint arXiv:2405.15916, 2024 - arxiv.org
Generic re-usable pre-trained image representation encoders have become a standard
component of methods for many computer vision tasks. As visual representations for robots …

Universal Deoxidation of Semiconductor Substrates Assisted by Machine Learning and Real-Time Feedback Control

C Shen, W Zhan, J Tang, Z Wu, B Xu… - … Applied Materials & …, 2024 - ACS Publications
Substrate oxidation is inevitable when exposed to ambient atmosphere during
semiconductor manufacturing, which is detrimental to the fabrication of state-of-the-art …

GRACE: Unveiling Gene Regulatory Networks With Causal Mechanistic Graph Neural Networks in Single-Cell RNA-Sequencing Data

JC Wang, YJ Chen, Q Zou - IEEE Transactions on Neural …, 2024 - ieeexplore.ieee.org
Reconstructing gene regulatory networks (GRNs) using single-cell RNA sequencing (scRNA-seq)
data holds great promise for unraveling cellular fate development and heterogeneity …