Investigating saturation effects in integrated gradients

H Zhao, H Chen, F Yang, N Liu, H Deng, H Cai… - ACM Transactions on …, 2024 - dl.acm.org

Large language models (LLMs) have demonstrated impressive capabilities in natural
language processing. However, their internal mechanisms are still unclear and this lack of …

Cited by 242 Related articles All 5 versions

[PDF] thecvf.com

Guided integrated gradients: An adaptive path method for removing noise

A Kapishnikov, S Venugopalan… - Proceedings of the …, 2021 - openaccess.thecvf.com

Integrated Gradients (IG) is a commonly used feature attribution method for deep neural
networks. While IG has many desirable properties, the method often produces …

Cited by 96 Related articles All 8 versions

[PDF] arxiv.org

Discretized integrated gradients for explaining language models

S Sanyal, X Ren - arXiv preprint arXiv:2108.13654, 2021 - arxiv.org

As a prominent attribution-based explanation algorithm, Integrated Gradients (IG) is widely
adopted due to its desirable explanation axioms and the ease of gradient computation. It …

Cited by 44 Related articles All 4 versions

[PDF] arxiv.org

Gradient based Feature Attribution in Explainable AI: A Technical Review

Y Wang, T Zhang, X Guo, Z Shen - arXiv preprint arXiv:2403.10415, 2024 - arxiv.org

The surge in black-box AI models has prompted the need to explain the internal mechanism
and justify their reliability, especially in high-stakes applications, such as healthcare and …

Cited by 6 Related articles All 3 versions

[PDF] arxiv.org

Focus! Rating XAI methods and finding biases

A Arias-Duart, F Parés, D Garcia-Gasulla… - … on Fuzzy Systems …, 2022 - ieeexplore.ieee.org

AI explainability improves the transparency and trustworthiness of models. However, in the
domain of images, where deep learning has succeeded the most, explainability is still poorly …

Cited by 31 Related articles All 6 versions

[PDF] github.io

[PDF] Counterfactual Interpolation Augmentation (CIA): A Unified Approach to Enhance Fairness and Explainability of DNN.

Y Qiang, C Li, M Brocanelli, D Zhu - IJCAI, 2022 - qiangyao1988.github.io

Bias in the training data can jeopardize fairness and explainability of deep neural network
prediction on test data. We propose a novel bias-tailored data augmentation approach …

Cited by 17 Related articles All 4 versions

[PDF] aaai.org

Shaping noise for robust attributions in neural stochastic differential equations

SK Jha, R Ewetz, A Velasquez… - Proceedings of the …, 2022 - ojs.aaai.org

Neural SDEs with Brownian motion as noise lead to smoother attributions than
traditional ResNets. Various attribution methods such as saliency maps, integrated …

Cited by 11 Related articles All 6 versions

[PDF] acs.org

Pixelated High-Q Metasurfaces for in Situ Biospectroscopy and Artificial Intelligence-Enabled Classification of Lipid Membrane Photoswitching Dynamics

M Barkey, R Büchner, A Wester, SD Pritzl… - ACS …, 2024 - ACS Publications

Nanophotonic devices excel at confining light into intense hot spots of electromagnetic near
fields, creating exceptional opportunities for light–matter coupling and surface-enhanced …

Cited by 2 Related articles All 8 versions

[PDF] arxiv.org

Negative flux aggregation to estimate feature attributions

X Li, D Pan, C Li, Y Qiang, D Zhu - arXiv preprint arXiv:2301.06989, 2023 - arxiv.org

There are increasing demands for understanding deep neural networks' (DNNs) behavior
spurred by growing security and/or transparency concerns. Due to multi-layer nonlinearity of …

Cited by 5 Related articles All 5 versions

[PDF] usenix.org

Xplain: Analyzing Invisible Correlations in Model Explanation

K Kumari, A Pegoraro, H Fereidooni… - 33rd USENIX Security …, 2024 - usenix.org

Explanation methods analyze the features in backdoored input data that contribute to model
misclassification. However, current methods like path techniques struggle to detect backdoor …