Trustworthy AI: From principles to practices

B Li, P Qi, B Liu, S Di, J Liu, J Pei, J Yi… - ACM Computing Surveys, 2023 - dl.acm.org
The rapid development of Artificial Intelligence (AI) technology has enabled the deployment
of various systems based on it. However, many current AI systems are found vulnerable to …

Opportunities and challenges in deep learning adversarial robustness: A survey

SH Silva, P Najafirad - arXiv preprint arXiv:2007.00753, 2020 - arxiv.org
As we seek to deploy machine learning models beyond virtual and controlled domains, it is
critical to analyze not only their accuracy or the fact that they work most of the time, but whether such a …

RobustBench: A standardized adversarial robustness benchmark

F Croce, M Andriushchenko, V Sehwag… - arXiv preprint arXiv …, 2020 - arxiv.org
As a research community, we are still lacking a systematic understanding of the progress on
adversarial robustness, which often makes it hard to identify the most promising ideas in …

TrustLLM: Trustworthiness in large language models

L Sun, Y Huang, H Wang, S Wu, Q Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs), exemplified by ChatGPT, have gained considerable
attention for their excellent natural language processing capabilities. Nonetheless, these …

Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks

F Croce, M Hein - International conference on machine …, 2020 - proceedings.mlr.press
The field of defense strategies against adversarial attacks has grown significantly in recent
years, but progress is hampered as the evaluation of adversarial defenses is often …
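Reliable robustness evaluation rests on strong gradient-based attacks. As a minimal illustration of the attack family this literature evaluates (not the paper's parameter-free ensemble), here is a one-step FGSM attack on a logistic-regression model; the model weights and toy point are hypothetical:

```python
import numpy as np

def fgsm_attack(x, y, w, b, eps):
    """One-step FGSM on a logistic-regression model.

    Moves x by eps in the sign of the loss gradient, the
    L-infinity-bounded perturbation studied in this literature.
    """
    z = x @ w + b                      # logit
    p = 1.0 / (1.0 + np.exp(-z))      # sigmoid probability
    grad_x = (p - y) * w              # d(cross-entropy)/dx
    return x + eps * np.sign(grad_x)

# Toy example: a correctly classified point flips under attack.
w = np.array([2.0, -1.0])
b = 0.0
x = np.array([0.1, 0.1])              # logit 0.1 > 0 -> class 1
x_adv = fgsm_attack(x, y=1.0, w=w, b=b, eps=0.2)
```

Stronger evaluations iterate this step (PGD) and combine several attacks, which is the gap the ensemble approach above addresses.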

Overfitting in adversarially robust deep learning

L Rice, E Wong, Z Kolter - International conference on …, 2020 - proceedings.mlr.press
It is common practice in deep learning to use overparameterized networks and train for as
long as possible; there are numerous studies that show, both theoretically and empirically …

Uncovering the limits of adversarial training against norm-bounded adversarial examples

S Gowal, C Qin, J Uesato, T Mann, P Kohli - arXiv preprint arXiv …, 2020 - arxiv.org
Adversarial training and its variants have become de facto standards for learning robust
deep neural networks. In this paper, we explore the landscape around adversarial training in …

Beta-CROWN: Efficient bound propagation with per-neuron split constraints for neural network robustness verification

S Wang, H Zhang, K Xu, X Lin, S Jana… - Advances in …, 2021 - proceedings.neurips.cc
Bound propagation based incomplete neural network verifiers such as CROWN are very
efficient and can significantly accelerate branch-and-bound (BaB) based complete …
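To make "bound propagation" concrete, here is a sketch of interval bound propagation (IBP), a simpler relative of CROWN-style verifiers, not the per-neuron-split method above; the toy network weights are hypothetical:

```python
import numpy as np

def interval_bound_propagation(layers, x, eps):
    """Propagate the L-inf ball [x-eps, x+eps] through a ReLU net.

    `layers` is a list of (W, b) pairs; ReLU is applied between
    layers but not after the last. Returns elementwise output
    bounds (lower, upper): a sound but incomplete certificate.
    """
    lo, hi = x - eps, x + eps
    for i, (W, b) in enumerate(layers):
        center = (lo + hi) / 2.0
        radius = (hi - lo) / 2.0
        mid = W @ center + b
        rad = np.abs(W) @ radius       # worst case per output unit
        lo, hi = mid - rad, mid + rad
        if i < len(layers) - 1:        # ReLU on hidden layers only
            lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)
    return lo, hi

# Toy 2-2-1 network: a positive output lower bound certifies the
# prediction for every perturbation of L-inf size eps.
layers = [(np.array([[1.0, -1.0], [0.5, 0.5]]), np.array([0.0, 0.0])),
          (np.array([[1.0, 1.0]]), np.array([0.5]))]
lo, hi = interval_bound_propagation(layers, x=np.array([1.0, 0.0]), eps=0.1)
```

When such incomplete bounds are too loose to decide, complete verifiers branch on neuron states (the branch-and-bound mentioned above) and re-bound each subproblem.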

Rethinking Lipschitz neural networks and certified robustness: A Boolean function perspective

B Zhang, D Jiang, D He… - Advances in neural …, 2022 - proceedings.neurips.cc
Designing neural networks with bounded Lipschitz constant is a promising way to obtain
certifiably robust classifiers against adversarial examples. However, the relevant progress …
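The certification logic behind Lipschitz-bounded networks is simple: if a scalar logit f is L-Lipschitz and f(x) = m > 0, the prediction cannot flip within L2 distance m / L. A minimal sketch, using the standard product-of-spectral-norms upper bound (not the paper's construction), with a hypothetical one-layer network:

```python
import numpy as np

def lipschitz_upper_bound(weights):
    # The product of layer spectral norms upper-bounds the L2
    # Lipschitz constant of a ReLU network (ReLU is 1-Lipschitz).
    return float(np.prod([np.linalg.norm(W, 2) for W in weights]))

def certified_radius(f_x, L):
    # If the scalar logit f(x) = f_x > 0 and f is L-Lipschitz,
    # no input within L2 distance f_x / L can flip the sign.
    return f_x / L

# Toy: one linear layer with singular values 3 and 4, so L = 4;
# a margin of 2.0 then certifies an L2 radius of 0.5.
W = np.array([[3.0, 0.0], [0.0, 4.0]])
L = lipschitz_upper_bound([W])
r = certified_radius(2.0, L)
```

This bound is loose in practice, which is why the work above designs architectures whose Lipschitz constant is tight by construction.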

Attacks which do not kill training make adversarial learning stronger

J Zhang, X Xu, B Han, G Niu, L Cui… - International …, 2020 - proceedings.mlr.press
Adversarial training based on the minimax formulation is necessary for obtaining adversarial
robustness of trained models. However, it is conservative or even pessimistic so that it …
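The minimax formulation referenced above is min over model parameters of the expected max over bounded perturbations of the loss. A hedged sketch of the standard loop on logistic regression, with one-step FGSM as the inner maximization (this is plain adversarial training, not the paper's less conservative variant); data and hyperparameters are illustrative:

```python
import numpy as np

def adversarial_train(X, y, eps=0.1, lr=0.5, steps=200):
    """Minimax adversarial training for logistic regression.

    Inner max: one-step FGSM perturbation of each input.
    Outer min: gradient descent on the loss at perturbed points.
    """
    rng = np.random.default_rng(0)
    w = rng.normal(size=X.shape[1]) * 0.01
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        # Inner maximization: worst-case L-inf step (FGSM).
        X_adv = X + eps * np.sign((p - y)[:, None] * w)
        p_adv = 1.0 / (1.0 + np.exp(-(X_adv @ w + b)))
        # Outer minimization: descend the adversarial loss.
        g = p_adv - y
        w -= lr * (X_adv.T @ g) / len(y)
        b -= lr * g.mean()
    return w, b

# Linearly separable toy data with margin larger than eps.
X = np.array([[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0], [-0.9, -0.1]])
y = np.array([1.0, 1.0, 0.0, 0.0])
w, b = adversarial_train(X, y, eps=0.1)
```

Because every training point is replaced by its worst case, this loop is exactly the conservative behavior the work above sets out to relax.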