A survey of safety and trustworthiness of large language models through the lens of verification and validation

X Huang, W Ruan, W Huang, G Jin, Y Dong… - Artificial Intelligence …, 2024 - Springer
Large language models (LLMs) have sparked a new wave of AI interest through their ability to
engage end-users in human-level conversations with detailed and articulate answers across …

Certified policy smoothing for cooperative multi-agent reinforcement learning

R Mu, W Ruan, LS Marcolino, G Jin, Q Ni - Proceedings of the AAAI …, 2023 - ojs.aaai.org
Cooperative multi-agent reinforcement learning (c-MARL) is widely applied in safety-critical
scenarios, so analysing the robustness of c-MARL models is profoundly important …

Generalizing universal adversarial perturbations for deep neural networks

Y Zhang, W Ruan, F Wang, X Huang - Machine Learning, 2023 - Springer
Previous studies have shown that universal adversarial attacks can fool deep neural
networks over a large set of input images with a single human-invisible perturbation …

Towards verifying the geometric robustness of large-scale neural networks

F Wang, P Xu, W Ruan, X Huang - … of the AAAI Conference on Artificial …, 2023 - ojs.aaai.org
Deep neural networks (DNNs) are known to be vulnerable to adversarial geometric
transformations. This paper aims to verify the robustness of large-scale DNNs against the …

Reward Certification for Policy Smoothed Reinforcement Learning

R Mu, LS Marcolino, Y Zhang, T Zhang… - Proceedings of the …, 2024 - ojs.aaai.org
Reinforcement Learning (RL) has achieved remarkable success in safety-critical areas, but it
can be weakened by adversarial attacks. Recent studies have introduced "smoothed …

Sora: Scalable black-box reachability analyser on neural networks

P Xu, F Wang, W Ruan, C Zhang… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
The vulnerability of deep neural networks (DNNs) to input perturbations has posed a
significant challenge. Recent work on robustness verification of DNNs not only lacks …

Boosting Adversarial Training via Fisher-Rao Norm-based Regularization

X Yin, W Ruan - Proceedings of the IEEE/CVF Conference …, 2024 - openaccess.thecvf.com
Adversarial training is extensively utilized to improve the adversarial robustness of deep
neural networks. Yet mitigating the degradation of standard generalization performance in …

Bridging formal methods and machine learning with model checking and global optimisation

S Bensalem, X Huang, W Ruan, Q Tang, C Wu… - Journal of Logical and …, 2024 - Elsevier
Formal methods and machine learning are two research fields with drastically different
foundations and philosophies. Formal methods utilise mathematically rigorous techniques …

Representation-Based Robustness in Goal-Conditioned Reinforcement Learning

X Yin, S Wu, J Liu, M Fang, X Zhao, X Huang… - Proceedings of the …, 2024 - ojs.aaai.org
While Goal-Conditioned Reinforcement Learning (GCRL) has gained attention, its
algorithmic robustness against adversarial perturbations remains unexplored. The attacks …

TextVerifier: Robustness Verification for Textual Classifiers with Certifiable Guarantees

S Sun, W Ruan - Findings of the Association for Computational …, 2023 - aclanthology.org
When textual classifiers are deployed in safety-critical workflows, they must withstand
AI-enabled model confusion caused by adversarial examples with minor …