A survey of safety and trustworthiness of large language models through the lens of verification and validation
Large language models (LLMs) have ignited a new wave of AI interest for their ability to
engage end-users in human-level conversations with detailed and articulate answers across …
Certified policy smoothing for cooperative multi-agent reinforcement learning
Cooperative multi-agent reinforcement learning (c-MARL) is widely applied in safety-critical
scenarios, thus the analysis of robustness for c-MARL models is profoundly important …
Generalizing universal adversarial perturbations for deep neural networks
Previous studies have shown that universal adversarial attacks can fool deep neural
networks over a large set of input images with a single human-invisible perturbation …
Towards verifying the geometric robustness of large-scale neural networks
Deep neural networks (DNNs) are known to be vulnerable to adversarial geometric
transformation. This paper aims to verify the robustness of large-scale DNNs against the …
Reward Certification for Policy Smoothed Reinforcement Learning
Reinforcement Learning (RL) has achieved remarkable success in safety-critical areas, but it
can be weakened by adversarial attacks. Recent studies have introduced "smoothed …
Sora: Scalable black-box reachability analyser on neural networks
The vulnerability of deep neural networks (DNNs) to input perturbations has posed a
significant challenge. Recent work on robustness verification of DNNs not only lacks …
Boosting Adversarial Training via Fisher-Rao Norm-based Regularization
X Yin, W Ruan - Proceedings of the IEEE/CVF Conference …, 2024 - openaccess.thecvf.com
Adversarial training is extensively utilized to improve the adversarial robustness of deep
neural networks. Yet mitigating the degradation of standard generalization performance in …
Bridging formal methods and machine learning with model checking and global optimisation
Formal methods and machine learning are two research fields with drastically different
foundations and philosophies. Formal methods utilise mathematically rigorous techniques …
Representation-Based Robustness in Goal-Conditioned Reinforcement Learning
While Goal-Conditioned Reinforcement Learning (GCRL) has gained attention, its
algorithmic robustness against adversarial perturbations remains underexplored. The attacks …
TextVerifier: Robustness Verification for Textual Classifiers with Certifiable Guarantees
S Sun, W Ruan - Findings of the Association for Computational …, 2023 - aclanthology.org
When textual classifiers are deployed in safety-critical workflows, they must withstand the
threat of adversarial examples that cause model confusion through minor …