A comprehensive survey on pretrained foundation models: A history from BERT to ChatGPT

C Zhou, Q Li, C Li, J Yu, Y Liu, G Wang… - International Journal of …, 2024 - Springer
Pretrained Foundation Models (PFMs) are regarded as the foundation for various
downstream tasks across different data modalities. A PFM (e.g., BERT, ChatGPT, GPT-4) is …

Explainable AI: A review of machine learning interpretability methods

P Linardatos, V Papastefanopoulos, S Kotsiantis - Entropy, 2020 - mdpi.com
Recent advances in artificial intelligence (AI) have led to its widespread industrial adoption,
with machine learning systems demonstrating superhuman performance in a significant …

DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models.

B Wang, W Chen, H Pei, C Xie, M Kang, C Zhang, C Xu… - NeurIPS, 2023 - blogs.qub.ac.uk
Generative Pre-trained Transformer (GPT) models have exhibited exciting progress
in their capabilities, capturing the interest of practitioners and the public alike. Yet, while the …

PromptBench: Towards evaluating the robustness of large language models on adversarial prompts

K Zhu, J Wang, J Zhou, Z Wang, H Chen… - arXiv e …, 2023 - ui.adsabs.harvard.edu
The increasing reliance on Large Language Models (LLMs) across academia and industry
necessitates a comprehensive understanding of their robustness to prompts. In response to …

Pre-trained models: Past, present and future

X Han, Z Zhang, N Ding, Y Gu, X Liu, Y Huo, J Qiu… - AI Open, 2021 - Elsevier
Large-scale pre-trained models (PTMs) such as BERT and GPT have recently achieved
great success and become a milestone in the field of artificial intelligence (AI). Owing to …

AI alignment: A comprehensive survey

J Ji, T Qiu, B Chen, B Zhang, H Lou, K Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
AI alignment aims to make AI systems behave in line with human intentions and values. As
AI systems grow more capable, the potential large-scale risks associated with misaligned AI …

TextAttack: A framework for adversarial attacks, data augmentation, and adversarial training in NLP

JX Morris, E Lifland, JY Yoo, J Grigsby, D Jin… - arXiv preprint arXiv …, 2020 - arxiv.org
While there has been substantial research using adversarial attacks to analyze NLP models,
each attack is implemented in its own code repository. It remains challenging to develop …

BERT-ATTACK: Adversarial attack against BERT using BERT

L Li, R Ma, Q Guo, X Xue, X Qiu - arXiv preprint arXiv:2004.09984, 2020 - arxiv.org
Adversarial attacks for discrete data (such as text) have proven significantly more
challenging than those for continuous data (such as images), since it is difficult to generate adversarial …

Adversarial GLUE: A multi-task benchmark for robustness evaluation of language models

B Wang, C Xu, S Wang, Z Gan, Y Cheng, J Gao… - arXiv preprint arXiv …, 2021 - arxiv.org
Large-scale pre-trained language models have achieved tremendous success across a
wide range of natural language understanding (NLU) tasks, even surpassing human …

Open sesame! Universal black-box jailbreaking of large language models

R Lapid, R Langberg, M Sipper - arXiv preprint arXiv:2309.01446, 2023 - arxiv.org
Large language models (LLMs), designed to provide helpful and safe responses, often rely
on alignment techniques to align with user intent and social guidelines. Unfortunately, this …