Hao Sun
Verified email at mails.tsinghua.edu.cn
Title
Cited by
Year
Safety assessment of Chinese large language models
H Sun, Z Zhang, J Deng, J Cheng, M Huang
arXiv preprint arXiv:2304.10436, 2023
Cited by: 61; Year: 2023
On the safety of conversational models: Taxonomy, dataset, and benchmark
H Sun, G Xu, J Deng, J Cheng, C Zheng, H Zhou, N Peng, X Zhu, ...
arXiv preprint arXiv:2110.08466, 2021
Cited by: 55; Year: 2021
COLD: A benchmark for Chinese offensive language detection
J Deng, J Zhou, H Sun, C Zheng, F Mi, H Meng, M Huang
arXiv preprint arXiv:2201.06025, 2022
Cited by: 54; Year: 2022
PsyQA: A Chinese dataset for generating long counseling text for mental health support
H Sun, Z Lin, C Zheng, S Liu, M Huang
arXiv preprint arXiv:2106.01702, 2021
Cited by: 49; Year: 2021
EVA: An open-domain Chinese dialogue system with large-scale generative pre-training
H Zhou, P Ke, Z Zhang, Y Gu, Y Zheng, C Zheng, Y Wang, CH Wu, H Sun, ...
arXiv preprint arXiv:2108.01547, 2021
Cited by: 45; Year: 2021
EVA2.0: Investigating open-domain Chinese dialogue systems with large-scale pre-training
Y Gu, J Wen, H Sun, Y Song, P Ke, C Zheng, Z Zhang, J Yao, L Liu, X Zhu, ...
Machine Intelligence Research 20 (2), 207-219, 2023
Cited by: 37; Year: 2023
Recent advances towards safe, responsible, and moral dialogue systems: A survey
J Deng, H Sun, Z Zhang, J Cheng, M Huang
arXiv preprint arXiv:2302.09270, 2023
Cited by: 26; Year: 2023
Unveiling the implicit toxicity in large language models
J Wen, P Ke, H Sun, Z Zhang, C Li, J Bai, M Huang
arXiv preprint arXiv:2311.17391, 2023
Cited by: 17; Year: 2023
PAL: Persona-augmented emotional support conversation generation
J Cheng, S Sabour, H Sun, Z Chen, M Huang
arXiv preprint arXiv:2212.09235, 2022
Cited by: 14; Year: 2022
MoralDial: A framework to train and evaluate moral dialogue systems via moral discussions
H Sun, Z Zhang, F Mi, Y Wang, W Liu, J Cui, B Wang, Q Liu, M Huang
arXiv preprint arXiv:2212.10720, 2022
Cited by: 9; Year: 2022
Constructing highly inductive contexts for dialogue safety through controllable reverse generation
Z Zhang, J Cheng, H Sun, J Deng, F Mi, Y Wang, L Shang, M Huang
arXiv preprint arXiv:2212.01810, 2022
Cited by: 7; Year: 2022
ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors
Z Zhang, Y Lu, J Ma, D Zhang, R Li, P Ke, H Sun, L Sha, Z Sui, H Wang, ...
arXiv preprint arXiv:2402.16444, 2024
Cited by: 2; Year: 2024
InstructSafety: A Unified Framework for Building Multidimensional and Explainable Safety Detector through Instruction Tuning
Z Zhang, J Cheng, H Sun, J Deng, M Huang
Findings of the Association for Computational Linguistics: EMNLP 2023, 10421 …, 2023
Cited by: 2; Year: 2023
Enhancing Offensive Language Detection with Data Augmentation and Knowledge Distillation
J Deng, Z Chen, H Sun, Z Zhang, J Wu, S Nakagawa, F Ren, M Huang
Research 6, 0189, 2023
Cited by: 1; Year: 2023
MoralDial: A framework to train and evaluate moral dialogue systems via constructing moral discussions
H Sun, Z Zhang, F Mi, Y Wang, W Liu, J Cui, B Wang, Q Liu, M Huang
arXiv preprint arXiv:2212.10720, 2022
Cited by: 1; Year: 2022