Freematch: Self-adaptive thresholding for semi-supervised learning

Y Wang, H Chen, Q Heng, W Hou, Y Fan, Z Wu… - arXiv preprint arXiv …, 2022 - arxiv.org
Pseudo labeling and consistency regularization approaches with confidence-based
thresholding have made great progress in semi-supervised learning (SSL). In this paper, we …

Pandalm: An automatic evaluation benchmark for LLM instruction tuning optimization

Y Wang, Z Yu, Z Zeng, L Yang, C Wang, H Chen… - arXiv preprint arXiv …, 2023 - arxiv.org
Instruction tuning large language models (LLMs) remains a challenging task, owing to the
complexity of hyperparameter selection and the difficulty involved in evaluating the tuned …

Dataperf: Benchmarks for data-centric AI development

M Mazumder, C Banbury, X Yao… - Advances in …, 2023 - proceedings.neurips.cc
Machine learning research has long focused on models rather than datasets, and
prominent datasets are used for common ML tasks without regard to the breadth, difficulty …

Softmatch: Addressing the quantity-quality trade-off in semi-supervised learning

H Chen, R Tao, Y Fan, Y Wang, J Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
The critical challenge of Semi-Supervised Learning (SSL) is how to effectively leverage the
limited labeled data and massive unlabeled data to improve the model's generalization …

Self-training: A survey

MR Amini, V Feofanov, L Pauletto, L Hadjadj… - Neurocomputing, 2025 - Elsevier
Self-training methods have gained significant attention in recent years due to their
effectiveness in leveraging small labeled datasets and large unlabeled observations for …

Data-centric artificial intelligence: A survey

D Zha, ZP Bhat, KH Lai, F Yang, Z Jiang… - arXiv preprint arXiv …, 2023 - arxiv.org
Artificial Intelligence (AI) is making a profound impact in almost every domain. A vital enabler
of its great success is the availability of abundant and high-quality data for building machine …

Openstl: A comprehensive benchmark of spatio-temporal predictive learning

C Tan, S Li, Z Gao, W Guan, Z Wang… - Advances in …, 2023 - proceedings.neurips.cc
Spatio-temporal predictive learning is a learning paradigm that enables models to learn
spatial and temporal patterns by predicting future frames from given past frames in an …

Flatmatch: Bridging labeled data and unlabeled data with cross-sharpness for semi-supervised learning

Z Huang, L Shen, J Yu, B Han… - Advances in Neural …, 2023 - proceedings.neurips.cc
Semi-Supervised Learning (SSL) has been an effective way to leverage abundant
unlabeled data with extremely scarce labeled data. However, most SSL methods are …

Iomatch: Simplifying open-set semi-supervised learning with joint inliers and outliers utilization

Z Li, L Qi, Y Shi, Y Gao - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Semi-supervised learning (SSL) aims to leverage massive unlabeled data when labels are
expensive to obtain. Unfortunately, in many real-world applications, the collected unlabeled …

Scimine: An efficient systematic prioritization model based on richer semantic information

F Guo, Y Luo, L Yang, Y Zhang - … of the 46th International ACM SIGIR …, 2023 - dl.acm.org
Systematic review is a crucial method that has been widely used by scholars from different
research domains. However, screening for relevant scientific literature from paper …