How to DP-fy ML: A practical guide to machine learning with differential privacy
Machine Learning (ML) models are ubiquitous in real-world applications and are a constant focus of research. Modern ML models have become more complex, deeper, and …
A state-of-the-art survey on solving non-IID data in federated learning
Federated Learning (FL), proposed in recent years, has received significant attention from researchers because it enables multiple clients to cooperatively train global models without …
Hidden progress in deep learning: SGD learns parities near the computational limit
There is mounting evidence of emergent phenomena in the capabilities of deep learning methods as we scale up datasets, model sizes, and training times. While there are some …
Adan: Adaptive Nesterov momentum algorithm for faster optimizing deep models
In deep learning, different kinds of deep networks typically need different optimizers, which have to be chosen after multiple trials, making the training process inefficient. To relieve this …
Personalized cross-silo federated learning on non-IID data
Non-IID data present a tough challenge for federated learning. In this paper, we explore a novel idea of facilitating pairwise collaborations between clients with similar data. We …
Seeing out of the box: End-to-end pre-training for vision-language representation learning
We study the joint learning of Convolutional Neural Network (CNN) and Transformer for vision-language pre-training (VLPT), which aims to learn cross-modal alignments from …
Scan and snap: Understanding training dynamics and token composition in 1-layer transformer
The Transformer architecture has shown impressive performance in multiple research domains and has become the backbone of many neural network models. However, there is limited …
Sophia: A scalable stochastic second-order optimizer for language model pre-training
Given the massive cost of language model pre-training, a non-trivial improvement of the optimization algorithm would lead to a material reduction in the time and cost of training …
Vision transformers provably learn spatial structure
Vision Transformers (ViTs) have recently achieved comparable or superior performance to Convolutional Neural Networks (CNNs) in computer vision. This empirical …
Towards theoretically understanding why SGD generalizes better than Adam in deep learning
It is not yet clear why Adam-like adaptive gradient algorithms suffer from worse generalization performance than SGD despite their faster training speed. This work aims to …