Robust mixture-of-expert training for convolutional neural networks

Y Zhang, R Cai, T Chen, G Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Sparsely-gated Mixture of Expert (MoE), an emerging deep model architecture, has
demonstrated a great promise to enable high-accuracy and ultra-efficient model inference …

Selectivity drives productivity: efficient dataset pruning for enhanced transfer learning

Y Zhang, Y Zhang, A Chen, J Liu… - Advances in …, 2024 - proceedings.neurips.cc
Massive data is often considered essential for deep learning applications, but it also incurs
significant computational and infrastructural costs. Therefore, dataset pruning (DP) has …

Constrained bi-level optimization: Proximal Lagrangian value function approach and Hessian-free algorithm

W Yao, C Yu, S Zeng, J Zhang - arXiv preprint arXiv:2401.16164, 2024 - arxiv.org
This paper presents a new approach and algorithm for solving a class of constrained
Bi-Level Optimization (BLO) problems in which the lower-level problem involves constraints …

SOUL: Unlocking the power of second-order optimization for LLM unlearning

J Jia, Y Zhang, Y Zhang, J Liu, B Runwal… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have highlighted the necessity of effective unlearning
mechanisms to comply with data regulations and ethical AI practices. LLM unlearning aims …

Principled penalty-based methods for bilevel reinforcement learning and RLHF

H Shen, Z Yang, T Chen - arXiv preprint arXiv:2402.06886, 2024 - arxiv.org
Bilevel optimization has been recently applied to many machine learning tasks. However,
their applications have been restricted to the supervised learning setting, where static …

Moreau Envelope for Nonconvex Bi-Level Optimization: A Single-loop and Hessian-free Solution Strategy

R Liu, Z Liu, W Yao, S Zeng, J Zhang - arXiv preprint arXiv:2405.09927, 2024 - arxiv.org
This work focuses on addressing two major challenges in the context of large-scale
nonconvex Bi-Level Optimization (BLO) problems, which are increasingly applied in …

Challenging forgets: Unveiling the worst-case forget sets in machine unlearning

C Fan, J Liu, A Hero, S Liu - arXiv preprint arXiv:2403.07362, 2024 - arxiv.org
The trustworthy machine learning (ML) community is increasingly recognizing the crucial
need for models capable of selectively 'unlearning' data points after training. This leads to the …

Federated Learning Can Find Friends That Are Beneficial

N Tupitsa, S Horváth, M Takáč, E Gorbunov - arXiv preprint arXiv …, 2024 - arxiv.org
In Federated Learning (FL), the distributed nature and heterogeneity of client data present
both opportunities and challenges. While collaboration among clients can significantly …

Decentralized Bilevel Optimization over Graphs: Loopless Algorithmic Update and Transient Iteration Complexity

B Kong, S Zhu, S Lu, X Huang, K Yuan - arXiv preprint arXiv:2402.03167, 2024 - arxiv.org
Stochastic bilevel optimization (SBO) is becoming increasingly essential in machine
learning due to its versatility in handling nested structures. To address large-scale SBO …

A multiscale Consensus-Based algorithm for multi-level optimization

M Herty, Y Huang, D Kalise, H Kouhkouh - arXiv preprint arXiv …, 2024 - arxiv.org
A novel multiscale consensus-based optimization (CBO) algorithm for solving bi- and tri-level
optimization problems is introduced. Existing CBO techniques are generalized by the …