M³ViT: Mixture-of-experts vision transformer for efficient multi-task learning with model-accelerator co-design

Z Fan, R Sarkar, Z Jiang, T Chen… - Advances in …, 2022 - proceedings.neurips.cc
Multi-task learning (MTL) encapsulates multiple learned tasks in a single model and often
lets those tasks learn better jointly. Multi-tasking models have become successful and often …

AdaMV-MoE: Adaptive multi-task vision mixture-of-experts

T Chen, X Chen, X Du, A Rashwan… - Proceedings of the …, 2023 - openaccess.thecvf.com
Sparsely activated Mixture-of-Experts (MoE) is becoming a promising paradigm for
multi-task learning (MTL). Instead of compressing multiple tasks' knowledge into a single …

Sparse MoE as the new dropout: Scaling dense and self-slimmable transformers

T Chen, Z Zhang, A Jaiswal, S Liu, Z Wang - arXiv preprint arXiv …, 2023 - arxiv.org
Despite their remarkable achievements, gigantic transformers encounter significant
drawbacks, including exorbitant computational and memory footprints during training, as …

Bridging remote sensors with multisensor geospatial foundation models

B Han, S Zhang, X Shi… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
In the realm of geospatial analysis, the diversity of remote sensors, encompassing both
optical and microwave technologies, offers a wealth of distinct observational capabilities …

Multimodal clinical trial outcome prediction with large language models

W Zheng, D Peng, H Xu, H Zhu, T Fu, H Yao - arXiv preprint arXiv …, 2024 - arxiv.org
The clinical trial is a pivotal and costly process, often spanning multiple years and requiring
substantial financial resources. Therefore, the development of clinical trial outcome …

DynaShare: task and instance conditioned parameter sharing for multi-task learning

E Rahimian, G Javadi, F Tung… - Proceedings of the …, 2023 - openaccess.thecvf.com
Multi-task networks rely on effective parameter sharing to achieve robust generalization
across tasks. In this paper, we present a novel parameter sharing method for multi-task …
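
The DynaShare snippet above starts from the standard multi-task premise that tasks share a common trunk of parameters and specialize only through small task-specific heads. As a point of reference only, and not the task- and instance-conditioned scheme DynaShare itself proposes, a plain hard-parameter-sharing network in PyTorch might look like the sketch below; the class name, layer sizes, and task names are placeholders.

import torch
import torch.nn as nn

class HardSharedMTLNet(nn.Module):
    # Generic hard parameter sharing: one shared trunk, one small head per task.
    # Illustrative only; not the DynaShare architecture.
    def __init__(self, in_dim, hidden_dim, task_out_dims):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(in_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        self.heads = nn.ModuleDict(
            {name: nn.Linear(hidden_dim, out_dim)
             for name, out_dim in task_out_dims.items()}
        )

    def forward(self, x, task):
        # Shared features feed whichever task head is requested.
        return self.heads[task](self.trunk(x))

# Example: two hypothetical tasks over the same 64-dimensional input.
model = HardSharedMTLNet(64, 128, {"depth": 1, "segmentation": 21})
y = model(torch.randn(8, 64), task="depth")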

MNCM: Multi-level network cascades model for multi-task learning

H Wu - Proceedings of the 31st ACM International Conference …, 2022 - dl.acm.org
Recently, multi-task learning based on deep neural networks has been successfully
applied in many recommender system scenarios. The prediction quality of current …

Sparse MoE as a new treatment: Addressing forgetting, fitting, learning issues in multi-modal multi-task learning

J Peng, K Zhou, R Zhou, T Hartvigsen, Y Zhang… - 2023 - openreview.net
Sparse Mixture-of-Experts (SMoE) is a promising paradigm that can be easily tailored for
multi-task learning. Its conditional computing nature allows us to organically allocate …
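
Several entries above (M³ViT, AdaMV-MoE, and this one) rest on the same conditional-computation idea: a learned router sends each input to only a few experts, so multi-task capacity can grow without a proportional increase in per-input compute. As a rough illustration only, not the routing scheme of any listed paper, a minimal top-k gated sparse MoE layer in PyTorch could be sketched as follows; the class name, expert width, and the simple per-expert loop are assumptions made for brevity.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    # Generic sparsely activated MoE layer: each token is routed to its
    # top-k experts, so only a fraction of the parameters is active per input.
    # Illustrative only; real systems add load balancing, capacity limits, etc.
    def __init__(self, dim, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(dim, num_experts)   # router producing expert scores
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
             for _ in range(num_experts)]
        )

    def forward(self, x):                          # x: (tokens, dim)
        scores = self.gate(x)                      # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)       # normalize over the kept experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):             # combine only the selected experts
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out

# Example: route 16 tokens of width 64 through 8 experts, 2 active per token.
layer = SparseMoELayer(64)
y = layer(torch.randn(16, 64))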

Exploiting graph structured cross-domain representation for multi-domain recommendation

A Ariza-Casabona, B Twardowski… - European Conference on …, 2023 - Springer
Multi-domain recommender systems benefit from cross-domain representation learning and
positive knowledge transfer. Both can be achieved by introducing a specific modeling of …

Multitask learning of a biophysically-detailed neuron model

J Verhellen, K Beshkov, S Amundsen… - PLOS Computational …, 2024 - journals.plos.org
The human brain operates at multiple levels, from molecules to circuits, and understanding
these complex processes requires integrated research efforts. Simulating biophysically …