Embedded deep learning accelerators: A survey on recent advances

G Akkad, A Mansour, E Inaty - IEEE Transactions on Artificial …, 2023 - ieeexplore.ieee.org
The exponential increase in generated data, as well as advances in high-performance
computing, has paved the way for the use of complex machine learning methods. Indeed, the …

MIMONets: Multiple-Input-Multiple-Output Neural Networks Exploiting Computation in Superposition

N Menet, M Hersche, G Karunaratne… - Advances in …, 2024 - proceedings.neurips.cc
With the advent of deep learning, progressively larger neural networks have been designed
to solve complex tasks. We take advantage of these capacity-rich models to lower the cost of …

MIMMO: Multi-Input Massive Multi-Output Neural Network

M Ferianc, M Rodrigues - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
Neural networks (NNs) have achieved superhuman accuracy in multiple tasks, but the
certainty of NNs' predictions is often debatable, especially when confronted with out of training …

RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs

S Chaudhari, P Aggarwal, V Murahari… - arXiv preprint arXiv …, 2024 - arxiv.org
State-of-the-art large language models (LLMs) have become indispensable tools for various
tasks. However, training LLMs to serve as effective assistants for humans requires careful …

TextMixer: Mixing Multiple Inputs for Privacy-Preserving Inference

X Zhou, Y Lu, R Ma, T Gui, Q Zhang… - Findings of the …, 2023 - aclanthology.org
Pre-trained language models (PLMs) are often deployed as cloud services, enabling users
to upload textual data and perform inference remotely. However, users' personal text often …

MOSEL: Inference Serving Using Dynamic Modality Selection

B Hu, L Xu, J Moon, NJ Yadwadkar, A Akella - arXiv preprint arXiv …, 2023 - arxiv.org
Rapid advancements over the years have helped machine learning models reach
previously hard-to-achieve goals, sometimes even exceeding human capabilities. However …

Variator: Accelerating Pre-trained Models with Plug-and-Play Compression Modules

C Xiao, Y Luo, W Zhang, P Zhang, X Han, Y Lin… - arXiv preprint arXiv …, 2023 - arxiv.org
Pre-trained language models (PLMs) have achieved remarkable results on NLP tasks but at
the expense of huge parameter sizes and the consequent computational costs. In this paper …

MUX-PLMs: Data Multiplexing for High-throughput Language Models

V Murahari, A Deshpande, CE Jimenez… - arXiv preprint arXiv …, 2023 - arxiv.org
The widespread adoption of large language models such as ChatGPT and Bard has led to
unprecedented demand for these technologies. The burgeoning cost of inference for ever …

PruMUX: Augmenting Data Multiplexing with Model Compression

Y Su, V Murahari, K Narasimhan, K Li - arXiv preprint arXiv:2305.14706, 2023 - arxiv.org
As language models increase in size by the day, methods for efficient inference are critical to
leveraging their capabilities for various applications. Prior work has investigated techniques …

SAE: Single Architecture Ensemble Neural Networks

M Ferianc, H Fan, M Rodrigues - arXiv preprint arXiv:2402.06580, 2024 - arxiv.org
Ensembles of separate neural networks (NNs) have shown superior accuracy and
confidence calibration over a single NN across tasks. Recent methods compress ensembles …