Embedded deep learning accelerators: A survey on recent advances
The exponential increase in generated data as well as the advances in high-performance
computing has paved the way for the use of complex machine learning methods. Indeed, the …
MIMONets: Multiple-input-multiple-output neural networks exploiting computation in superposition
With the advent of deep learning, progressively larger neural networks have been designed
to solve complex tasks. We take advantage of these capacity-rich models to lower the cost of …
MIMMO: Multi-Input Massive Multi-Output Neural Network
M Ferianc, M Rodrigues - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
Neural networks (NNs) have achieved superhuman accuracy in multiple tasks, but the
certainty of NNs' predictions is often debatable, especially if confronted with out-of-training …
RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs
State-of-the-art large language models (LLMs) have become indispensable tools for various
tasks. However, training LLMs to serve as effective assistants for humans requires careful …
TextMixer: Mixing Multiple Inputs for Privacy-Preserving Inference
Pre-trained language models (PLMs) are often deployed as cloud services, enabling users
to upload textual data and perform inference remotely. However, users' personal text often …
MOSEL: Inference Serving Using Dynamic Modality Selection
Rapid advancements over the years have helped machine learning models reach
previously hard-to-achieve goals, sometimes even exceeding human capabilities. However …
Variator: Accelerating Pre-trained Models with Plug-and-Play Compression Modules
Pre-trained language models (PLMs) have achieved remarkable results on NLP tasks but at
the expense of huge parameter sizes and the consequent computational costs. In this paper …
MUX-PLMs: Data multiplexing for high-throughput language models
The widespread adoption of large language models such as ChatGPT and Bard has led to
unprecedented demand for these technologies. The burgeoning cost of inference for ever …
PruMUX: Augmenting Data Multiplexing with Model Compression
As language models increase in size by the day, methods for efficient inference are critical to
leveraging their capabilities for various applications. Prior work has investigated techniques …
SAE: Single Architecture Ensemble Neural Networks
Ensembles of separate neural networks (NNs) have shown superior accuracy and
confidence calibration over a single NN across tasks. Recent methods compress ensembles …