Thanks for nothing: Predicting zero-valued activations with lightweight convolutional neural networks
Convolutional neural networks (CNNs) achieve state-of-the-art results for various tasks,
at the price of high computational demands. Inspired by the observation that spatial …
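The core idea, predicting which ReLU outputs will be zero and skipping their computation, fits in a short sketch. Below is a minimal, hypothetical illustration (not the paper's actual predictor): a heavily quantized copy of the convolution guesses the output sign, and predicted-zero positions are masked out.

```python
import numpy as np

def conv2d(x, w):
    # Naive 'valid' convolution, single channel, for illustration only.
    k = w.shape[0]
    H, W = x.shape
    out = np.zeros((H - k + 1, W - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + k, j:j + k] * w)
    return out

def predict_zero_mask(x, w, bits=3):
    # Cheap predictor: rerun the convolution with aggressively quantized
    # inputs and weights, and flag outputs whose ReLU result looks like zero.
    def q(t):
        s = np.max(np.abs(t)) / (2 ** (bits - 1) - 1) + 1e-12
        return np.round(t / s) * s
    return conv2d(q(x), q(w)) <= 0.0

x, w = np.random.randn(16, 16), np.random.randn(3, 3)
mask = predict_zero_mask(x, w)
# np.where still evaluates both branches here; a real kernel would skip the
# masked multiply-accumulates entirely to save compute.
y = np.where(mask, 0.0, np.maximum(conv2d(x, w), 0.0))
```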
NoPeek-Infer: Preventing face reconstruction attacks in distributed inference after on-premise training
P Vepakomma, A Singh, E Zhang… - 2021 16th IEEE …, 2021 - ieeexplore.ieee.org
For models trained on-premise but deployed in a distributed fashion across multiple entities,
we demonstrate that minimizing distance correlation between sensitive data such as faces …
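Distance correlation is a standard statistic, so the loss at the heart of this approach can be written down directly. A minimal NumPy estimator of sample distance correlation between a batch of raw inputs and a batch of intermediate activations (the shapes below are assumptions):

```python
import numpy as np

def distance_correlation(X, Z):
    # Sample distance correlation between batches X (n, dx) and Z (n, dz):
    # double-center each pairwise-distance matrix, then dCov^2 is the mean
    # of their elementwise product.
    def centered(A):
        D = np.linalg.norm(A[:, None, :] - A[None, :, :], axis=-1)
        return D - D.mean(axis=0, keepdims=True) - D.mean(axis=1, keepdims=True) + D.mean()
    A, B = centered(X), centered(Z)
    dcov2 = (A * B).mean()
    dvar = np.sqrt((A * A).mean() * (B * B).mean())
    return np.sqrt(max(dcov2, 0.0) / (dvar + 1e-12))

X = np.random.randn(32, 784)   # e.g., flattened input images (shape assumed)
Z = np.random.randn(32, 128)   # intermediate activations at the split point
print("dCor(X, Z) =", distance_correlation(X, Z))
```

In this setting, a weighted dCor(X, Z) term is minimized alongside the ordinary task loss, so the intermediate representation stays useful for the task while leaking less about the sensitive input.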
LungVision: X-ray Imagery Classification for On-Edge Diagnosis Applications
This study presents a comprehensive analysis of utilizing TensorFlow Lite on mobile phones
for the on-edge medical diagnosis of lung diseases. It focuses on the technical …
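The deployment path described here follows the standard TensorFlow Lite inference flow. A minimal sketch, with a hypothetical model file and a random stand-in for the preprocessed X-ray:

```python
import numpy as np
import tensorflow as tf

# Load a converted .tflite classifier (file name is hypothetical).
interpreter = tf.lite.Interpreter(model_path="lung_classifier.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# Stand-in for a preprocessed chest X-ray; shape/dtype must match the model.
image = np.random.rand(*inp["shape"]).astype(inp["dtype"])
interpreter.set_tensor(inp["index"], image)
interpreter.invoke()
probs = interpreter.get_tensor(out["index"])[0]
print("predicted class:", int(np.argmax(probs)))
```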
Deep recursive embedding for high-dimensional data
Embedding high-dimensional data onto a low-dimensional manifold is of both theoretical
and practical value. In this article, we propose to combine deep neural networks (DNN) with …
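As a rough illustration of the parametric-embedding idea (the losses in this article differ), here is a tiny PyTorch sketch that trains an MLP to map 64-D points to 2-D while preserving pairwise distances, a classical stress-style objective:

```python
import torch
import torch.nn as nn

# Hypothetical parametric embedder: an MLP maps high-dimensional points to
# 2-D; a crude stand-in for the article's mathematics-guided embedding rules.
net = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 2))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
X = torch.randn(256, 64)
D_high = torch.cdist(X, X)        # pairwise distances in the input space

for step in range(200):
    Y = net(X)
    D_low = torch.cdist(Y, Y)     # pairwise distances in the embedding
    loss = ((D_low - D_high) ** 2).mean()   # stress-style objective
    opt.zero_grad(); loss.backward(); opt.step()
```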
An Automatic Neural Network Architecture-and-Quantization Joint Optimization Framework for Efficient Model Inference
Efficient deep learning models, especially those optimized for edge devices, offer benefits
ranging from low inference latency to efficient energy consumption. Two classical techniques for efficient …
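The joint search such a framework automates can be caricatured as optimizing over (architecture, bit-width) pairs under a latency budget. A toy random-search sketch with a made-up accuracy/latency proxy; a real framework would train, quantize, and measure each candidate on-device:

```python
import random

def evaluate(width, bits):
    accuracy = 0.9 * width + 0.01 * bits   # hypothetical accuracy proxy
    latency = 5.0 * width * bits / 8.0     # hypothetical latency cost model
    return accuracy, latency

best, budget = None, 4.0
for _ in range(100):
    width = random.choice([0.25, 0.5, 0.75, 1.0])  # architecture knob
    bits = random.choice([2, 4, 8])                # quantization knob
    acc, lat = evaluate(width, bits)
    if lat <= budget and (best is None or acc > best[0]):
        best = (acc, width, bits)
print("best (accuracy, width, bits):", best)
```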
MixQuant: Mixed Precision Quantization with a Bit-width Optimization Search
E Kloberdanz, W Le - arXiv preprint arXiv:2309.17341, 2023 - arxiv.org
Quantization is a technique for creating efficient Deep Neural Networks (DNNs), which
involves performing computations and storing tensors at lower bit-widths than f32 floating …
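A simplified version of a per-layer bit-width search: quantize each layer's weights at several candidate widths and keep the cheapest one whose round-trip error is acceptable. The symmetric uniform quantizer and the tolerance below are assumptions, not MixQuant's actual criterion:

```python
import numpy as np

def quantize(w, bits):
    # Symmetric uniform quantization to 'bits' bits.
    scale = np.max(np.abs(w)) / (2 ** (bits - 1) - 1)
    return np.round(w / scale) * scale

def pick_bitwidth(w, candidates=(2, 4, 8), tol=1e-3):
    # Smallest bit-width whose reconstruction MSE stays under tolerance.
    for b in candidates:
        err = np.mean((w - quantize(w, b)) ** 2)
        if err < tol:
            return b, err
    return candidates[-1], err

layers = {"conv1": np.random.randn(64, 3, 3, 3), "fc": np.random.randn(10, 256)}
for name, w in layers.items():
    b, err = pick_bitwidth(w)
    print(f"{name}: {b}-bit (mse={err:.5f})")
```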
Split learning: a resource efficient model and data parallel approach for distributed deep learning
P Vepakomma, R Raskar - … : A Comprehensive Overview of Methods and …, 2022 - Springer
Resource constraints, workload overheads, lack of trust, and competition hinder the sharing
of raw data across multiple institutions. This leads to a shortage of data for training state-of …
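The mechanics of a single split-learning step are compact enough to sketch. In this minimal PyTorch illustration (layer sizes hypothetical), the client computes up to the cut layer, the server finishes the forward and backward pass, and the gradient at the cut is handed back:

```python
import torch
import torch.nn as nn

client = nn.Sequential(nn.Linear(32, 64), nn.ReLU())   # client-side half
server = nn.Sequential(nn.Linear(64, 10))              # server-side half
opt_c = torch.optim.SGD(client.parameters(), lr=0.1)
opt_s = torch.optim.SGD(server.parameters(), lr=0.1)

x, y = torch.randn(8, 32), torch.randint(0, 10, (8,))
opt_c.zero_grad(); opt_s.zero_grad()

smashed = client(x)                        # forward to the cut layer
sent = smashed.detach().requires_grad_()   # what actually crosses the wire
loss = nn.functional.cross_entropy(server(sent), y)
loss.backward()                            # server grads + gradient at the cut
opt_s.step()
smashed.backward(sent.grad)                # resume backprop on the client half
opt_c.step()
```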
WaveQ: Gradient-based deep quantization of neural networks through sinusoidal adaptive regularization
AT Elthakeb, P Pilligundla… - arXiv preprint arXiv …, 2020
As deep neural networks make their way into different domains, their compute efficiency is
becoming a first-order constraint. Deep quantization, which reduces the bitwidth of the …
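The sinusoidal regularizer at the center of this method has a simple shape: a sin² term that vanishes exactly on the quantization grid, so gradient descent pulls weights toward representable values. A PyTorch sketch; the step-size parameterization and the [-1, 1] weight range below are assumptions:

```python
import math
import torch

def sin_quant_reg(params, bits=4, lam=1e-2):
    # sin^2(pi * w / step) is zero whenever w lies on the b-bit uniform grid
    # over [-1, 1], and positive in between, nudging weights toward the grid.
    step = 2.0 / (2 ** bits - 1)
    return lam * sum(torch.sin(math.pi * p / step).pow(2).mean() for p in params)

model = torch.nn.Linear(16, 4)
x, y = torch.randn(8, 16), torch.randn(8, 4)
loss = torch.nn.functional.mse_loss(model(x), y) + sin_quant_reg(model.parameters())
loss.backward()   # regularizer gradients flow alongside the task loss
```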
Hardware-Friendly Lightweight Convolutional Neural Network Derivation at The Edge
L Zhang, AM Eltawil, KN Salama - 2024 IEEE 6th International …, 2024 - ieeexplore.ieee.org
Convolutional neural networks (CNNs) have demonstrated remarkable capability and
scalability in a variety of vision-related tasks. Due to privacy and latency constraints, in some …
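For a concrete sense of what "lightweight" buys at the edge, here is a depthwise-separable convolution, a standard edge-friendly building block (illustrative of lightweight CNN design generally, not this paper's specific derivation procedure), with a parameter-count comparison:

```python
import torch.nn as nn

def separable_conv(cin, cout):
    # Depthwise 3x3 followed by pointwise 1x1: a cheap stand-in for a
    # standard 3x3 convolution.
    return nn.Sequential(
        nn.Conv2d(cin, cin, kernel_size=3, padding=1, groups=cin),  # depthwise
        nn.Conv2d(cin, cout, kernel_size=1),                        # pointwise
    )

dense = nn.Conv2d(64, 128, kernel_size=3, padding=1)
light = separable_conv(64, 128)
count = lambda m: sum(p.numel() for p in m.parameters())
print(f"standard 3x3: {count(dense)} params  vs  separable: {count(light)} params")
```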