Thanks for nothing: Predicting zero-valued activations with lightweight convolutional neural networks

G Shomron, R Banner, M Shkolnik, U Weiser - Computer Vision–ECCV …, 2020 - Springer
Convolutional neural networks (CNNs) deliver state-of-the-art results for various tasks
at the price of high computational demands. Inspired by the observation that spatial …

Nopeek-infer: Preventing face reconstruction attacks in distributed inference after on-premise training

P Vepakomma, A Singh, E Zhang… - 2021 16th IEEE …, 2021 - ieeexplore.ieee.org
For models trained on-premise but deployed in a distributed fashion across multiple entities,
we demonstrate that minimizing distance correlation between sensitive data such as faces …
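The distance-correlation objective mentioned in this snippet can be illustrated with a small NumPy sketch. This is a generic empirical distance-correlation computation, not the paper's implementation; the function name and the brute-force O(n²) pairwise distances are illustrative only.

```python
import numpy as np

def distance_correlation(x, y):
    """Empirical distance correlation between paired samples x and y.

    x, y: arrays of shape (n, d1) and (n, d2). Returns a value in [0, 1].
    NoPeek-style training would add a term encouraging this quantity to be
    small between raw inputs and the activations shared off-premise.
    """
    # Pairwise Euclidean distance matrices.
    a = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    b = np.linalg.norm(y[:, None, :] - y[None, :, :], axis=-1)
    # Double-center each distance matrix.
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    dcov2 = max((A * B).mean(), 0.0)   # guard against tiny negative round-off
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    return float(np.sqrt(dcov2 / denom)) if denom > 0 else 0.0
```

Minimizing this quantity between sensitive data (e.g. face images) and intermediate representations is the privacy objective the abstract describes.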

LungVision: X-ray Imagery Classification for On-Edge Diagnosis Applications

R Aldamani, DA Abuhani, T Shanableh - Algorithms, 2024 - mdpi.com
This study presents a comprehensive analysis of utilizing TensorFlow Lite on mobile phones
for the on-edge medical diagnosis of lung diseases. This paper focuses on the technical …

Deep recursive embedding for high-dimensional data

Z Zhou, X Zu, Y Wang, BPF Lelieveldt… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Embedding high-dimensional data onto a low-dimensional manifold is of both theoretical
and practical value. In this article, we propose to combine deep neural networks (DNN) with …

An Automatic Neural Network Architecture-and-Quantization Joint Optimization Framework for Efficient Model Inference

L Liu, Y Wang, X Zhao, W Chen, H Li… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Efficient deep learning models, especially those optimized for edge devices, benefit from
low inference latency and efficient energy consumption. Two classical techniques for efficient …

MixQuant: Mixed Precision Quantization with a Bit-width Optimization Search

E Kloberdanz, W Le - arXiv preprint arXiv:2309.17341, 2023 - arxiv.org
Quantization is a technique for creating efficient Deep Neural Networks (DNNs), which
involves performing computations and storing tensors at lower bit-widths than f32 floating …
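The core idea in this snippet, storing and computing tensors below f32 precision, can be sketched as symmetric per-tensor int8 quantization. This is a generic illustration of quantize/dequantize, not MixQuant's bit-width optimization search; all names here are illustrative.

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor quantization of a float32 tensor to int8.

    Returns the int8 tensor and the scale needed to dequantize.
    MixQuant itself searches for per-layer bit-widths rather than
    fixing 8 bits everywhere; 8 bits is used here for concreteness.
    """
    max_abs = np.abs(x).max()
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale
```

Round-tripping a tensor through `quantize_int8` and `dequantize` bounds the per-element error by half a quantization step (`scale / 2`), which is the accuracy/efficiency trade-off such methods tune.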

Split learning: a resource efficient model and data parallel approach for distributed deep learning

P Vepakomma, R Raskar - … : A Comprehensive Overview of Methods and …, 2022 - Springer
Resource constraints, workload overheads, lack of trust, and competition hinder the sharing
of raw data across multiple institutions. This leads to a shortage of data for training state-of …

WaveQ: Gradient-based deep quantization of neural networks through sinusoidal adaptive regularization

AT Elthakeb, P Pilligundla, F Mireshghallah… - arXiv preprint arXiv …, 2020 - arxiv.org
As deep neural networks make their way into different domains, their compute efficiency is
becoming a first-order constraint. Deep quantization, which reduces the bitwidth of the …
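A sinusoidal quantization regularizer of the kind the title describes can be sketched as a smooth penalty that vanishes exactly on the quantization grid. The λ·sin²(πw/Δ) form below is an assumption based on the title and abstract, not necessarily WaveQ's exact loss; the paper additionally adapts the regularization during training.

```python
import numpy as np

def sine_quant_regularizer(w, step, lam=1.0):
    """Sinusoidal quantization penalty on a weight tensor w.

    lam * mean(sin^2(pi * w / step)) is smooth and differentiable, and is
    exactly zero whenever every weight lies on the quantization grid
    {..., -2*step, -step, 0, step, 2*step, ...}. Added to the task loss,
    its gradient nudges weights toward quantizable values during training.
    """
    return lam * float(np.mean(np.sin(np.pi * w / step) ** 2))
```

The penalty peaks at `lam` for weights exactly halfway between grid points, so the gradient always points toward the nearest quantization level.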

Hardware-Friendly Lightweight Convolutional Neural Network Derivation at The Edge

L Zhang, AM Eltawil, KN Salama - 2024 IEEE 6th International …, 2024 - ieeexplore.ieee.org
Convolutional neural networks (CNNs) have demonstrated remarkable capability and
scalability in a variety of vision-related tasks. Due to privacy and latency constraints, in some …