Applying lightweight soft error mitigation techniques to embedded mixed precision deep neural networks

G Abich, J Gava, R Garibotti, R Reis… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Deep neural networks (DNNs) are being incorporated in resource-constrained IoT devices,
which typically rely on reduced memory footprint and low-performance processors. While …

HAFT: Hardware-assisted fault tolerance

D Kuvaiskii, R Faqeh, P Bhatotia, P Felber… - Proceedings of the …, 2016 - dl.acm.org
Transient hardware faults during the execution of a program can cause data corruptions. We
present HAFT, a fault tolerance technique using hardware extensions of commodity CPUs to …

Lightweight checkpoint technique for resilience against soft errors

M Didehban, SRD Lokam, A Shrivastava - US Patent 10,997,027, 2021 - Google Patents
Systems and methods for implementing a lightweight checkpoint technique for
resilience against soft errors are disclosed. The technique provides effective, safe, and …

Towards dynamic dependable systems through evidence-based continuous certification

R Faqeh, C Fetzer, H Hermanns, J Hoffmann… - … Applications of Formal …, 2020 - Springer
Future cyber-physical systems are expected to be dynamic, evolving while already being
deployed. Frequent updates of software components are likely to become the norm even for …

Asymmetric resilience: Exploiting task-level idempotency for transient error recovery in accelerator-based systems

J Leng, A Buyuktosunoglu, R Bertran… - … Symposium on High …, 2020 - ieeexplore.ieee.org
Accelerators make the task of building systems that are resilient against transient errors like
voltage noise and soft errors hard. Architects integrate accelerators into the system as black …

NEMESIS: A software approach for computing in presence of soft errors

M Didehban, A Shrivastava… - 2017 IEEE/ACM …, 2017 - ieeexplore.ieee.org
Soft errors are considered as the main reliability challenge for sub-nanoscale
microprocessors. Software-level soft error resilience schemes are desirable because they …

Dataflow model–based software synthesis framework for parallel and distributed embedded systems

E Jeong, D Jeong, S Ha - ACM Transactions on Design Automation of …, 2021 - dl.acm.org
Existing software development methodologies mostly assume that an application runs on a
single device without concern about the non-functional requirements of an embedded …

MORE: Model-based redundancy for Simulink

K Ding, A Morozov, K Janschek - … September 19-21, 2018, Proceedings 37, 2018 - Springer
Fault tolerance plays a significant role in the safety-critical system design that enables a
system to continue performing its intended functions in presence of faults. Redundancy is …

Efficient fault tolerance using Intel MPX and TSX

O Oleksenko, D Kuvaiskii, P Bhatotia, C Fetzer… - Fast Abstract in the 46th …, 2016 - hal.science
Hardware faults can cause data corruptions during computation, and they are especially
harmful if these corruptions happen in data pointers. Existing solutions, however, incur high …

FERNANDO: A software transient fault tolerance approach for embedded systems based on redundant multi-threading

H Wu, R Guo, Y Hu - IEEE Access, 2021 - ieeexplore.ieee.org
As semiconductor technology scales, modern microprocessors are more vulnerable to
transient faults. Software-level fault tolerance schemes are promising because they can …