Enabling resource-efficient AIoT systems with cross-level optimization: A survey
The emerging field of artificial intelligence of things (AIoT, AI+ IoT) is driven by the
widespread use of intelligent infrastructures and the impressive success of deep learning …
TileFlow: A framework for modeling fusion dataflow via tree-based analysis
With the increasing size of DNN models and the growing discrepancy between compute
performance and memory bandwidth, fusing multiple layers together to reduce off-chip …
Chimera: An analytical optimizing framework for effective compute-intensive operators fusion
Machine learning models with various tensor operators are becoming ubiquitous in recent
years. There are two types of operators in machine learning: compute-intensive operators …
Welder: Scheduling deep learning memory access via tile-graph
With the growing demand for processing higher fidelity data and the use of faster computing
cores in newer hardware accelerators, modern deep neural networks (DNNs) are becoming …
Whale: Efficient giant model training over heterogeneous GPUs
The scaling up of deep neural networks has been demonstrated to be effective in improving
model quality, but also encompasses several training challenges in terms of training …
A tensor compiler with automatic data packing for simple and efficient fully homomorphic encryption
Fully Homomorphic Encryption (FHE) enables computing on encrypted data, letting clients
securely offload computation to untrusted servers. While enticing, FHE has two key …
Aspen: Breaking operator barriers for efficient parallelization of deep neural networks
Modern Deep Neural Network (DNN) frameworks use tensor operators as the main
building blocks of DNNs. However, we observe that operator-based construction of DNNs …
BladeDISC: Optimizing dynamic shape machine learning workloads via compiler approach
Compiler optimization plays an increasingly important role to boost the performance of
machine learning models for data processing and management. With increasingly complex …
DREW: Efficient Winograd CNN inference with deep reuse
Deep learning has been used in various domains, including Web services. Convolutional
neural networks (CNNs), which are deep learning representatives, are among the most …
Relax: Composable Abstractions for End-to-End Dynamic Machine Learning
Dynamic shape computations have become critical in modern machine learning workloads,
especially in emerging large language models. The success of these models has driven …