Sanger: A co-design framework for enabling sparse attention using reconfigurable architecture
In recent years, attention-based models have achieved impressive performance in natural
language processing and computer vision applications by effectively capturing contextual …
AMOS: enabling automatic mapping for tensor computations on spatial accelerators with hardware abstraction
Hardware specialization is a promising trend to sustain performance growth. Spatial
hardware accelerators that employ specialized and hierarchical computation and memory …
TENET: A framework for modeling tensor dataflow based on relation-centric notation
Accelerating tensor applications on spatial architectures provides high performance and
energy-efficiency, but requires accurate performance models for evaluating various dataflow …
TileFlow: A framework for modeling fusion dataflow via tree-based analysis
With the increasing size of DNN models and the growing discrepancy between compute
performance and memory bandwidth, fusing multiple layers together to reduce off-chip …
Inter-layer scheduling space definition and exploration for tiled accelerators
With the continuous expansion of the DNN accelerator scale, inter-layer scheduling, which
studies the allocation of computing resources to each layer and the computing order of all …
Chimera: An analytical optimizing framework for effective compute-intensive operators fusion
Machine learning models with various tensor operators have become ubiquitous in recent
years. There are two types of operators in machine learning: compute-intensive operators …
DOSA: Differentiable model-based one-loop search for DNN accelerators
In the hardware design space exploration process, it is critical to optimize both hardware
parameters and algorithm-to-hardware mappings. Previous work has largely approached …
Large circuit models: opportunities and challenges
Within the electronic design automation (EDA) domain, artificial intelligence (AI)-driven
solutions have emerged as formidable tools, yet they typically augment rather than redefine …
High-level synthesis hardware design for FPGA-based accelerators: Models, methodologies, and frameworks
Hardware accelerators based on field programmable gate array (FPGA) and system on chip
(SoC) devices have gained attention in recent years. One of the main reasons is that these …
Telamalloc: Efficient on-chip memory allocation for production machine learning accelerators
Memory buffer allocation for on-chip memories is a major challenge in modern machine
learning systems that target ML accelerators. In interactive systems such as mobile phones …
learning systems that target ML accelerators. In interactive systems such as mobile phones …