Configurable accelerator framework including a stream switch having a plurality of unidirectional stream links

T Boesch, G Desoli - US Patent 11,562,115, 2023 - Google Patents
US11562115B2 - Configurable accelerator framework including a stream switch having a plurality
of unidirectional stream links - Google Patents

Communication optimizations for distributed machine learning

S Sridharan, K Vaidyanathan, D Das… - US Patent …, 2022 - Google Patents
Embodiments described herein provide a system to configure distributed training of a neural
network, the system comprising memory to store a library to facilitate data transmission …

Digital integrated circuit for extracting features out of an input image based on cellular neural networks

L Yang, H Yu - US Patent 9,940,534, 2018 - Google Patents
Digital integrated circuit (IC) for extracting features out of an input image is disclosed. The IC
contains one or more identical cellular neural networks (CNN) processing engines …

Deep vision processor

W Qadeer, R Hameed - US Patent 10,474,464, 2019 - Google Patents
Disclosed herein is a processor for deep learning. In one embodiment, the processor
comprises: a load and store unit configured to load and store image pixel data and stencil …

Hierarchical category classification scheme using multiple sets of fully-connected networks with a CNN based integrated circuit as feature extractor

L Yang, PZ Dong, B Sun - US Patent 10,366,302, 2019 - Google Patents
CNN based integrated circuit is configured with a set of pre-trained filter coefficients or
weights as a feature extractor of an input data. Multiple fully-connected networks (FCNs) are …

Low rank matrix compression

T Bar-On, J Subag, Y Fais, J Dreyfuss, G Novik… - US Patent …, 2021 - Google Patents
In an example, an apparatus comprises logic, at least partially including hardware logic, to
implement a lossy compression algorithm which utilizes a data transform and quantization …

Implementation of MobileNet in a CNN based digital integrated circuit

L Yang, PZ Dong, JZ Dong, B Sun - US Patent 10,360,470, 2019 - Google Patents
Method and systems of replacing operations of depthwise separable filters with first and
second replacement convolutional layers are disclosed. Depthwise separable filters …

Data structure for CNN based digital integrated circuit for extracting features out of an input image

L Yang, H Yu - US Patent 10,043,095, 2018 - Google Patents
Data arrangement schemes of imagery data and filter coefficients stored in a CNN based
digital IC for extracting features out of an input image are disclosed. The CNN based digital …

Implementation of ResNet in a CNN based digital integrated circuit

L Yang, PZ Dong, CJ Young, B Sun - US Patent 10,339,445, 2019 - Google Patents
Operations of a combination of first and second original convolutional layers followed by a
short path are replaced by operations of a set of three particular convolutional layers. The …

NVP: A flexible and efficient processor architecture for accelerating diverse computer vision tasks including DNN

Y Liu, F Wu, N Zhao, Q Zhang, W Wang… - … on Circuits and …, 2022 - ieeexplore.ieee.org
Compared with the CPUs and GPUs, the AI accelerators are able to achieve higher
performance and energy efficiency for accelerating the DNNs. However, besides the DNNs …