A survey of techniques for optimizing transformer inference
Recent years have seen a phenomenal rise in the performance and applications of
transformer neural networks. The family of transformer networks, including Bidirectional …
transformer neural networks. The family of transformer networks, including Bidirectional …
Run, don't walk: chasing higher FLOPS for faster neural networks
To design fast neural networks, many works have been focusing on reducing the number of
floating-point operations (FLOPs). We observe that such reduction in FLOPs, however, does …
floating-point operations (FLOPs). We observe that such reduction in FLOPs, however, does …
Efficientvit: Memory efficient vision transformer with cascaded group attention
Vision transformers have shown great success due to their high model capabilities.
However, their remarkable performance is accompanied by heavy computation costs, which …
However, their remarkable performance is accompanied by heavy computation costs, which …
Rethinking vision transformers for mobilenet size and speed
With the success of Vision Transformers (ViTs) in computer vision tasks, recent arts try to
optimize the performance and complexity of ViTs to enable efficient deployment on mobile …
optimize the performance and complexity of ViTs to enable efficient deployment on mobile …
Repvit: Revisiting mobile cnn from vit perspective
Abstract Recently lightweight Vision Transformers (ViTs) demonstrate superior performance
and lower latency compared with lightweight Convolutional Neural Networks (CNNs) on …
and lower latency compared with lightweight Convolutional Neural Networks (CNNs) on …
Spike-driven transformer
Abstract Spiking Neural Networks (SNNs) provide an energy-efficient deep learning option
due to their unique spike-based event-driven (ie, spike-driven) paradigm. In this paper, we …
due to their unique spike-based event-driven (ie, spike-driven) paradigm. In this paper, we …
Snapfusion: Text-to-image diffusion model on mobile devices within two seconds
Text-to-image diffusion models can create stunning images from natural language
descriptions that rival the work of professional artists and photographers. However, these …
descriptions that rival the work of professional artists and photographers. However, these …
FastViT: A fast hybrid vision transformer using structural reparameterization
The recent amalgamation of transformer and convolutional designs has led to steady
improvements in accuracy and efficiency of the models. In this work, we introduce FastViT, a …
improvements in accuracy and efficiency of the models. In this work, we introduce FastViT, a …
Faster segment anything: Towards lightweight sam for mobile applications
Segment anything model (SAM) is a prompt-guided vision foundation model for cutting out
the object of interest from its background. Since Meta research team released the SA project …
the object of interest from its background. Since Meta research team released the SA project …
Efficientsam: Leveraged masked image pretraining for efficient segment anything
Abstract Segment Anything Model (SAM) has emerged as a powerful tool for numerous
vision applications. A key component that drives the impressive performance for zero-shot …
vision applications. A key component that drives the impressive performance for zero-shot …