GShard: Scaling giant models with conditional computation and automatic sharding
Neural network scaling has been critical for improving the model quality in many real-world
machine learning applications with vast amounts of training data and compute. Although this …
Reversible vision transformers
Abstract We present Reversible Vision Transformers, a memory efficient architecture design
for visual recognition. By decoupling the GPU memory footprint from the depth of the model …
NetKet 3: Machine learning toolbox for many-body quantum systems
We introduce version 3 of NetKet, the machine learning toolbox for many-body quantum
physics. NetKet is built around neural-network quantum states and provides efficient …
Analog photonics computing for information processing, inference, and optimization
N Stroev, NG Berloff - Advanced Quantum Technologies, 2023 - Wiley Online Library
This review presents an overview of the current state‐of‐the‐art in photonics computing,
which leverages photons, photons coupled with matter, and optics‐related technologies for …
Multilayer reflective coatings for BEUV lithography: A review
The development of microelectronics is always driven by reducing transistor size and
increasing integration, from the initial micron-scale to the current few nanometers. The …
Cost-efficient overclocking in immersion-cooled datacenters
Cloud providers typically use air-based solutions for cooling servers in datacenters.
However, increasing transistor counts and the end of Dennard scaling will result in chips …
A survey of multi-tenant deep learning inference on GPU
Deep Learning (DL) models have achieved superior performance. Meanwhile, computing
hardware like NVIDIA GPUs also demonstrated strong computing scaling trends with 2x …
Size dependent transport of floating plastics modeled in the global ocean
D Klink, A Peytavin, L Lebreton - Frontiers in Marine Science, 2022 - frontiersin.org
Plastic has been detected in the ocean in most locations where scientists have looked for it.
While ubiquitous in the environment, plastic pollution is heterogeneous, and plastics of …
Integrated microwave photonics coherent processor for massive-MIMO systems in wireless communications
PMC Romero, JR Rausell-Campo… - IEEE Journal of …, 2023 - ieeexplore.ieee.org
Massive-MIMO systems can achieve high capacities and data rates by increasing the
number of operational antennas in the base station. As more antenna elements are …
Acoustic and plasma sensing of laser ablation via deep learning
Monitoring laser ablation when using high power lasers can be challenging due to plasma
obscuring the view of the machined sample. Whilst the appearance of the generated plasma …