Deja Vu: Contextual sparsity for efficient LLMs at inference time
Large language models (LLMs) with hundreds of billions of parameters have sparked a new
wave of exciting AI applications. However, they are computationally expensive at inference …
Neural cognitive diagnosis for intelligent education systems
Cognitive diagnosis is a fundamental issue in intelligent education, which aims to discover
the proficiency level of students on specific knowledge concepts. Existing approaches …
NeuralCD: a general framework for cognitive diagnosis
Cognitive diagnosis is widely applicable in the scenarios where users' cognitive states need
to be assessed, such as games and clinical measurement. Especially in intelligent …
Deep reinforcement learning for load-balancing aware network control in IoT edge systems
Load balancing is directly associated with the overall performance of a parallel and
distributed computing system. Although the relevant problems in communication and …
Breaking the linear iteration cost barrier for some well-known conditional gradient methods using MaxIP data-structures
Conditional gradient methods (CGM) are widely used in modern machine learning. CGM's
overall running time usually consists of two parts: the number of iterations and the cost of …
Query-aware quantization for maximum inner product search
Maximum Inner Product Search (MIPS) plays an essential role in many applications
ranging from information retrieval, recommender systems to natural language processing …
CARMEL: Capturing spatio-temporal correlations via time-series sub-window imaging for home appliance classification
B Bertalanič, C Fortuna - Engineering Applications of Artificial Intelligence, 2024 - Elsevier
Energy management systems (EMS), as enablers of more efficient energy consumption,
monitor and manage appliances to help residents be more energy efficient and thus more …
ConvRNN-T: Convolutional augmented recurrent neural network transducers for streaming speech recognition
M Radfar, R Barnwal, RV Swaminathan… - arXiv preprint arXiv …, 2022 - arxiv.org
The recurrent neural network transducer (RNN-T) is a prominent streaming end-to-end
(E2E) ASR technology. In RNN-T, the acoustic encoder commonly consists of stacks of …
A pre-trained deep learning model for fast online prediction of structural seismic responses
WJ Tang, DS Wang, HB Huang, JC Dai… - International Journal of …, 2023 - researchgate.net
Deep learning techniques have gradually attracted considerable research interest in
numerous application scenarios because of their capacity to simplify and accelerate …
A fast sampling algorithm for maximum inner product search
Maximum Inner Product Search (MIPS) has been recognized as an important
operation for the inference phase of many machine learning algorithms, including matrix …