CXL-ANNS: Software-Hardware collaborative memory disaggregation and computation for Billion-Scale approximate nearest neighbor search

J Jang, H Choi, H Bae, S Lee, M Kwon… - 2023 USENIX Annual …, 2023 - usenix.org
We propose CXL-ANNS, a software-hardware collaborative approach to enable highly
scalable approximate nearest neighbor search (ANNS) services. To this end, we first …

FLASH: Fast model adaptation in ML-centric cloud platforms

H Qiu, W Mao, A Patke, S Cui, C Wang… - Proceedings of …, 2024 - proceedings.mlsys.org
The emergence of ML in various cloud system management tasks (e.g., workload autoscaling
and job scheduling) has become a core driver of ML-centric cloud platforms. However, there …

Baleen: ML Admission & Prefetching for Flash Caches

DLK Wong, H Wu, C Molder, S Gunasekar… - … USENIX Conference on …, 2024 - usenix.org
Flash caches are used to reduce peak backend load for throughput-constrained data center
services, reducing the total number of backend servers required. Bulk storage systems are a …

Cori: Dancing to the right beat of periodic data movements over hybrid memory systems

TD Doudali, D Zahka… - 2021 IEEE International …, 2021 - ieeexplore.ieee.org
Emerging hybrid memory systems that comprise technologies such as Intel's Optane DC
Persistent Memory exhibit disparities in the access speeds and capacity ratios of their …

On Modular Learning of Distributed Systems for Predicting End-to-End Latency

CJM Liang, Z Fang, Y Xie, F Yang, ZL Li… - … USENIX Symposium on …, 2023 - usenix.org
An emerging trend in cloud deployments is to adopt machine learning (ML) models to
characterize end-to-end system performance. Despite early success, such methods can …

BOAT: A Bayesian optimization AutoML time-series framework for industrial applications

JJ Kurian, M Dix, I Amihai, G Ceusters… - 2021 IEEE Seventh …, 2021 - ieeexplore.ieee.org
Driven by the increasing degree of automation, industrial production plants have become
very data reliant, which poses a great potential for machine learning applications. AutoML is …

Phronesis: Efficient performance modeling for high-dimensional configuration tuning

Y Li, BC Lee - ACM Transactions on Architecture and Code …, 2022 - dl.acm.org
We present Phronesis, a learning framework for efficiently modeling the performance of data
analytic workloads as a function of their high-dimensional software configuration …

Herti: A reinforcement learning-augmented system for efficient real-time inference on heterogeneous embedded systems

M Han, W Baek - 2021 30th International Conference on …, 2021 - ieeexplore.ieee.org
Real-time inference is the key technology that enables a variety of latency-critical intelligent
services such as autonomous driving and augmented reality. Heterogeneous embedded …

FSbrain: An intelligent I/O performance tuning system

Y Tang, R Lin, D Li, Y Li, D Zeng - Journal of Systems Architecture, 2022 - Elsevier
To bridge the performance gap between network and storage, performance tuning of file
systems according to different workloads has become an important work on the Internet of …

Bridging Software-Hardware for CXL Memory Disaggregation in Billion-Scale Nearest Neighbor Search

J Jang, H Choi, H Bae, S Lee, M Kwon… - ACM Transactions on …, 2024 - dl.acm.org
We propose CXL-ANNS, a software-hardware collaborative approach to enable scalable
approximate nearest neighbor search (ANNS) services. To this end, we first disaggregate …