AlpaServe: Statistical multiplexing with model parallelism for deep learning serving
Model parallelism is conventionally viewed as a method to scale a single large deep
learning model beyond the memory limits of a single device. In this paper, we demonstrate …
Optimizing the cloud? Don't train models. Build oracles!
We propose cloud oracles, an alternative to machine learning for online optimization of
cloud configurations. Our cloud oracle approach guarantees complete accuracy and …
Zeal: Rethinking Large-Scale Resource Allocation with "Decouple and Decompose"
Resource allocation is fundamental for cloud systems to ensure efficient resource sharing
among tenants. However, the scale of such optimization problems has outgrown the …
SkyPIE: A Fast & Accurate Oracle for Object Placement
Cloud object stores offer vastly different price points for object storage as a function of
workload and geography. Poor object placement can thus lead to significant cost overheads …
Demystifying Data Management for Large Language Models
Navigating the intricacies of data management in the era of Large Language Models (LLMs)
presents both challenges and opportunities for database and data management …
Mixed-integer linear programming models for optimizing the inclusion of jobs in batches and the ordering of operations on them in …
KV Krotov - Informatsionno-upravliaiushchie sistemy (Information and Control Systems), 2024 - i-us.ru
Abstract Introduction: identifying efficient execution schedules for sets of same-type jobs
in pipeline systems involves searching for the best …
Empowering Large Language Models with Efficient and Automated Systems
Z Li - 2024 - search.proquest.com
Abstract Large Language Models (LLMs) have shown remarkable capabilities in a variety of
tasks, including chatting, programming, and searching. However, the high costs of LLMs are …
Timely and Efficient Resource Management in Networked Systems
Z Xu - 2024 - search.proquest.com
Resource management is ubiquitous in networked systems and ensures the effective
sharing of resources among various demands. Examples include traffic engineering, cluster …
[BOOK][B] Abstractions for Scaling Stateful Cloud Applications
P Kraft - 2023 - search.proquest.com
As the scale of both computing and data grows, developers are increasingly building
distributed stateful systems in the cloud. However, these systems are challenging to build at …