Assess and summarize: Improve outage understanding with large language models

P Jin, S Zhang, M Ma, H Li, Y Kang, L Li, Y Liu… - Proceedings of the 31st …, 2023 - dl.acm.org
Cloud systems have become increasingly popular in recent years due to their flexibility and
scalability. Each time cloud computing applications and services hosted on the cloud are …

Codec: Cost-effective duration prediction system for deadline scheduling in the cloud

H Li, M Ma, Y Liu, S Qin, B Qiao, R Yao… - 2023 IEEE 34th …, 2023 - ieeexplore.ieee.org
Modern cloud platforms allow customers to flexibly allocate or release computing resources.
One crucial scenario is how to drive existing VMs to a specific state by a given deadline in a …

MonitorAssistant: Simplifying Cloud Service Monitoring via Large Language Models

Z Yu, M Ma, C Zhang, S Qin, Y Kang, C Bansal… - … Proceedings of the …, 2024 - dl.acm.org
In large-scale cloud service systems, monitoring metric data and conducting anomaly
detection is an important way to maintain reliability and stability. However, great disparity …