Observing the clouds: a survey and taxonomy of cloud monitoring

JS Ward, A Barker - Journal of Cloud Computing, 2014 - Springer
Monitoring is an important aspect of designing and maintaining large-scale systems. Cloud
computing presents a unique set of challenges to monitoring including: on-demand …

Seer: Leveraging big data to navigate the complexity of performance debugging in cloud microservices

Y Gan, Y Zhang, K Hu, D Cheng, Y He… - Proceedings of the …, 2019 - dl.acm.org
Performance unpredictability is a major roadblock towards cloud adoption, and has
performance, cost, and revenue ramifications. Predictable performance is even more critical …

On the performance variability of production cloud services

A Iosup, N Yigitbasi, D Epema - 2011 11th IEEE/ACM …, 2011 - ieeexplore.ieee.org
Cloud computing is an emerging infrastructure paradigm that promises to eliminate the need
for companies to maintain expensive computing hardware. Through the use of virtualization …

The cat is out of the bag: cortical simulations with 10^9 neurons, 10^13 synapses

R Ananthanarayanan, SK Esser, HD Simon… - Proceedings of the …, 2009 - dl.acm.org
In the quest for cognitive computing, we have built a massively parallel cortical simulator,
C2, that incorporates a number of innovations in computation, memory, and communication …

HCloud: Resource-efficient provisioning in shared cloud systems

C Delimitrou, C Kozyrakis - Proceedings of the Twenty-First International …, 2016 - dl.acm.org
Cloud computing promises flexibility and high performance for users and cost efficiency for
operators. To achieve this, cloud providers offer instances of different sizes, both as long …

More for your money: exploiting performance heterogeneity in public clouds

B Farley, A Juels, V Varadarajan, T Ristenpart… - Proceedings of the …, 2012 - dl.acm.org
Infrastructure-as-a-system compute clouds such as Amazon's EC2 allow users to pay a flat
hourly rate to run their virtual machine (VM) on a server providing some combination of CPU …

Bolt: I know what you did last summer... in the cloud

C Delimitrou, C Kozyrakis - ACM SIGARCH Computer Architecture News, 2017 - dl.acm.org
Cloud providers routinely schedule multiple applications per physical host to increase
efficiency. The resulting interference on shared resources often leads to performance …

A provenance-based adaptive scheduling heuristic for parallel scientific workflows in clouds

D de Oliveira, KACS Ocaña, F Baião… - Journal of grid …, 2012 - Springer
In the last years, scientific workflows have emerged as a fundamental abstraction for
structuring and executing scientific experiments in computational environments. Scientific …

Topology-aware GPU scheduling for learning workloads in cloud environments

M Amaral, J Polo, D Carrera, S Seelam… - Proceedings of the …, 2017 - dl.acm.org
Recent advances in hardware, such as systems with multiple GPUs and their availability in
the cloud, are enabling deep learning in various domains including health care …

A comparative study of high-performance computing on the cloud

A Marathe, R Harris, DK Lowenthal… - Proceedings of the …, 2013 - dl.acm.org
The popularity of Amazon's EC2 cloud platform has increased in recent years. However,
many high-performance computing (HPC) users consider dedicated high-performance …