Why does the cloud stop computing? Lessons from hundreds of service outages
We conducted a cloud outage study (COS) of 32 popular Internet services. We analyzed
1247 headline news and public post-mortem reports that detail 597 unplanned outages that …
An empirical study on configuration errors in commercial and open source systems
Z Yin, X Ma, J Zheng, Y Zhou… - Proceedings of the …, 2011 - dl.acm.org
Configuration errors (i.e., misconfigurations) are among the dominant causes of system
failures. Their importance has inspired many research efforts on detecting, diagnosing, and …
The attack of the clones: A study of the impact of shared code on vulnerability patching
Vulnerability exploits remain an important mechanism for malware delivery, despite efforts to
speed up the creation of patches and improvements in software updating mechanisms …
KATCH: High-coverage testing of software patches
PD Marinescu, C Cadar - Proceedings of the 2013 9th Joint Meeting on …, 2013 - dl.acm.org
One of the distinguishing characteristics of software systems is that they evolve: new patches
are committed to software repositories and new versions are released to users on a …
Cloud software upgrades: Challenges and opportunities
I Neamtiu, T Dumitraş - … on the Maintenance and Evolution of …, 2011 - ieeexplore.ieee.org
The fast evolution pace for cloud computing software is on a collision course with our
growing reliance on cloud computing. On one hand, cloud software must have the agility to …
Automatic error elimination by horizontal code transfer across multiple applications
S Sidiroglou-Douskos, E Lahtinen, F Long… - Proceedings of the 36th …, 2015 - dl.acm.org
We present Code Phage (CP), a system for automatically transferring correct code from
donor applications into recipient applications that process the same inputs to successfully …
Inter-disciplinary research challenges in computer systems for the 2020s
The broad landscape of new technologies currently being explored makes the current times
very exciting for computer systems research. The community is actively researching an …
Experience report: Anomaly detection of cloud application operations using log and cloud metric correlation analysis
Failure of application operations is one of the main causes of system-wide outages in cloud
environments. This particularly applies to DevOps operations, such as backup …
Safe software updates via multi-version execution
Software systems are constantly evolving, with new versions and patches being released on
a continuous basis. Unfortunately, software updates present a high risk, with many releases …
Keepers of the machines: Examining how system administrators manage software updates for multiple machines
Keeping machines updated is crucial for maintaining system security. While recent studies
have investigated the software updating practices of end users, system administrators have …