Baldur: Whole-proof generation and repair with large language models
Formally verifying software is a highly desirable but labor-intensive task. Recent work has
developed methods to automate formal verification using proof assistants, such as Coq and …
Fairness testing: testing software for discrimination
This paper defines software fairness and discrimination and develops a testing-based
method for measuring if and how much software discriminates, focusing on causality in …
Debugging inputs
L Kirschner, E Soremekun, A Zeller - Proceedings of the ACM/IEEE 42nd …, 2020 - dl.acm.org
When a program fails to process an input, it need not be the program code that is at fault. It
can also be that the input data is faulty, for instance as a result of data corruption. To get the …
Trade-offs in continuous integration: assurance, security, and flexibility
Continuous integration (CI) systems automate the compilation, building, and testing of
software. Despite CI being a widely used activity in software engineering, we do not know …
Software fairness
A goal of software engineering research is advancing software quality and the success of
the software engineering process. However, while recent studies have demonstrated a new …
Surfacing visualization mirages
Dirty data and deceptive design practices can undermine, invert, or invalidate the purported
messages of charts and graphs. These failures can arise silently: a conclusion derived from …
A large-scale longitudinal study of flaky tests
Flaky tests are tests that can non-deterministically pass or fail for the same code version.
These tests undermine regression testing efficiency, because developers cannot easily …
Themis: Automatically testing software for discrimination
Bias in decisions made by modern software is becoming a common and serious problem.
We present Themis, an automated test suite generator to measure two types of …
Vizlinter: A linter and fixer framework for data visualization
Despite the rising popularity of automated visualization tools, existing systems tend to
provide direct results which do not always fit the input data or meet visualization …
Quality of automated program repair on real-world defects
Automated program repair is a promising approach to reducing the costs of manual
debugging and increasing software quality. However, recent studies have shown that …