A software engineering perspective on engineering machine learning systems: State of the art and challenges

G Giray - Journal of Systems and Software, 2021 - Elsevier
Context: Advancements in machine learning (ML) lead to a shift from the traditional view of
software development, where algorithms are hard-coded by humans, to ML systems …

Asset Management in Machine Learning: State-of-research and State-of-practice

S Idowu, D Strüber, T Berger - ACM Computing Surveys, 2022 - dl.acm.org
Machine learning components are essential for today's software systems, causing a need to
adapt traditional software engineering practices when developing machine-learning-based …

Management of machine learning lifecycle artifacts: A survey

M Schlegel, KU Sattler - ACM SIGMOD Record, 2023 - dl.acm.org
The explorative and iterative nature of developing and operating ML applications leads to a
variety of artifacts, such as datasets, features, models, hyperparameters, metrics, software …

Garbage in, garbage out? Do machine learning application papers in social computing report where human-labeled training data comes from?

RS Geiger, K Yu, Y Yang, M Dai, J Qiu, R Tang… - Proceedings of the …, 2020 - dl.acm.org
Many machine learning projects for new application areas involve teams of humans who
label data for a particular purpose, from hiring crowdworkers to the paper's authors labeling …

Automated end-to-end management of the modeling lifecycle in deep learning

G Gharibi, V Walunj, R Nekadi, R Marri… - Empirical Software …, 2021 - Springer
Deep learning has improved the state-of-the-art results in an ever-growing number of
domains. This success heavily relies on the development and training of deep learning …

"Garbage In, Garbage Out" Revisited: What Do Machine Learning Application Papers Report About Human-Labeled Training Data?

RS Geiger, D Cope, J Ip, M Lotosh, A Shah… - arXiv preprint arXiv …, 2021 - arxiv.org
Supervised machine learning, in which models are automatically derived from labeled
training data, is only as good as the quality of that data. This study builds on prior work that …

Credibility of scientific information on social media: Variation by platform, genre and presence of formal credibility cues

C Boothby, D Murray, AP Waggy, A Tsou… - Quantitative Science …, 2021 - direct.mit.edu
Responding to calls to take a more active role in communicating their research findings,
scientists are increasingly using open online platforms, such as Twitter, to engage in science …

Orfeon: An AIOps framework for the goal-driven operationalization of distributed analytical pipelines

J Díaz-de-Arcaya, AI Torre-Bastida, R Miñón… - Future Generation …, 2023 - Elsevier
The use of Artificial Intelligence solutions keeps raising in the business domain.
However, this adoption has not brought the expected results to companies so far. There are …

MaskSearch: Querying Image Masks at Scale

D He, J Zhang, M Daum, A Ratner… - arXiv preprint arXiv …, 2023 - arxiv.org
Machine learning tasks over image databases often generate masks that annotate image
content (e.g., saliency maps, segmentation maps, depth maps) and enable a variety of …

Provenance supporting hyperparameter analysis in deep neural networks

D Pina, L Kunstmann, D de Oliveira, P Valduriez… - International …, 2020 - Springer
The duration of the life cycle in deep neural networks (DNN) depends on the data
configuration decisions that lead to success in obtaining models. Analyzing …