A meta-summary of challenges in building products with ML components–collecting experiences from 4758+ practitioners

N Nahar, H Zhang, G Lewis, S Zhou… - 2023 IEEE/ACM 2nd …, 2023 - ieeexplore.ieee.org
Incorporating machine learning (ML) components into software products raises new
software-engineering challenges and exacerbates existing ones. Many researchers have …

On the design of AI-powered code assistants for notebooks

AM McNutt, C Wang, RA Deline… - Proceedings of the 2023 …, 2023 - dl.acm.org
AI-powered code assistants, such as Copilot, are quickly becoming a ubiquitous component
of contemporary coding contexts. Among these environments, computational notebooks …

Improving steering and verification in AI-assisted data analysis with interactive task decomposition

M Kazemitabaar, J Williams, I Drosos… - Proceedings of the 37th …, 2024 - dl.acm.org
LLM-powered tools like ChatGPT Data Analysis have the potential to help users tackle the
challenging task of data analysis programming, which requires expertise in data processing …

Dead or alive: Continuous data profiling for interactive data science

W Epperson, V Gorantla, D Moritz… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Profiling data by plotting distributions and analyzing summary statistics is a critical step
throughout data analysis. Currently, this process is manual and tedious since analysts must …

Enhancing comprehension and navigation in Jupyter notebooks with static analysis

APS Venkatesh, J Wang, L Li… - 2023 IEEE international …, 2023 - ieeexplore.ieee.org
Jupyter notebooks enable developers to interleave code snippets with rich-text and in-line
visualizations. Data scientists use Jupyter notebooks as the de facto standard for creating …

DistilKaggle: A distilled dataset of Kaggle Jupyter notebooks

M Mostafavi Ghahfarokhi, A Asgari… - Proceedings of the 21st …, 2024 - dl.acm.org
Jupyter notebooks have become indispensable tools for data analysis and processing in
various domains. However, despite their widespread use, there is a notable research gap in …

Assessing the Use of AutoML for Data-Driven Software Engineering

F Calefato, L Quaranta, F Lanubile… - 2023 ACM/IEEE …, 2023 - ieeexplore.ieee.org
Background. Due to the widespread adoption of Artificial Intelligence (AI) and Machine
Learning (ML) for building software applications, companies are struggling to recruit …

Static analysis driven enhancements for comprehension in machine learning notebooks

APS Venkatesh, S Sabu, M Chekkapalli… - Empirical Software …, 2024 - Springer
Jupyter notebooks have emerged as the predominant tool for data scientists to develop and
share machine learning solutions, primarily using Python as the programming language …

VulNet: Towards improving vulnerability management in the Maven ecosystem

Z Ma, S Mondal, TH Chen, H Zhang… - Empirical Software …, 2024 - Springer
Developers rely on software ecosystems such as Maven to manage and reuse external
libraries (i.e., dependencies). Due to the complexity of the used dependencies, developers …

Covamat: Functionality for variety reuse through a supporting tool

L Osycka, A Cechich, A Buccella, A Montenegro… - Conference on Cloud …, 2023 - Springer
Developing reusable Big Data Systems (BDSs) implies dealing with modeling
variety as reusable assets. Conceptually speaking, these assets might be similar to reusable …