On the design of AI-powered code assistants for notebooks

AM McNutt, C Wang, RA Deline… - Proceedings of the 2023 …, 2023 - dl.acm.org
AI-powered code assistants, such as Copilot, are quickly becoming a ubiquitous component
of contemporary coding contexts. Among these environments, computational notebooks …

Notable: On-the-fly assistant for data storytelling in computational notebooks

H Li, L Ying, H Zhang, Y Wu, H Qu… - Proceedings of the 2023 …, 2023 - dl.acm.org
Computational notebooks are widely used for data analysis. Their interleaved displays of
code and execution results (e.g., visualizations) are welcomed since they enable iterative …

Causalvis: Visualizations for causal inference

G Guo, E Karavani, A Endert, BC Kwon - … of the 2023 CHI conference on …, 2023 - dl.acm.org
Causal inference is a statistical paradigm for quantifying causal effects using observational
data. It is a complex process, requiring multiple steps, iterations, and collaborations with …

Dead or alive: Continuous data profiling for interactive data science

W Epperson, V Gorantla, D Moritz… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Profiling data by plotting distributions and analyzing summary statistics is a critical step
throughout data analysis. Currently, this process is manual and tedious since analysts must …

WaitGPT: Monitoring and steering conversational LLM agent in data analysis with on-the-fly code visualization

L Xie, C Zheng, H Xia, H Qu, C Zhu-Tian - Proceedings of the 37th …, 2024 - dl.acm.org
Large language models (LLMs) support data analysis through conversational user
interfaces, as exemplified in OpenAI's ChatGPT (formally known as Advanced Data Analysis …

How Do Analysts Understand and Verify AI-Assisted Data Analyses?

K Gu, R Shang, T Althoff, C Wang… - Proceedings of the CHI …, 2024 - dl.acm.org
Data analysis is challenging as it requires synthesizing domain knowledge, statistical
expertise, and programming skills. Assistants powered by large language models (LLMs) …

ydata-profiling: Accelerating data-centric AI with high-quality data

F Clemente, GM Ribeiro, A Quemy, MS Santos… - Neurocomputing, 2023 - Elsevier
ydata-profiling is an open-source Python package for advanced exploratory data
analysis that enables users to generate data profiling reports in a simple, fast, and efficient …

Colaroid: A literate programming approach for authoring explorable multi-stage tutorials

AY Wang, A Head, AG Zhang, S Oney… - Proceedings of the 2023 …, 2023 - dl.acm.org
Multi-stage programming tutorials are key learning resources for programmers, using
progressive incremental steps to teach them how to build larger software systems. A good …

Aspirations and practice of ML model documentation: Moving the needle with nudging and traceability

A Bhat, A Coursey, G Hu, S Li, N Nahar… - Proceedings of the …, 2023 - dl.acm.org
The documentation practice for machine-learned (ML) models often falls short of established
practices for traditional software, which impedes model accountability and inadvertently …

How data analysts use a visualization grammar in practice

X Pu, M Kay - Proceedings of the 2023 CHI Conference on Human …, 2023 - dl.acm.org
Visualization grammars, often based on the Grammar of Graphics (GoG), have much
potential for augmenting data analysis in a programming environment. However, we do not …