Scalpel: The Python static analysis framework

X Wang, Y Wang, Y Wan, J Wang, P Zhou, L Li… - arXiv preprint arXiv …, 2022 - arxiv.org

Recent years have witnessed increasing interest in code representation learning, which
aims to represent the semantics of source code into distributed vectors. Currently, various …

Cited by 35 · Related articles · All 6 versions

Models are codes: Towards measuring malicious code poisoning attacks on pre-trained model hubs

J Zhao, S Wang, Y Zhao, X Hou, K Wang… - 2024 39th IEEE/ACM …, 2024 - ieeexplore.ieee.org

The proliferation of pre-trained models (PTMs) and datasets has led to the emergence of
centralized model hubs like Hugging Face, which facilitate collaborative development and …

Cited by 5 · Related articles · All 6 versions

Enhancing comprehension and navigation in Jupyter notebooks with static analysis

APS Venkatesh, J Wang, L Li… - 2023 IEEE international …, 2023 - ieeexplore.ieee.org

Jupyter notebooks enable developers to interleave code snippets with rich-text and in-line
visualizations. Data scientists use Jupyter notebook as the de-facto standard for creating …

Cited by 14 · Related articles · All 6 versions

PeaTMOSS: A dataset and initial analysis of pre-trained models in open-source software

W Jiang, J Yasmin, J Jones, N Synovic… - 2024 IEEE/ACM 21st …, 2024 - ieeexplore.ieee.org

The development and training of deep learning models have become increasingly costly
and complex. Consequently, software engineers are adopting pre-trained models (PTMs) for …

Cited by 14 · Related articles · All 7 versions

Data leakage in notebooks: Static detection and better processes

C Yang, RA Brower-Sinning, G Lewis… - Proceedings of the 37th …, 2022 - dl.acm.org

Data science pipelines to train and evaluate models with machine learning may contain
bugs just like any other code. Leakage between training and test data can lead to …

Cited by 13 · Related articles · All 6 versions

Static analysis driven enhancements for comprehension in machine learning notebooks

APS Venkatesh, S Sabu, M Chekkapalli… - Empirical Software …, 2024 - Springer

Jupyter notebooks have emerged as the predominant tool for data scientists to develop and
share machine learning solutions, primarily using Python as the programming language …

Cited by 1 · Related articles · All 2 versions

Investigating and Detecting Silent Bugs in PyTorch Programs

S Hong, H Sun, X Gao, SH Tan - 2024 IEEE International …, 2024 - ieeexplore.ieee.org

Deep Learning (DL) has been widely applied in various fields. Unlike traditional software,
DL programs possess the “black box” characteristic that can make it challenging for …

Cited by 2 · Related articles · All 2 versions

Exploring Hyperparameter Usage and Tuning in Machine Learning Research

S Simon, N Kolyada, C Akiki, M Potthast… - 2023 IEEE/ACM 2nd …, 2023 - ieeexplore.ieee.org

The success of machine learning (ML) models depends on careful experimentation and
optimization of their hyperparameters. Tuning can affect the reliability and accuracy of a …

Cited by 10 · Related articles · All 6 versions

Hard to Read and Understand Pythonic Idioms? DeIdiom and Explain Them in Non-Idiomatic Equivalent Code

Z Zhang, Z Xing, D Zhao, Q Lu, X Xu… - Proceedings of the IEEE …, 2024 - dl.acm.org

The Python community strives to design pythonic idioms so that Python users can achieve
their intent in a more concise and efficient way. According to our analysis of 154 questions …

Cited by 3 · Related articles · All 4 versions

Complex Python features in the wild

Y Yang, A Milanova, M Hirzel - … of the 19th International Conference on …, 2022 - dl.acm.org

While Python is increasingly popular, program analysis tooling for Python is lagging. This is
due, in part, to complex features of the Python language---features with difficult to …

Cited by 9 · Related articles · All 6 versions