Towards general and efficient online tuning for Spark

Y Li, H Jiang, Y Shen, Y Fang, X Yang, D Huang… - arXiv preprint arXiv …, 2023 - arxiv.org
The distributed data analytics system Spark is a common choice for processing massive
volumes of heterogeneous data, while it is challenging to tune its parameters to achieve …

OpenBox: A Python toolkit for generalized black-box optimization

H Jiang, Y Shen, Y Li, B Xu, S Du, W Zhang… - Journal of Machine …, 2024 - jmlr.org
Black-box optimization (BBO) has a broad range of applications, including automatic
machine learning, experimental design, and database knob tuning. However, users still face …

ByteCard: Enhancing Data Warehousing with Learned Cardinality Estimation

Y Han, H Wang, L Chen, Y Dong, X Chen, B Yu… - arXiv preprint arXiv …, 2024 - arxiv.org
Cardinality estimation is a critical component and a longstanding challenge in modern data
warehouses. ByteHouse, ByteDance's cloud-native engine for big data analysis in exabyte …

QHB+: Accelerated Configuration Optimization for Automated Performance Tuning of Spark SQL Applications

D Jang, H Yoon, K Jung, YD Chung - IEEE Access, 2024 - ieeexplore.ieee.org
Apache Spark stands out as a well-known solution for big data processing because of its
efficiency and rapid processing capabilities. One of its modules, Spark SQL, serves as a …

Avoiding Materialisation for Guarded Aggregate Queries

M Lanzinger, R Pichler, A Selzer - arXiv preprint arXiv:2406.17076, 2024 - arxiv.org
Database systems are often confronted with queries that join many tables but ultimately only
output comparatively small aggregate information. Despite all advances in query …

EMIT: Micro-Invasive Database Configuration Tuning

J Geng, H Wang, Y Yan - arXiv preprint arXiv:2406.00616, 2024 - arxiv.org
The process of database knob tuning has always been a challenging task. Recently,
database knob tuning methods have emerged as a promising solution to mitigate these …

ByteCard: Enhancing ByteDance's Data Warehouse with Learned Cardinality Estimation

Y Han, H Wang, L Chen, Y Dong, X Chen… - Companion of the 2024 …, 2024 - dl.acm.org
Cardinality estimation is a critical component and a longstanding challenge in modern data
warehouses. ByteHouse, ByteDance's cloud-native engine for extensive data analysis in …