A survey on advancing the DBMS query optimizer: Cardinality estimation, cost model, and plan enumeration
The query optimizer is at the heart of database systems. The cost-based optimizer studied in this
paper is adopted in almost all current database systems. A cost-based optimizer introduces …
The art of balance: a RateupDB™ experience of building a CPU/GPU hybrid database product
R Lee, M Zhou, C Li, S Hu, J Teng, D Li… - Proceedings of the VLDB …, 2021 - dl.acm.org
GPU-accelerated database systems have been studied for more than 10 years, ranging from
prototyping development to industry products serving in multiple domains of data …
Predicate pushdown for data science pipelines
Predicate pushdown is a widely adopted query optimization. Existing systems and prior work
mostly use pattern-matching rules to decide when a predicate can be pushed through …
Phoebe: a learning-based checkpoint optimizer
Easy-to-use programming interfaces paired with cloud-scale processing engines have
enabled big data system users to author arbitrarily complex analytical jobs over massive …
SlabCity: Whole-Query Optimization Using Program Synthesis
Query rewriting is often a prerequisite for effective query optimization, particularly for poorly-
written queries. Prior work on query rewriting has relied on a set of "rules" based on syntactic …
Unshackling Database Benchmarking from Synthetic Workloads
Introducing new (learned) features into a DBMS requires considerable experimentation and
benchmarking to avoid regressions in production (customer) workloads. Using standard …
The Cosmos big data platform at Microsoft: Over a decade of progress and a decade to look forward
The twenty-first century has been dominated by the need for large-scale data processing,
marking the birth of big data platforms such as Cosmos. This paper describes the evolution …
Welding Natural Language Queries to Analytics IRs with LLMs.
From the recent momentum behind translating natural language to SQL (nl2sql), to
commercial product offerings such as Co-Pilot for Microsoft Fabric, Large Language Models …
Optimizing ETL Processes for Big Data Applications
HG Kola - International Journal of Engineering and Management …, 2024 - indianjournals.com
Optimizing large-scale data processing has become crucial in the area of data management
due to the constantly growing quantity and complexity of data. Big data analysis involves …
Computation reuse via fusion in Amazon Athena
Amazon Athena is a serverless, interactive query service that allows efficiently analyzing
large volumes of data stored in Amazon S3 using ANSI SQL. Some design choices in the …