SystemDS: A declarative machine learning system for the end-to-end data science lifecycle
M Boehm, I Antonov, S Baunsgaard, M Dokter… - arXiv preprint arXiv …, 2019 - arxiv.org
Machine learning (ML) applications become increasingly common in many domains. ML
systems to execute these workloads include numerical computing frameworks and libraries …
How to architect a query compiler, revisited
To leverage modern hardware platforms to their fullest, more and more database systems
embrace compilation of query plans to native code. In the research community, there is an …
Babelfish: Efficient execution of polyglot queries
Today's users of data processing systems come from different domains, have different levels
of expertise, and prefer different programming languages. As a result, analytical workload …
HetExchange: Encapsulating heterogeneous CPU-GPU parallelism in JIT compiled engines
P Chrysogelos, M Karpathiotakis… - Proceedings of the …, 2019 - infoscience.epfl.ch
Modern server hardware is increasingly heterogeneous as hardware accelerators, such as
GPUs, are used together with multicore CPUs to meet the computational demands of …
Filter before you parse: Faster analytics on raw data with Sparser
Exploratory big data applications often run on raw unstructured or semi-structured data
formats, such as JSON files or text logs. These applications can spend 80-90% of their …
JSON tiles: Fast analytics on semi-structured data
Developers often prefer flexibility over upfront schema design, making semi-structured data
formats such as JSON increasingly popular. Large amounts of JSON data are therefore …
On optimizing operator fusion plans for large-scale machine learning in SystemML
Many large-scale machine learning (ML) systems allow specifying custom ML algorithms by
means of linear algebra programs, and then automatically generate efficient execution …
The case for heterogeneous HTAP
R Appuswamy, M Karpathiotakis… - … on Innovative Data …, 2017 - infoscience.epfl.ch
Modern database engines balance the demanding requirements of mixed, hybrid
transactional and analytical processing (HTAP) workloads by relying on i) global shared …
Adaptive partitioning and indexing for in situ query processing
The constant flux of data and queries alike has been pushing the boundaries of data
analysis systems. The increasing size of raw data files has made data loading an expensive …
Slalom: Coasting through raw data via adaptive partitioning and indexing
The constant flux of data and queries alike has been pushing the boundaries of data
analysis systems. The increasing size of raw data files has made data loading an expensive …