ReCache: Reactive caching for fast analytics over heterogeneous data

D Durner, V Leis, T Neumann - … of the 2021 International Conference on …, 2021 - dl.acm.org

Developers often prefer flexibility over upfront schema design, making semi-structured data
formats such as JSON increasingly popular. Large amounts of JSON data are therefore …

Cited by 40 Related articles All 10 versions

Efficient streaming subgraph isomorphism with graph neural networks

CT Duong, TD Hoang, H Yin, M Weidlich… - Proceedings of the …, 2021 - dl.acm.org

Queries to detect isomorphic subgraphs are important in graph-based data management.
While the problem of subgraph isomorphism search has received considerable attention for …

Cited by 35 Related articles All 6 versions

GridFormation: Towards self-driven online data partitioning using reinforcement learning

GC Durand, M Pinnecke, R Piriyev, M Mohsen… - Proceedings of the First …, 2018 - dl.acm.org

In this paper we define a research agenda to develop a general framework supporting
online autonomous tuning of data partitioning and layouts with a reinforcement learning …

Cited by 38 Related articles All 6 versions

Accelerating raw data analysis with the ACCORDA software and hardware architecture

Y Fang, C Zou, AA Chien - Proceedings of the VLDB Endowment, 2019 - dl.acm.org

The data science revolution and growing popularity of data lakes make efficient processing
of raw data increasingly important. To address this, we propose the ACCelerated Operators …

Cited by 29 Related articles All 4 versions

Resource monitoring framework for big raw data processing

M Patel, M Bhise - International Journal of Big Data …, 2024 - inderscienceonline.com

Scientific experiments, simulations, and modern applications generate large amounts of
data. Analysing resources required to process such big datasets is essential to identify …

Cited by 3 Related articles All 3 versions

Intermittent query processing

D Tang, Z Shang, AJ Elmore, S Krishnan… - Proceedings of the …, 2019 - dl.acm.org

Many applications ingest data in an intermittent, yet largely predictable, pattern. Existing
systems tend to ignore how data arrives when making decisions about how to update (or …

Cited by 26 Related articles All 5 versions

Generating application-specific data layouts for in-memory databases

C Yan, A Cheung - Proceedings of the VLDB Endowment, 2019 - dl.acm.org

Database applications are often developed with object-oriented languages while using
relational databases as the backend. To accelerate these applications, developers would …

Cited by 16 Related articles All 8 versions

ParPaRaw: Massively parallel parsing of delimiter-separated raw data

E Stehle, HA Jacobsen - arXiv preprint arXiv:1905.13415, 2019 - arxiv.org

Parsing is essential for a wide range of use cases, such as stream processing, bulk loading,
and in-situ querying of raw data. Yet, the compute-intense step often constitutes a major …

Cited by 17 Related articles All 5 versions

In-memory caching for multi-query optimization of data-intensive scalable computing workloads

M Pietro, D Carra, S Migliorini - Proceedings of the Workshops of the …, 2019 - iris.univr.it

In modern large-scale distributed systems, analytics jobs submitted by various users often
share similar work. Instead of optimizing jobs independently, multi-query optimization …

Cited by 15 Related articles All 6 versions

A cost-based storage format selector for materialized results in big data frameworks

RF Munir, A Abelló, O Romero, M Thiele… - Distributed and Parallel …, 2020 - Springer

Modern big data frameworks (such as Hadoop and Spark) allow multiple users to do
large-scale analysis simultaneously, by deploying data-intensive workflows (DIWs). These DIWs of …

Cited by 13 Related articles All 7 versions