Execution primitives for scalable joins and aggregations in map reduce

Y Xu, N Chen, A Fernandez, O Sinno… - Proceedings of the 21th …, 2015 - dl.acm.org

A/B testing, also known as bucket testing, split testing, or controlled experiment, is a
standard way to evaluate user engagement or satisfaction from a new service, feature, or …

Cited by 283 · Related articles · All 3 versions

From theory to practice: Efficient join query evaluation in a parallel database system

S Chu, M Balazinska, D Suciu - Proceedings of the 2015 ACM SIGMOD …, 2015 - dl.acm.org

Big data analytics often requires processing complex queries using massive parallelism,
where the main performance metric is the communication cost incurred during data …

Cited by 150 · Related articles · All 5 versions

Towards scalability and data skew handling in groupby-joins using mapreduce model

MAH Hassan, M Bamha - Procedia Computer Science, 2015 - Elsevier

For over a decade, MapReduce has become the leading programming model for parallel
and massive processing of large volumes of data. This has been driven by the development …

Cited by 19 · Related articles · All 8 versions

Sasm: Improving spark performance with adaptive skew mitigation

J Yu, H Chen, F Hu - … conference on progress in informatics and …, 2015 - ieeexplore.ieee.org

Skew is a common phenomenon widely existing in parallel computing platforms, resulting in
slowing down the entire complete time and many idle resources. We present Spark Adaptive …

Cited by 16 · Related articles

An efficient MapReduce cube algorithm for varied data distributions

T Milo, E Altshuler - Proceedings of the 2016 international conference on …, 2016 - dl.acm.org

Data cubes allow users to discover insights from their data and are commonly used in data
analysis. While very useful, the data cube is expensive to compute, in particular when the …

Cited by 11 · Related articles · All 3 versions

Computing marginals using MapReduce

FN Afrati, S Sharma, JR Ullman, JD Ullman - Journal of Computer and …, 2018 - Elsevier

We consider the problem of computing data-cube marginals by a single round of
MapReduce, focusing on the relationship between the reducer size and the replication rate …

Cited by 8 · Related articles · All 16 versions

Materialized views in distributed key-value stores

J Adler - 2020 - mediatum.ub.tum.de

Distributed key-value stores have become the solution of choice for warehousing large
volumes of data. However, their architecture is not suitable for real-time analytics. To …

Cited by 1 · Related articles · All 2 versions

Scalability and optimisation of groupby-joins in mapreduce

M Bamha, MAH Hassan - Technical report LIFO, Université d' …, 2015 - researchgate.net

For over a decade, MapReduce has become the leading programming model for parallel
and massive processing of large volumes of data. This has been driven by the development …

Cited by 2 · Related articles · All 2 versions

Scalability and Optimisation of GroupBy-Joins in MapReduce

M Bamha, MAH Hassan - 2015 - hal.science

For over a decade, MapReduce has become the leading programming model for parallel
and massive processing of large volumes of data. This has been driven by the development …

Cited by 1 · Related articles · All 3 versions

Computing Marginals Using MapReduce: Keynote talk paper

FN Afrati, S Sharma, JD Ullman… - Proceedings of the 20th …, 2016 - dl.acm.org

We consider the problem of computing the data-cube marginals of a fixed order k (i.e., all
marginals that aggregate over k dimensions), using a single round of MapReduce. The …