A model of computation for MapReduce

R Kune, PK Konugurthi, A Agarwal… - Software: Practice …, 2016 - Wiley Online Library

Advances in information technology and its widespread growth in several areas of business,
engineering, medical, and scientific studies are resulting in information/data explosion …

Cited by 308 · Related articles · All 15 versions

Apache Spark: a unified engine for big data processing

M Zaharia, RS Xin, P Wendell, T Das… - Communications of the …, 2016 - dl.acm.org

Communications of the ACM, November 2016, Vol. 59, No. 11, p. 56, contributed articles.
DOI:10.1145/2934664 This …

Cited by 3168 · Related articles · All 15 versions

Security and privacy aspects in MapReduce on clouds: A survey

P Derbeko, S Dolev, E Gudes, S Sharma - Computer science review, 2016 - Elsevier

MapReduce is a programming system for distributed processing of large-scale data in an
efficient and fault tolerant manner on a private, public, or hybrid cloud. MapReduce is …

Cited by 111 · Related articles · All 13 versions

Polylogarithmic-time deterministic network decomposition and distributed derandomization

V Rozhoň, M Ghaffari - Proceedings of the 52nd Annual ACM SIGACT …, 2020 - dl.acm.org

We present a simple polylogarithmic-time deterministic distributed algorithm for network
decomposition. This improves on a celebrated 2^{O(√log n)}-time algorithm of Panconesi …

Cited by 197 · Related articles · All 7 versions

Scalable k-means++

B Bahmani, B Moseley, A Vattani, R Kumar… - arXiv preprint arXiv …, 2012 - arxiv.org

Over half a century old and showing no signs of aging, k-means remains one of the most
popular data processing algorithms. As is well-known, a proper initialization of k-means is …

Cited by 990 · Related articles · All 29 versions

Communication steps for parallel query processing

P Beame, P Koutris, D Suciu - Journal of the ACM (JACM), 2017 - dl.acm.org

We study the problem of computing conjunctive queries over large databases on parallel
architectures without shared storage. Using the structure of such a query q and the skew in …

Cited by 375 · Related articles · All 17 versions

Massively parallel computation: Algorithms and applications

S Im, R Kumar, S Lattanzi, B Moseley… - … and Trends® in …, 2023 - nowpublishers.com

The algorithms community has been modeling the underlying key features and constraints of
massively parallel frameworks and using these models to discover new algorithmic …

Cited by 6 · Related articles · All 7 versions

[BOOK][B] Data-intensive text processing with MapReduce

J Lin, C Dyer - 2022 - books.google.com

Our world is being revolutionized by data-driven methods: access to large amounts of data
has generated new insights and opened exciting new opportunities in commerce, science …

Cited by 931 · Related articles · All 38 versions

Counting triangles and the curse of the last reducer

S Suri, S Vassilvitskii - Proceedings of the 20th international conference …, 2011 - dl.acm.org

The clustering coefficient of a node in a social network is a fundamental measure that
quantifies how tightly-knit the community is around the node. Its computation can be reduced …

Cited by 544 · Related articles · All 17 versions

The k-clique densest subgraph problem

C Tsourakakis - Proceedings of the 24th international conference on …, 2015 - dl.acm.org

Numerous graph mining applications rely on detecting subgraphs which are large
near-cliques. Since formulations that are geared towards finding large near-cliques are hard and …

Cited by 274 · Related articles · All 6 versions