The anatomy of big data computing
Advances in information technology and its widespread growth in several areas of business,
engineering, medical, and scientific studies are resulting in information/data explosion …
Apache Spark: a unified engine for big data processing
Communications of the ACM, November 2016, Vol. 59, No. 11, p. 56, contributed articles.
DOI:10.1145/2934664 This …
Security and privacy aspects in MapReduce on clouds: A survey
MapReduce is a programming system for distributed processing of large-scale data in an
efficient and fault tolerant manner on a private, public, or hybrid cloud. MapReduce is …
Polylogarithmic-time deterministic network decomposition and distributed derandomization
V Rozhoň, M Ghaffari - Proceedings of the 52nd Annual ACM SIGACT …, 2020 - dl.acm.org
We present a simple polylogarithmic-time deterministic distributed algorithm for network
decomposition. This improves on a celebrated 2^{O(√log n)}-time algorithm of Panconesi …
Scalable k-means++
Over half a century old and showing no signs of aging, k-means remains one of the most
popular data processing algorithms. As is well-known, a proper initialization of k-means is …
Communication steps for parallel query processing
We study the problem of computing conjunctive queries over large databases on parallel
architectures without shared storage. Using the structure of such a query q and the skew in …
Massively parallel computation: Algorithms and applications
The algorithms community has been modeling the underlying key features and constraints of
massively parallel frameworks and using these models to discover new algorithmic …
Counting triangles and the curse of the last reducer
S Suri, S Vassilvitskii - Proceedings of the 20th international conference …, 2011 - dl.acm.org
The clustering coefficient of a node in a social network is a fundamental measure that
quantifies how tightly-knit the community is around the node. Its computation can be reduced …
The k-clique densest subgraph problem
C Tsourakakis - Proceedings of the 24th international conference on …, 2015 - dl.acm.org
Numerous graph mining applications rely on detecting subgraphs which are large near-
cliques. Since formulations that are geared towards finding large near-cliques are hard and …