LogLens: A real-time log analysis system

B Debnath, M Solaimani, MAG Gulzar… - 2018 IEEE 38th …, 2018 - ieeexplore.ieee.org
Administrators of most user-facing systems depend on periodic log data to get an idea of the
health and status of production applications. Logs report information, which is crucial to …

G-miner: an efficient task-oriented graph mining system

H Chen, M Liu, Y Zhao, X Yan, D Yan… - Proceedings of the …, 2018 - dl.acm.org
Graph mining is one of the most important areas in data mining. However, scalable solutions
for graph mining are still lacking as existing studies focus on sequential algorithms. While …

Reachability and time-based path queries in temporal graphs

H Wu, Y Huang, J Cheng, J Li… - 2016 IEEE 32nd …, 2016 - ieeexplore.ieee.org
A temporal graph is a graph in which vertices communicate with each other at specific times,
e.g., A calls B at 11 a.m. and talks for 7 minutes, which is modeled by an edge from A to B with …

Rheem: enabling cross-platform data processing: may the big data be with you!

D Agrawal, S Chawla, B Contreras-Rojas… - Proceedings of the …, 2018 - dl.acm.org
Solving business problems increasingly requires going beyond the limits of a single data
processing platform (platform for short), such as Hadoop or a DBMS. As a result …

FlexPS: Flexible parallelism control in parameter server architecture

Y Huang, T Jin, Y Wu, Z Cai, X Yan, F Yang… - Proceedings of the …, 2018 - dl.acm.org
As a general abstraction for coordinating the distributed storage and access of model
parameters, the parameter server (PS) architecture enables distributed machine learning to …

An experimental comparison of partitioning strategies in distributed graph processing

S Verma - 2017 - ideals.illinois.edu
In this thesis, we study the problem of choosing among partitioning strategies in distributed
graph processing systems. To this end, we evaluate and characterize both the performance …

AdaptDB: adaptive partitioning for distributed joins

Y Lu - 2017 - dspace.mit.edu
Big data analytics often involves complex join queries over two or more tables. Such join
processing is expensive in a distributed setting both because large amounts of data must be …

Type-safe dynamic placement with first-class placed values

G Zakhour, P Weisenburger… - Proceedings of the ACM on …, 2023 - dl.acm.org
Several distributed programming language solutions have been proposed to reason about
the placement of data, computations, and peers interaction. Such solutions include, among …

Improving resource utilization by timely fine-grained scheduling

T Jin, Z Cai, B Li, C Zheng, G Jiang… - Proceedings of the …, 2020 - dl.acm.org
Monotask is a unit of work that uses only a single type of resource (e.g., CPU, network, disk
I/O). While monotask was primarily introduced as a means to reason about job performance …

DRPS: efficient disk-resident parameter servers for distributed machine learning

Z Song, Y Gu, Z Wang, G Yu - Frontiers of Computer Science, 2022 - Springer
Parameter server (PS) as the state-of-the-art distributed framework for large-scale iterative
machine learning tasks has been extensively studied. However, existing PS-based systems …