Big Data Management: Concepts, Techniques and Challenges

Meng Xiaofeng (孟小峰), Ci Xiang (慈祥) - 2013 - idke.ruc.edu.cn
Big Data Management: Concepts, Techniques and Challenges. Meng Xiaofeng, Ci Xiang (School of
Information, Renmin University of China, Beijing 100872) …

The family of mapreduce and large-scale data processing systems

S Sakr, A Liu, AG Fayoumi - ACM Computing Surveys (CSUR), 2013 - dl.acm.org
In the last two decades, the continuous increase of computational power has produced an
overwhelming flow of data which has called for a paradigm shift in the computing …

A comprehensive view of Hadoop research—A systematic literature review

I Polato, R Ré, A Goldman, F Kon - Journal of Network and Computer …, 2014 - Elsevier
Context: In recent years, the valuable knowledge that can be retrieved from petabyte scale
datasets–known as Big Data–led to the development of solutions to process information …

Big data: A survey

M Chen, S Mao, Y Liu - Mobile networks and applications, 2014 - Springer
In this paper, we review the background and state-of-the-art of big data. We first introduce
the general background of big data and review related technologies, such as cloud …

Toward scalable systems for big data analytics: A technology tutorial

H Hu, Y Wen, TS Chua, X Li - IEEE access, 2014 - ieeexplore.ieee.org
Recent technological advancements have led to a deluge of data from distinctive domains
(e.g., health care and scientific sensors, user-generated data, Internet and financial …

Discretized streams: Fault-tolerant streaming computation at scale

M Zaharia, T Das, H Li, T Hunter, S Shenker… - Proceedings of the …, 2013 - dl.acm.org
Many "big data" applications must act on data in real time. Running these applications at
ever-larger scales requires parallel platforms that automatically handle faults and stragglers …

Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing

M Zaharia, M Chowdhury, T Das, A Dave, J Ma… - 9th USENIX symposium …, 2012 - usenix.org
We present Resilient Distributed Datasets (RDDs), a distributed memory abstraction that lets
programmers perform in-memory computations on large clusters in a fault-tolerant manner …

Discretized streams: an efficient and fault-tolerant model for stream processing on large clusters

M Zaharia, T Das, H Li, S Shenker, I Stoica - 4th USENIX Workshop on …, 2012 - usenix.org
Many important "big data" applications need to process data arriving in real time. However,
current programming models for distributed stream processing are relatively low-level, often …

T-storm: Traffic-aware online scheduling in storm

J Xu, Z Chen, J Tang, S Su - 2014 IEEE 34th International …, 2014 - ieeexplore.ieee.org
Storm has emerged as a promising computation platform for stream data processing. In this
paper, we first show inefficiencies of the current practice of Storm scheduling and challenges …

Kineograph: taking the pulse of a fast-changing and connected world

R Cheng, J Hong, A Kyrola, Y Miao, X Weng… - Proceedings of the 7th …, 2012 - dl.acm.org
Kineograph is a distributed system that takes a stream of incoming data to construct a
continuously changing graph, which captures the relationships that exist in the data feed. As …