Big Data Management: Concepts, Techniques and Challenges
Meng Xiaofeng (孟小峰), Ci Xiang (慈祥) - 2013 - idke.ruc.edu.cn
Meng Xiaofeng, Ci Xiang (School of Information, Renmin University of China, Beijing 100872).
Big Data Management: Concepts, Techniques and Challenges. Meng …
The family of MapReduce and large-scale data processing systems
In the last two decades, the continuous increase of computational power has produced an
overwhelming flow of data which has called for a paradigm shift in the computing …
A comprehensive view of Hadoop research—A systematic literature review
Context: In recent years, the valuable knowledge that can be retrieved from petabyte scale
datasets–known as Big Data–led to the development of solutions to process information …
Big data: A survey
In this paper, we review the background and state-of-the-art of big data. We first introduce
the general background of big data and review related technologies, such as cloud …
Toward scalable systems for big data analytics: A technology tutorial
Recent technological advancements have led to a deluge of data from distinctive domains
(e.g., health care and scientific sensors, user-generated data, Internet and financial …
Discretized streams: Fault-tolerant streaming computation at scale
Many "big data" applications must act on data in real time. Running these applications at
ever-larger scales requires parallel platforms that automatically handle faults and stragglers …
Resilient distributed datasets: A Fault-Tolerant abstraction for In-Memory cluster computing
We present Resilient Distributed Datasets (RDDs), a distributed memory abstraction that lets
programmers perform in-memory computations on large clusters in a fault-tolerant manner …
Discretized streams: an efficient and Fault-Tolerant model for stream processing on large clusters
Many important “big data” applications need to process data arriving in real time. However,
current programming models for distributed stream processing are relatively low-level, often …
T-Storm: Traffic-aware online scheduling in Storm
Storm has emerged as a promising computation platform for stream data processing. In this
paper, we first show inefficiencies of the current practice of Storm scheduling and challenges …
Kineograph: taking the pulse of a fast-changing and connected world
Kineograph is a distributed system that takes a stream of incoming data to construct a
continuously changing graph, which captures the relationships that exist in the data feed. As …