Fusing data with correlations

[HTML] Snorkel: Rapid training data creation with weak supervision

A Ratner, SH Bach, H Ehrenberg, J Fries… - Proceedings of the …, 2017 - ncbi.nlm.nih.gov

Labeling training data is increasingly the largest bottleneck in deploying machine learning
systems. We present Snorkel, a first-of-its-kind system that enables users to train state-of-the …

Cited by 1038 · Related articles · All 20 versions

[HTML] springer.com

[HTML] Snorkel: rapid training data creation with weak supervision

A Ratner, SH Bach, H Ehrenberg, J Fries, S Wu, C Ré - The VLDB Journal, 2020 - Springer

Labeling training data is increasingly the largest bottleneck in deploying machine learning
systems. We present Snorkel, a first-of-its-kind system that enables users to train state-of-the …

Cited by 307 · Related articles · All 10 versions

[PDF] arxiv.org

A survey on truth discovery

Y Li, J Gao, C Meng, Q Li, L Su, B Zhao… - ACM Sigkdd …, 2016 - dl.acm.org

Thanks to information explosion, data for the objects of interest can be collected from
increasingly more sources. However, for the same object, there usually exist conflicts among …

Cited by 534 · Related articles · All 11 versions

[PDF] vldb.org

Big data integration

XL Dong, D Srivastava - 2013 IEEE 29th international …, 2013 - ieeexplore.ieee.org

The Big Data era is upon us: data is being generated, collected and analyzed at an
unprecedented scale, and data-driven decision making is sweeping through all aspects of …

Cited by 806 · Related articles · All 18 versions

[PDF] tandfonline.com

Debugging inputs

L Kirschner, E Soremekun, A Zeller - Proceedings of the ACM/IEEE 42nd …, 2020 - dl.acm.org

When a program fails to process an input, it need not be the program code that is at fault. It
can also be that the input data is faulty, for instance as a result of data corruption. To get the …

Cited by 10916 · Related articles · All 82 versions

[HTML] nih.gov

[HTML] Snuba: Automating weak supervision to label training data

P Varma, C Ré - … of the VLDB Endowment. International Conference …, 2018 - ncbi.nlm.nih.gov

As deep learning models are applied to increasingly diverse problems, a key bottleneck is
gathering enough high-quality training labels tailored to each task. Users therefore turn to …

Cited by 191 · Related articles · All 6 versions

[PDF] arxiv.org

Knowledge-based trust: Estimating the trustworthiness of web sources

XL Dong, E Gabrilovich, K Murphy, V Dang… - arXiv preprint arXiv …, 2015 - arxiv.org

The quality of web sources has been traditionally evaluated using exogenous signals such
as the hyperlink structure of the graph. We propose a new approach that relies on …

Cited by 321 · Related articles · All 26 versions

[PDF] arxiv.org

Truth discovery algorithms: An experimental evaluation

DA Waguih, L Berti-Equille - arXiv preprint arXiv:1409.6428, 2014 - arxiv.org

A fundamental problem in data fusion is to determine the veracity of multi-source data in
order to resolve conflicts. While previous work in truth discovery has proved to be useful in …

Cited by 85 · Related articles · All 6 versions

[PDF] arxiv.org

From data fusion to knowledge fusion

XL Dong, E Gabrilovich, G Heitz, W Horn… - arXiv preprint arXiv …, 2015 - arxiv.org

The task of data fusion is to identify the true values of data items (e.g., the true date of
birth for Tom Cruise) among multiple observed values drawn from different sources …

Cited by 341 · Related articles · All 19 versions

[PDF] tsinghua.edu.cn

QASCA: A quality-aware task assignment system for crowdsourcing applications

Y Zheng, J Wang, G Li, R Cheng, J Feng - Proceedings of the 2015 ACM …, 2015 - dl.acm.org

A crowdsourcing system, such as the Amazon Mechanical Turk (AMT), provides a platform
for a large number of questions to be answered by Internet workers. Such systems have …

Cited by 233 · Related articles · All 19 versions
