Data warehouse systems

A Vaisman, E Zimányi - Data-Centric Systems and Applications, 2014 - Springer
Since the late 1970s, relational database technology has been adopted by most
organizations to store their essential data. However, nowadays, the needs of these …

Incremental knowledge base construction using DeepDive

J Shin, S Wu, F Wang, C De Sa… - Proceedings of the …, 2015 - ncbi.nlm.nih.gov
Populating a database with unstructured information is a long-standing problem in industry
and research that encompasses problems of extraction, cleaning, and integration. Recent …

Jaql: A scripting language for large scale semistructured data analysis

KS Beyer, V Ercegovac, R Gemulla, A Balmin… - Proceedings of the …, 2011 - dl.acm.org
This paper describes Jaql, a declarative scripting language for analyzing large
semistructured datasets in parallel using Hadoop's MapReduce framework. Jaql is currently …

From information to knowledge: harvesting entities and relationships from web sources

G Weikum, M Theobald - Proceedings of the twenty-ninth ACM SIGMOD …, 2010 - dl.acm.org
There are major trends to advance the functionality of search engines to a more expressive
semantic level. This is enabled by the advent of knowledge-sharing communities such as …

Facilitating knowledge sharing from domain experts to data scientists for building NLP models

S Park, AY Wang, B Kawas, QV Liao… - Proceedings of the 26th …, 2021 - dl.acm.org
Data scientists face a steep learning curve in understanding a new domain for which they
want to build machine learning (ML) models. While input from domain experts could offer …

PREDOSE: a semantic web platform for drug abuse epidemiology using social media

D Cameron, GA Smith, R Daniulaityte, AP Sheth… - Journal of biomedical …, 2013 - Elsevier
Objectives: The role of social media in biomedical knowledge mining, including clinical,
medical and healthcare informatics, prescription drug abuse epidemiology and drug …

DeepDive: Declarative knowledge base construction

C De Sa, A Ratner, C Ré, J Shin, F Wang, S Wu… - ACM SIGMOD …, 2016 - dl.acm.org
The dark data extraction or knowledge base construction (KBC) problem is to populate a
SQL database with information from unstructured data sources including emails, webpages …

Document spanners: A formal approach to information extraction

R Fagin, B Kimelfeld, F Reiss… - Journal of the ACM (JACM …, 2015 - dl.acm.org
An intrinsic part of information extraction is the creation and manipulation of relations
extracted from text. In this article, we develop a foundational framework where the central …

A machine reading system for assembling synthetic paleontological databases

SE Peters, C Zhang, M Livny, C Ré - PLoS ONE, 2014 - journals.plos.org
Many aspects of macroevolutionary theory and our understanding of biotic responses to
global environmental change derive from literature-based compilations of paleontological …

DeepDive: a data management system for automatic knowledge base construction

C Zhang - 2015 - search.proquest.com
Many pressing questions in science are macroscopic: they require scientists to consult
information expressed in a wide range of resources, many of which are not organized in a …