Knowledge graphs
In this article, we provide a comprehensive introduction to knowledge graphs, which have
recently garnered significant attention from both industry and academia in scenarios that …
Web data extraction, applications and techniques: A survey
Web Data Extraction is an important problem that has been studied by means of
different scientific tools and in a broad range of applications. Many approaches to extracting …
Information extraction from text
J Jiang - Mining text data, 2012 - Springer
Information extraction is the task of finding structured information from unstructured
or semi-structured text. It is an important task in text mining and has been extensively studied …
Rousillon: Scraping distributed hierarchical web data
SE Chasins, M Mueller, R Bodik - Proceedings of the 31st Annual ACM …, 2018 - dl.acm.org
Programming by Demonstration (PBD) promises to enable data scientists to collect web
data. However, in formative interviews with social scientists, we learned that current PBD …
Websrc: A dataset for web-based structural reading comprehension
Web search is an essential way for humans to obtain information, but it's still a great
challenge for machines to understand the contents of web pages. In this paper, we introduce …
Synthesis of multilevel knowledge graphs: Methods and technologies for dynamic networks
Abstract Knowledge Graphs is one of the most popular techniques for knowledge-based
modelling in various subdomains of modern AI technologies ranging from natural language …
Joint optimization of wrapper generation and template detection
Many websites have large collections of pages generated dynamically from an underlying
structured source like a database. The data of a category are typically encoded into similar …
Scalable web data extraction for online market intelligence
R Baumgartner, G Gottlob, M Herzog - Proceedings of the VLDB …, 2009 - dl.acm.org
Online market intelligence (OMI), in particular competitive intelligence for product pricing, is
a very important application area for Web data extraction. However, OMI presents non-trivial …
Supervised and unsupervised methods for robust separation of section titles and prose text in web documents
The text in many web documents is organized into a hierarchy of section titles and
corresponding prose content, a structure which provides potentially exploitable information …
Comparison of python libraries used for web data extraction
There are several libraries for extracting useful data from web pages in Python. In this study,
we compare three different well-known extraction libraries including BeautifulSoup, lxml and …