Machine knowledge: Creation and curation of comprehensive knowledge bases

G Weikum, XL Dong, S Razniewski… - … and Trends® in …, 2021 - nowpublishers.com
Equipping machines with comprehensive knowledge of the world's entities and their
relationships has been a longstanding goal of AI. Over the last decade, large-scale …

Deep learning for fake news detection: A comprehensive survey

L Hu, S Wei, Z Zhao, B Wu - AI open, 2022 - Elsevier
The information age enables people to obtain news online through various channels, yet in
the meanwhile making false news spread at unprecedented speed. Fake news exerts …

Autoregressive entity retrieval

N De Cao, G Izacard, S Riedel, F Petroni - arXiv preprint arXiv:2010.00904, 2020 - arxiv.org
Entities are at the center of how we represent and aggregate knowledge. For instance,
encyclopedias such as Wikipedia are structured by entities (e.g., one per Wikipedia article) …

Knowledge enhanced contextual word representations

ME Peters, M Neumann, RL Logan IV… - arXiv preprint arXiv …, 2019 - arxiv.org
Contextual word representations, typically trained on unstructured, unlabeled text, do not
contain any explicit grounding to real world entities and are often unable to remember facts …

Scalable zero-shot entity linking with dense entity retrieval

L Wu, F Petroni, M Josifoski, S Riedel… - arXiv preprint arXiv …, 2019 - arxiv.org
This paper introduces a conceptually simple, scalable, and highly effective BERT-based
entity linking model, along with an extensive evaluation of its accuracy-speed trade-off. We …

CM3: A causal masked multimodal model of the internet

A Aghajanyan, B Huang, C Ross, V Karpukhin… - arXiv preprint arXiv …, 2022 - arxiv.org
We introduce CM3, a family of causally masked generative models trained over a large
corpus of structured multi-modal documents that can contain both text and image tokens …

Towards complex Text-to-SQL in cross-domain database with intermediate representation

J Guo, Z Zhan, Y Gao, Y Xiao, JG Lou, T Liu… - arXiv preprint arXiv …, 2019 - arxiv.org
We present a neural approach called IRNet for complex and cross-domain Text-to-SQL.
IRNet aims to address two challenges: 1) the mismatch between intents expressed in natural …

MultiFC: A real-world multi-domain dataset for evidence-based fact checking of claims

I Augenstein, C Lioma, D Wang, LC Lima… - arXiv preprint arXiv …, 2019 - arxiv.org
We contribute the largest publicly available dataset of naturally occurring factual claims for
the purpose of automatic claim verification. It is collected from 26 fact checking websites in …

Language models are open knowledge graphs

C Wang, X Liu, D Song - arXiv preprint arXiv:2010.11967, 2020 - arxiv.org
This paper shows how to construct knowledge graphs (KGs) from pre-trained language
models (e.g., BERT, GPT-2/3), without human supervision. Popular KGs (e.g., Wikidata, NELL) …

Named entity extraction for knowledge graphs: A literature overview

T Al-Moslmi, MG Ocaña, AL Opdahl, C Veres - IEEE Access, 2020 - ieeexplore.ieee.org
An enormous amount of digital information is expressed as natural-language (NL) text that is
not easily processable by computers. Knowledge Graphs (KG) offer a widely used format for …