Dataset discovery and exploration: A survey
Data scientists are tasked with obtaining insights from data. However, suitable data is often
not immediately at hand, and there may be many potentially relevant datasets in a data lake …
Deep transfer learning & beyond: Transformer language models in information systems research
R Gruetzemacher, D Paradice - ACM Computing Surveys (CSUR), 2022 - dl.acm.org
AI is widely thought to be poised to transform business, yet current perceptions of the scope
of this transformation may be myopic. Recent progress in natural language processing …
PASTA: table-operations aware fact verification via sentence-table cloze pre-training
Fact verification has attracted a lot of research attention recently, e.g., in journalism,
marketing, and policymaking, as misinformation and disinformation online can sway one's …
Pretrained generalized autoregressive model with adaptive probabilistic label clusters for extreme multi-label text classification
Extreme multi-label text classification (XMTC) is a task for tagging a given text with the most
relevant labels from an extremely large label set. We propose a novel deep learning method …
DeepJoin: Joinable table discovery with pre-trained language models
Due to the usefulness in data enrichment for data analysis tasks, joinable table discovery
has become an important operation in data lake management. Existing approaches target …
StruBERT: Structure-aware BERT for table search and matching
A table is composed of data values that are organized in rows and columns providing
implicit structural information. A table is usually accompanied by secondary information such …
Retrieving complex tables with multi-granular graph representation learning
The task of natural language table retrieval (NLTR) seeks to retrieve semantically relevant
tables based on natural language queries. Existing learning systems for this task often treat …
Neural ranking models for document retrieval
Ranking models are the main components of information retrieval systems. Several
approaches to ranking are based on traditional machine learning algorithms using a set of …
Is table retrieval a solved problem? Exploring join-aware multi-table retrieval
Retrieving relevant tables containing the necessary information to accurately answer a given
question over tables is critical to open-domain question-answering (QA) systems. Previous …
Mixed-modality representation learning and pre-training for joint table-and-text retrieval in OpenQA
Retrieving evidences from tabular and textual resources is essential for open-domain
question answering (OpenQA), which provides more comprehensive information. However …