Table pre-training: A survey on model architectures, pre-training objectives, and downstream tasks
Since a vast number of tables can be easily collected from web pages, spreadsheets, PDFs,
and various other document types, a flurry of table pre-training frameworks have been …
NeuroLogic A*esque decoding: Constrained text generation with lookahead heuristics
The dominant paradigm for neural text generation is left-to-right decoding from
autoregressive language models. Constrained or controllable generation under complex …
ToTTo: A controlled table-to-text generation dataset
We present ToTTo, an open-domain English table-to-text dataset with over 120,000 training
examples that proposes a controlled generation task: given a Wikipedia table and a set of …
Large language models are few (1)-shot table reasoners
W Chen - arXiv preprint arXiv:2210.06710, 2022 - arxiv.org
Recent literature has shown that large language models (LLMs) are generally excellent few-
shot reasoners to solve text reasoning tasks. However, the capability of LLMs on table …
Chart-to-text: A large-scale benchmark for chart summarization
Charts are commonly used for exploring data and communicating insights. Generating
natural language summaries from charts can be very helpful for people in inferring key …
DART: Open-domain structured data record to text generation
We present DART, an open domain structured DAta Record to Text generation dataset with
over 82k instances (DARTs). Data-to-Text annotations can be a costly process, especially …
KGPT: Knowledge-grounded pre-training for data-to-text generation
Data-to-text generation has recently attracted substantial interests due to its wide
applications. Existing methods have shown impressive performance on an array of tasks …
FOLIO: Natural language reasoning with first-order logic
We present FOLIO, a human-annotated, open-domain, and logically complex and diverse
dataset for reasoning in natural language (NL), equipped with first order logic (FOL) …
Table understanding: Problem overview
A Shigarov - Wiley Interdisciplinary Reviews: Data Mining and …, 2023 - Wiley Online Library
Tables are probably the most natural way to represent relational data in various media and
formats. They store a large number of valuable facts that could be utilized for question …
Transformers for tabular data representation: A survey of models and applications
In the last few years, the natural language processing community has witnessed advances
in neural representations of free texts with transformer-based language models (LMs). Given …