StruBERT: Structure-aware BERT for table search and matching

M Trabelsi, Z Chen, S Zhang, BD Davison… - Proceedings of the ACM …, 2022 - dl.acm.org
A table is composed of data values that are organized in rows and columns providing
implicit structural information. A table is usually accompanied by secondary information such …

Neural ranking models for document retrieval

M Trabelsi, Z Chen, BD Davison, J Heflin - Information Retrieval Journal, 2021 - Springer
Ranking models are the main components of information retrieval systems. Several
approaches to ranking are based on traditional machine learning algorithms using a set of …

Keeping the data lake in form: proximity mining for pre-filtering schema matching

A Alserafi, A Abelló, O Romero, T Calders - ACM Transactions on …, 2020 - dl.acm.org
Data lakes (DLs) are large repositories of raw datasets from disparate sources. As more
datasets are ingested into a DL, there is an increasing need for efficient techniques to profile …

Leveraging schema labels to enhance dataset search

Z Chen, H Jia, J Heflin, BD Davison - … on IR Research, ECIR 2020, Lisbon …, 2020 - Springer
A search engine's ability to retrieve desirable datasets is important for data sharing and
reuse. Existing dataset search engines typically rely on matching queries to dataset …

Improved table retrieval using multiple context embeddings for attributes

M Trabelsi, BD Davison, J Heflin - 2019 IEEE international …, 2019 - ieeexplore.ieee.org
Table retrieval is the task of extracting the most relevant tables to answer a user's query.
Table retrieval is an important task because many domains have tables that contain useful …

SeLaB: Semantic labeling with BERT

M Trabelsi, J Cao, J Heflin - 2021 International Joint …, 2021 - ieeexplore.ieee.org
Generating schema labels automatically for column values of data tables has many data
science applications such as schema matching, and data discovery and linking. For …

Artificial intelligence for ocean science data integration: current state, gaps, and way forward

T Sagi, Y Lehahn, K Bar - Elem Sci Anth, 2020 - online.ucpress.edu
Oceanographic research is a multidisciplinary endeavor that involves the acquisition of an
increasing amount of in-situ and remotely sensed data. A large and growing number of …

WTR: A test collection for web table retrieval

Z Chen, S Zhang, BD Davison - … of the 44th International ACM SIGIR …, 2021 - dl.acm.org
We describe the development, characteristics and availability of a test collection for the task
of Web table retrieval, which uses a large-scale Web Table Corpora extracted from the …

Tab2KG: Semantic table interpretation with lightweight semantic profiles

S Gottschalk, E Demidova - Semantic Web, 2022 - content.iospress.com
Tabular data plays an essential role in many data analytics and machine learning tasks.
Typically, tabular data does not possess any machine-readable semantics. In this context …

A hybrid deep model for learning to rank data tables

M Trabelsi, Z Chen, BD Davison… - 2020 IEEE International …, 2020 - ieeexplore.ieee.org
We address the problem of ad hoc table retrieval via a new neural architecture that
incorporates both semantic and relevance matching. Understanding the connection …