MathBERT: A pre-trained model for mathematical formula understanding

S Peng, K Yuan, L Gao, Z Tang - arXiv preprint arXiv:2105.00377, 2021 - arxiv.org
Large-scale pre-trained models like BERT have obtained great success in various Natural
Language Processing (NLP) tasks, while it is still a challenge to adapt them to the math …

Tangent-CFT: An embedding model for mathematical formulas

B Mansouri, S Rohatgi, DW Oard, J Wu… - Proceedings of the …, 2019 - dl.acm.org
When searching for mathematical content, accurate measures of formula similarity can help
with tasks such as document ranking, query recommendation, and result set clustering …

Introduction to mathematical language processing: Informal proofs, word problems, and supporting tasks

J Meadows, A Freitas - Transactions of the Association for …, 2023 - direct.mit.edu
Automating discovery in mathematics and science will require sophisticated methods of
information extraction and abstract reasoning, including models that can convincingly …

Mathematical Information Retrieval: A Review

P Dadure, P Pakray, S Bandyopadhyay - ACM Computing Surveys, 2024 - dl.acm.org
Mathematical formulas are commonly used to demonstrate theories and basic fundamentals
in the Science, Technology, Engineering, and Mathematics (STEM) domain. The burgeoning …

NTCIR-12 MathIR Task Overview.

R Zanibbi, A Aizawa, M Kohlhase, I Ounis, Goran Topic… - NTCIR, 2016 - research.nii.ac.jp
We present an overview of the NTCIR-12 MathIR Task, dedicated to information access for
mathematical content. The MathIR task makes use of two corpora. The first corpus contains …

Evaluating token-level and passage-level dense retrieval models for math information retrieval

W Zhong, JH Yang, Y Xie, J Lin - arXiv preprint arXiv:2203.11163, 2022 - arxiv.org
With the recent success of dense retrieval methods based on bi-encoders, studies have
applied this approach to various interesting downstream retrieval tasks with good efficiency …

Layout and semantics: Combining representations for mathematical formula search

K Davila, R Zanibbi - Proceedings of the 40th International ACM SIGIR …, 2017 - dl.acm.org
Math-aware search engines need to support formulae in queries. Mathematical expressions
are typically represented as trees defining their operational semantics or visual layout. We …

Accelerating substructure similarity search for formula retrieval

W Zhong, S Rohatgi, J Wu, CL Giles… - Advances in Information …, 2020 - Springer
Formula retrieval systems using substructure matching are effective, but suffer from slow
retrieval times caused by the complexity of structure matching. We present a specialized …

One blade for one purpose: advancing math information retrieval using hybrid search

W Zhong, SC Lin, JH Yang, J Lin - … of the 46th International ACM SIGIR …, 2023 - dl.acm.org
Neural retrievers have been shown to be effective for math-aware search. Their ability to
cope with math symbol mismatches, to represent highly contextualized semantics, and to …

A survey in mathematical language processing

J Meadows, A Freitas - arXiv preprint arXiv:2205.15231, 2022 - arxiv.org
Informal mathematical text underpins real-world quantitative reasoning and communication.
Developing sophisticated methods of retrieval and abstraction from this dual modality is …