Indexing highly repetitive string collections, part II: Compressed indexes

G Navarro - ACM Computing Surveys (CSUR), 2021 - dl.acm.org
Two decades ago, a breakthrough in indexing string collections made it possible to
represent them within their compressed space while at the same time offering indexed …

A space-optimal grammar compression

Y Takabatake, H Sakamoto - 25th Annual European …, 2017 - drops.dagstuhl.de
A grammar compression is a context-free grammar (CFG) deriving a single string
deterministically. For an input string of length N over an alphabet of size sigma, the smallest …

Practical random access to SLP-compressed texts

T Gagie, T I, G Manzini, G Navarro, H Sakamoto… - … Symposium on String …, 2020 - Springer
Grammar-based compression is a popular and powerful approach to compressing repetitive
texts but until recently its relatively poor time-space trade-offs during real-life construction …

Indexing highly repetitive string collections

G Navarro - arXiv preprint arXiv:2004.02781, 2020 - arxiv.org
Two decades ago, a breakthrough in indexing string collections made it possible to
represent them within their compressed space while at the same time offering indexed …

Practical string dictionary compression using string dictionary encoding

S Kanda, K Morita, M Fuketa - 2017 International Conference …, 2017 - ieeexplore.ieee.org
A string dictionary is a data structure for storing a set of strings that maps them to unique IDs.
It can manage string data in compact space by encoding them into integers. However …

Grammar-compressed self-index with Lyndon words

K Tsuruta, D Köppl, Y Nakashima, S Inenaga… - arXiv preprint arXiv …, 2020 - arxiv.org
We introduce a new class of straight-line programs (SLPs), named the Lyndon SLP, inspired
by the Lyndon trees (Barcelo, 1990). Based on this SLP, we propose a self-index data …

LZD Factorization: Simple and Practical Online Grammar Compression with Variable-to-Fixed Encoding

K Goto, H Bannai, S Inenaga, M Takeda - … 2015, Ischia Island, Italy, June 29 …, 2015 - Springer
We propose a new variant of the LZ78 factorization which we call the LZ Double-factor
factorization (LZD factorization). Each factor of the LZD factorization of a string is the …

Fully online grammar compression in constant space

S Maruyama, Y Tabei - 2014 Data Compression Conference, 2014 - ieeexplore.ieee.org
We present novel variants of fully online LCA (FOLCA), a fully online grammar compression
that builds a straight line program (SLP) and directly encodes it into a succinct …

Online self-indexed grammar compression

Y Takabatake, Y Tabei, H Sakamoto - … 2015, London, UK, September 1-4 …, 2015 - Springer
Although several grammar-based self-indexes have been proposed thus far, their
applicability is limited to offline settings where whole input texts are prepared, thus requiring …

Online grammar transformation based on Re-Pair algorithm

T Masaki, T Kida - 2016 Data Compression Conference (DCC), 2016 - ieeexplore.ieee.org
The Re-Pair algorithm (Re-Pair), proposed by Larsson and Moffat, is a simple grammar-based
compression method that achieves a good compression ratio. Although Re-Pair runs …