Indexing highly repetitive string collections, part II: Compressed indexes
G Navarro - ACM Computing Surveys (CSUR), 2021 - dl.acm.org
Two decades ago, a breakthrough in indexing string collections made it possible to
represent them within their compressed space while at the same time offering indexed …
A space-optimal grammar compression
Y Takabatake, H Sakamoto - 25th Annual European …, 2017 - drops.dagstuhl.de
A grammar compression is a context-free grammar (CFG) deriving a single string
deterministically. For an input string of length N over an alphabet of size sigma, the smallest …
Practical random access to SLP-compressed texts
Grammar-based compression is a popular and powerful approach to compressing repetitive
texts but until recently its relatively poor time-space trade-offs during real-life construction …
Indexing highly repetitive string collections
G Navarro - arXiv preprint arXiv:2004.02781, 2020 - arxiv.org
Two decades ago, a breakthrough in indexing string collections made it possible to
represent them within their compressed space while at the same time offering indexed …
Practical string dictionary compression using string dictionary encoding
S Kanda, K Morita, M Fuketa - 2017 International Conference …, 2017 - ieeexplore.ieee.org
A string dictionary is a data structure for storing a set of strings that maps them to unique IDs.
It can manage string data in compact space by encoding them into integers. However …
Grammar-compressed self-index with Lyndon words
We introduce a new class of straight-line programs (SLPs), named the Lyndon SLP, inspired
by the Lyndon trees (Barcelo, 1990). Based on this SLP, we propose a self-index data …
LZD Factorization: Simple and Practical Online Grammar Compression with Variable-to-Fixed Encoding
We propose a new variant of the LZ78 factorization which we call the LZ Double-factor
factorization (LZD factorization). Each factor of the LZD factorization of a string is the …
Fully online grammar compression in constant space
S Maruyama, Y Tabei - 2014 Data Compression Conference, 2014 - ieeexplore.ieee.org
We present novel variants of fully online LCA (FOLCA), a fully online grammar compression
that builds a straight line program (SLP) and directly encodes it into a succinct …
Online self-indexed grammar compression
Y Takabatake, Y Tabei, H Sakamoto - … 2015, London, UK, September 1-4 …, 2015 - Springer
Although several grammar-based self-indexes have been proposed thus far, their
applicability is limited to offline settings where whole input texts are prepared, thus requiring …
Online grammar transformation based on Re-Pair algorithm
T Masaki, T Kida - 2016 Data Compression Conference (DCC), 2016 - ieeexplore.ieee.org
The Re-Pair algorithm (Re-Pair), proposed by Larsson and Moffat, is a simple grammar-based
compression method that achieves a good compression ratio. Although Re-Pair runs …