Sparse and skew hashing of k-mers
GE Pibiri - Bioinformatics, 2022 - academic.oup.com
Motivation: A dictionary of k-mers is a data structure that stores a set of n distinct k-mers and
supports membership queries. This data structure is at the heart of many important tasks in …
Proactively identifying emerging hacker threats from the dark web: A diachronic graph embedding framework (d-gef)
Cybersecurity experts have appraised the total global cost of malicious hacking activities to
be $450 billion annually. Cyber Threat Intelligence (CTI) has emerged as a viable approach …
PTHash: Revisiting FCH minimal perfect hashing
Given a set S of n distinct keys, a function f that bijectively maps the keys of S into the range
(0,..., n-1) is called a minimal perfect hash function for S. Algorithms that find such functions …
SAT-Geo: A social sensing based content-only approach to geolocating abnormal traffic events using syntax-based probabilistic learning
Social sensing has become an emerging and pervasive sensing paradigm to collect timely
observations of the physical world from human sensors. In this paper, we study the problem …
Locality-preserving minimal perfect hashing of k-mers
Motivation: Minimal perfect hashing is the problem of mapping a static set of n distinct keys
into the address space {1,…, n} bijectively. It is well-known that n log_2(e) bits are necessary …
Deep learning from physicochemical information of concrete with an artificial language for property prediction and reaction discovery
Existing machine learning-based approaches to investigate and design concrete mainly use
the mixture design variables to predict concrete properties and do not consider the …
On weighted k-mer dictionaries
GE Pibiri - Algorithms for Molecular Biology, 2023 - Springer
We consider the problem of representing a set of k-mers and their abundance counts, or
weights, in compressed space so that assessing membership and retrieving the weight of a k …
Parallel and external-memory construction of minimal perfect hash functions with PTHash
A function f is a minimal perfect hash function for a set S of size n if f bijectively maps S into the first n
natural numbers. These functions are important for many practical applications in computing …
HyperEmbed: Tradeoffs between resources and performance in NLP tasks with hyperdimensional computing enabled embedding of n-gram statistics
Recent advances in Deep Learning have led to a significant performance increase on
several NLP tasks, however, the models become more and more computationally …
DFSMN-T: Chinese speech recognition combining a strong Transformer language model.
胡章芳, 蹇芳, 唐珊珊, 明子平… - Journal of Computer …, 2022 - search.ebscohost.com
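An automatic speech recognition system consists of two parts, an acoustic model and a language model, but the traditional N-gram language model
ignores the semantic similarity of terms and has too many parameters, among other problems, which limits further reduction of the character error rate in speech recognition …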