Pre-trained language models in biomedical domain: A systematic survey

B Wang, Q Xie, J Pei, Z Chen, P Tiwari, Z Li… - ACM Computing …, 2023 - dl.acm.org
Pre-trained language models (PLMs) have been the de facto paradigm for most natural
language processing tasks. This also benefits the biomedical domain: researchers from …

Unexplored therapeutic opportunities in the human genome

TI Oprea, CG Bologa, S Brunak, A Campbell… - Nature reviews Drug …, 2018 - nature.com
A large proportion of biomedical research and the development of therapeutics is focused
on a small fraction of the human genome. In a strategic effort to map the knowledge gaps …

Galactica: A large language model for science

R Taylor, M Kardas, G Cucurull, T Scialom… - arXiv preprint arXiv …, 2022 - arxiv.org
Information overload is a major obstacle to scientific progress. The explosive growth in
scientific literature and data has made it ever harder to discover useful insights in a large …

A knowledge graph to interpret clinical proteomics data

A Santos, AR Colaço, AB Nielsen, L Niu… - Nature …, 2022 - nature.com
Implementing precision medicine hinges on the integration of omics data, such as
proteomics, into the clinical decision-making process, but the quantity and diversity of …

BioBERT: a pre-trained biomedical language representation model for biomedical text mining

J Lee, W Yoon, S Kim, D Kim, S Kim, CH So… - …, 2020 - academic.oup.com
Motivation: Biomedical text mining is becoming increasingly important as the number of
biomedical documents rapidly grows. With the progress in natural language processing …

miRBase: from microRNA sequences to function

A Kozomara, M Birgaoanu… - Nucleic acids …, 2019 - academic.oup.com
miRBase catalogs, names and distributes microRNA gene sequences. The latest
release of miRBase (v22) contains microRNA sequences from 271 organisms: 38 589 …

Pretrained language models for biomedical and clinical tasks: understanding and extending the state-of-the-art

P Lewis, M Ott, J Du, V Stoyanov - Proceedings of the 3rd clinical …, 2020 - aclanthology.org
A large array of pretrained models are available to the biomedical NLP (BioNLP) community.
Finding the best model for a particular task can be difficult and time-consuming. For many …

DrugCentral 2021 supports drug discovery and repositioning

S Avram, CG Bologa, J Holmes, G Bocci… - Nucleic acids …, 2021 - academic.oup.com
DrugCentral is a public resource (http://drugcentral.org) that serves the scientific community
by providing up-to-date drug information, as described in previous papers. The current …

Deep learning with word embeddings improves biomedical named entity recognition

M Habibi, L Weber, M Neves, DL Wiegandt… - …, 2017 - academic.oup.com
Motivation: Text mining has become an important tool for biomedical research. The most
fundamental text-mining task is the recognition of biomedical named entities (NER), such as …

The SIDER database of drugs and side effects

M Kuhn, I Letunic, LJ Jensen, P Bork - Nucleic acids research, 2016 - academic.oup.com
Unwanted side effects of drugs are a burden on patients and a severe impediment in the
development of new drugs. At the same time, adverse drug reactions (ADRs) recorded …