Authors
Niklas Heinemann, Sascha Bub, Jakob Wolfram, Sebastian Stehle, Lara L Petschick, Ralf Schulz
Publication date
2020/12/4
Journal
Data
Volume
5
Issue
4
Pages
114
Publisher
MDPI
Description
With the ever-increasing production and registration of chemical substances, obtaining reliable and up-to-date information on their use types (UT) and chemical classes (CC) is of crucial importance. We evaluated the current status of open access chemical substance databases (DBs) regarding UT and CC information using the “Meta-analysis of the Global Impact of Chemicals” (MAGIC) graph as a benchmark. A decision tree-based selection process was used to choose the most suitable of 96 databases. To compare the DB content for 100 weighted, randomly selected chemical substances, an extensive quantitative and qualitative analysis was performed. Four DBs yielded better UT and CC results, both qualitatively and quantitatively, than the current MAGIC graph: the European Bioinformatics Institute DB, ChemSpider, the English Wikipedia page, and the National Center for Biotechnology Information (NCBI). The NCBI, along with its subsidiary DBs PubChem and Medical Subject Headings (MeSH), showed the best performance according to the defined criteria. For the analysis of large datasets, harmonisation of the available information might be beneficial, as the available DBs mostly aggregate information without harmonising it.
Dataset: Chemical Class and Use Type Compendium available online at www.mdpi.com/xxx/s1.
Dataset License: CC-BY-SA.