Authors
Sascha Bub, Jakob Wolfram, Sebastian Stehle, Lara L Petschick, Ralf Schulz
Publication date
2019/2/22
Journal
Data
Volume
4
Issue
1
Pages
34
Publisher
MDPI
Description
Assessing the impact of chemicals on the environment and addressing subsequent issues are two central challenges to their safe use. Environmental data are continuously expanding, requiring flexible, scalable, and extendable data management solutions that can harmonize multiple data sources with potentially differing nomenclatures or levels of specificity. Here, we present the methodological steps taken to construct a rule-based labeled property graph database, the “Meta-analysis of the Global Impact of Chemicals” (MAGIC) graph, for potential environmental impact chemicals (PEIC), and its subsequent application to harmonizing multiple large-scale databases. The resulting data encompass 16,739 unique PEICs attributed to their corresponding chemical class, stereochemical information, valid synonyms, use types, unique identifiers (e.g., Chemical Abstracts Service Registry Number, CAS RN), and others. These data provide researchers with additional chemical information for a large number of PEICs and can also be publicly accessed through a web interface. Our analysis has shown that data harmonization can increase by up to 98% when using the MAGIC graph approach compared to relational data systems for datasets with different nomenclatures. The graph database system and its data appear more suitable for large-scale analysis where traditional (i.e., relational) data systems are reaching conceptual limitations.
Dataset: The dataset can be found in Supplementary Materials, www.mdpi.com/xxx/s1.
Dataset License: CC-BY-SA