Data integration challenges for machine learning in precision medicine
M Martínez-García, E Hernández-Lemus - Frontiers in medicine, 2022 - frontiersin.org
A main goal of Precision Medicine is that of incorporating and integrating the vast corpora on
different databases about the molecular and environmental origins of disease, into analytic …
Spatial patterns of CTCF sites define the anatomy of TADs and their boundaries
Abstract Background Topologically associating domains (TADs) are genomic regions of self-
interaction. Additionally, it is known that TAD boundaries are enriched in CTCF binding sites …
GenoSurf: metadata driven semantic search system for integrated genomic datasets
A Canakoglu, A Bernasconi, A Colombo… - Database, 2019 - academic.oup.com
Many valuable resources developed by world-wide research institutions and consortia
describe genomic datasets that are both open and available for secondary research, but …
Framing Apache Spark in life sciences
Advances in high-throughput and digital technologies have required the adoption of big data
for handling complex tasks in life sciences. However, the drift to big data led researchers to …
META-BASE: a novel architecture for large-scale genomic metadata integration
A Bernasconi, A Canakoglu… - … /ACM Transactions on …, 2020 - ieeexplore.ieee.org
The integration of genomic metadata is, at the same time, an important, difficult, and well-
recognized challenge. It is important because a wealth of public data repositories is …
GeCoAgent: a conversational agent for empowering genomic data extraction and analysis
With the availability of reliable and low-cost DNA sequencing, human genomics is relevant
to a growing number of end-users, including biologists and clinicians. Typical interactions …
GeMI: interactive interface for transformer-based Genomic Metadata Integration
Abstract The Gene Expression Omnibus (GEO) is a public archive containing >4 million
digital samples from functional genomics experiments collected over almost two decades …
OpenGDC: unifying, modeling, integrating cancer genomic data and clinical metadata
Next Generation Sequencing technologies have produced a substantial increase of publicly
available genomic data and related clinical/biospecimen information. New models and …
Genomic data integration and user-defined sample-set extraction for population variant analysis
Background Population variant analysis is of great importance for gathering insights into the
links between human genotype and phenotype. The 1000 Genomes Project established a …
RGMQL: scalable and interoperable computing of heterogeneous omics big data and metadata in R/Bioconductor
S Pallotta, S Cascianelli, M Masseroli - BMC bioinformatics, 2022 - Springer
Background Heterogeneous omics data, increasingly collected through high-throughput
technologies, can contain hidden answers to very important and still unsolved biomedical …