Algorithms for detecting significantly mutated pathways in cancer

F Vandin, E Upfal, BJ Raphael - Journal of Computational Biology, 2011 - liebertpub.com
Recent genome sequencing studies have shown that the somatic mutations that drive
cancer development are distributed across a large number of genes. This mutational …

Northstar: An interactive data science system

T Kraska - 2021 - dspace.mit.edu
© 2018 VLDB Endowment. In order to democratize data science, we need to fundamentally
rethink the current analytics stack, from the user interface to the “guts.” Most importantly …

Sliceline: Fast, linear-algebra-based slice finding for ml model debugging

S Sagadeeva, M Boehm - … of the 2021 international conference on …, 2021 - dl.acm.org
Slice finding, a recent work on debugging machine learning (ML) models, aims to find the
top-K data slices (e.g., conjunctions of predicates such as gender female and degree PhD) …

Discovering highly reliable subgraphs in uncertain graphs

R Jin, L Liu, CC Aggarwal - Proceedings of the 17th ACM SIGKDD …, 2011 - dl.acm.org
In this paper, we investigate the highly reliable subgraph problem, which arises in the
context of uncertain graphs. This problem attempts to identify all induced subgraphs for …

A structured view on pattern mining-based biclustering

R Henriques, C Antunes, SC Madeira - Pattern Recognition, 2015 - Elsevier
Mining matrices to find relevant biclusters, subsets of rows exhibiting a coherent pattern over
a subset of columns, is a critical task for a wide set of biomedical and social applications …

Ontology-based data interestingness: A state-of-the-art review

CB Abhilash, K Mahesh - Natural Language Processing Journal, 2023 - Elsevier
In recent years, there has been significant growth in the use of ontology-based methods to
enhance data interestingness. These methods play a crucial role in knowledge …

Significance-based discriminative sequential pattern mining

Z He, S Zhang, J Wu - Expert Systems with Applications, 2019 - Elsevier
Discriminative sequential patterns are sub-sequences whose occurrences exhibit significant
differences across sequential data sets with different class labels. The discovery of such …

BSig: evaluating the statistical significance of biclustering solutions

R Henriques, SC Madeira - Data Mining and Knowledge Discovery, 2018 - Springer
Statistical evaluation of biclustering solutions is essential to guarantee the absence of
spurious relations and to validate the high number of scientific statements inferred from …

Interestingness measures for association rules based on statistical validity

INM Shaharanee, F Hadzic, TS Dillon - Knowledge-Based Systems, 2011 - Elsevier
Assessing rules with interestingness measures is the pillar of successful application of
association rules discovery. However, association rules discovered are normally large in …

Discovering significant patterns under sequential false discovery control

S Dalleiger, J Vreeken - Proceedings of the 28th ACM SIGKDD …, 2022 - dl.acm.org
We are interested in discovering those patterns from data with an empirical frequency that is
significantly different than expected. To avoid spurious results, yet achieve high statistical …