A survey of high dimension low sample size asymptotics
Peter Hall's work illuminated many aspects of statistical thought, some of which are very well
known including the bootstrap and smoothing. However, he also explored many other lesser …
A gamma kernel density estimation for insurance loss data
Y Jeon, JHT Kim - Insurance: Mathematics and Economics, 2013 - Elsevier
Fitting insurance loss data can be challenging because of their non-negativity, asymmetry,
skewness, and possible multi-modality. Though many parametric models have been used in …
Distance-based outlier detection for high dimension, low sample size data
Despite the popularity of high dimension, low sample size data analysis, there has not been
enough attention to the sample integrity issue, in particular, a possibility of outliers in the …
General sparse multi-class linear discriminant analysis
Discrimination with high dimensional data is often more effectively done with sparse
methods that use a fraction of predictors rather than using all the available ones. In recent …
Geometric insights into support vector machine behavior using the KKT conditions
I Carmichael, JS Marron - arXiv preprint arXiv:1704.00767, 2017 - arxiv.org
The support vector machine (SVM) is a powerful and widely used classification algorithm.
This paper uses the Karush-Kuhn-Tucker conditions to provide rigorous mathematical proof …
High-dimensional linear discriminant analysis using nonparametric methods
The classification of high-dimensional data is a very important problem that has been
studied for a long time. Many studies have proposed linear classifiers based on Fisher's …
Double data piling for heterogeneous covariance models
In this work, we characterize two data piling phenomena for a high-dimensional binary
classification problem with heterogeneous covariance models. The data piling refers to the …
Data-adaptive binary classifiers in high dimensions using random partitioning
Classification in high dimensions has been highlighted for the past two decades since
Fisher's linear discriminant analysis (LDA) is not optimal in a smaller sample size n …
Sparse semiparametric discriminant analysis for high-dimensional zero-inflated data
HC Chung, Y Ni, I Gaynanova - arXiv preprint arXiv:2208.03734, 2022 - arxiv.org
Sequencing-based technologies provide an abundance of high-dimensional biological
datasets with skewed and zero-inflated measurements. Classification of such data with …
Double data piling leads to perfect classification
Data piling refers to the phenomenon that training data vectors from each class project to a
single point for classification. While this interesting phenomenon has been a key to …