A survey of high dimension low sample size asymptotics

M Aoshima, D Shen, H Shen, K Yata… - Australian & New …, 2018 - Wiley Online Library
Peter Hall's work illuminated many aspects of statistical thought, some of which are very well
known, including the bootstrap and smoothing. However, he also explored many other lesser …

A gamma kernel density estimation for insurance loss data

Y Jeon, JHT Kim - Insurance: Mathematics and Economics, 2013 - Elsevier
Fitting insurance loss data can be challenging because of their non-negativity, asymmetry,
skewness, and possible multi-modality. Though many parametric models have been used in …

Distance-based outlier detection for high dimension, low sample size data

J Ahn, MH Lee, JA Lee - Journal of Applied Statistics, 2019 - Taylor & Francis
Despite the popularity of high dimension, low sample size data analysis, there has not been
enough attention to the sample integrity issue, in particular, a possibility of outliers in the …

General sparse multi-class linear discriminant analysis

SE Safo, J Ahn - Computational Statistics & Data Analysis, 2016 - Elsevier
Discrimination with high dimensional data is often more effectively done with sparse
methods that use a fraction of predictors rather than using all the available ones. In recent …

Geometric insights into support vector machine behavior using the KKT conditions

I Carmichael, JS Marron - arXiv preprint arXiv:1704.00767, 2017 - arxiv.org
The support vector machine (SVM) is a powerful and widely used classification algorithm.
This paper uses the Karush-Kuhn-Tucker conditions to provide rigorous mathematical proof …

High-dimensional linear discriminant analysis using nonparametric methods

H Park, S Baek, J Park - Journal of Multivariate Analysis, 2022 - Elsevier
The classification of high-dimensional data is a very important problem that has been
studied for a long time. Many studies have proposed linear classifiers based on Fisher's …

Double data piling for heterogeneous covariance models

T Kim, J Ahn, S Jung - arXiv preprint arXiv:2211.15562, 2022 - arxiv.org
In this work, we characterize two data piling phenomena for a high-dimensional binary
classification problem with heterogeneous covariance models. Data piling refers to the …

Data-adaptive binary classifiers in high dimensions using random partitioning

V Andalib, S Baek - Journal of Statistical Computation and …, 2024 - Taylor & Francis
Classification in high dimensions has been highlighted for the past two decades since
Fisher's linear discriminant analysis (LDA) is not optimal in a smaller sample size n …

Sparse semiparametric discriminant analysis for high-dimensional zero-inflated data

HC Chung, Y Ni, I Gaynanova - arXiv preprint arXiv:2208.03734, 2022 - arxiv.org
Sequencing-based technologies provide an abundance of high-dimensional biological
datasets with skewed and zero-inflated measurements. Classification of such data with …

Double data piling leads to perfect classification

W Chang, J Ahn, S Jung - Electronic Journal of Statistics, 2021 - projecteuclid.org
Data piling refers to the phenomenon that training data vectors from each class project to a
single point for classification. While this interesting phenomenon has been a key to …