A survey on data‐efficient algorithms in big data era

A Adadi - Journal of Big Data, 2021 - Springer
The leading approaches in Machine Learning are notoriously data-hungry. Unfortunately,
many application domains do not have access to big data because acquiring data involves a …

A survey of predictive modeling on imbalanced domains

P Branco, L Torgo, RP Ribeiro - ACM computing surveys (CSUR), 2016 - dl.acm.org
Many real-world data-mining applications involve obtaining predictive models using
datasets with strongly imbalanced distributions of the target variable. Frequently, the least …

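Research progress in machine-learning-based text classification (基于机器学习的文本分类技术研究进展)

Su Jinshu (苏金树), Zhang Bofeng (张博锋), Xu Xin (徐昕) - Journal of Software (软件学报), 2006 - Citeseer
Automatic text classification is a research hotspot and core technology in information retrieval
and data mining, and it has received wide attention and developed rapidly in recent years. The paper
sets out, for machine-learning-based text classification, the complex applications it faces, such as Internet content processing, …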

Decision tree and random forest models for outcome prediction in antibody incompatible kidney transplantation

T Shaikhina, D Lowe, S Daga, D Briggs… - … Signal Processing and …, 2019 - Elsevier
Clinical datasets are commonly limited in size, thus restraining applications of Machine
Learning (ML) techniques for predictive modelling in clinical research and organ …

DVQA: Understanding data visualizations via question answering

K Kafle, B Price, S Cohen… - Proceedings of the IEEE …, 2018 - openaccess.thecvf.com
Bar charts are an effective way to convey numeric information, but today's algorithms cannot
parse them. Existing methods fail when faced with even minor variations in appearance …

Text classification based on deep belief network and softmax regression

M Jiang, Y Liang, X Feng, X Fan, Z Pei, Y Xue… - Neural Computing and …, 2018 - Springer
In this paper, we propose a novel hybrid text classification model based on deep belief
network and softmax regression. To solve the sparse high-dimensional matrix computation …

Learning curves for decision making in supervised machine learning: A survey

F Mohr, JN van Rijn - Machine Learning, 2024 - Springer
Learning curves are a concept from social sciences that has been adopted in the context of
machine learning to assess the performance of a learning algorithm with respect to a certain …

Handling limited datasets with neural networks in medical applications: A small-data approach

T Shaikhina, NA Khovanova - Artificial intelligence in medicine, 2017 - Elsevier
Motivation: Single-centre studies in medical domain are often characterised by limited
samples due to the complexity and high costs of patient data collection. Machine learning …

Using natural language processing to automatically detect self-admitted technical debt

E da Silva Maldonado, E Shihab… - IEEE Transactions on …, 2017 - ieeexplore.ieee.org
The metaphor of technical debt was introduced to express the trade-off between productivity
and quality, i.e., when developers take shortcuts or perform quick hacks. More recently, our …

Combating the small sample class imbalance problem using feature selection

M Wasikowski, X Chen - IEEE Transactions on knowledge and …, 2009 - ieeexplore.ieee.org
The class imbalance problem is encountered in real-world applications of machine learning
and results in a classifier's suboptimal performance. Researchers have rigorously studied …