Tortured phrases: A dubious writing style emerging in science. Evidence of critical issues affecting established journals

G Cabanac, C Labbé, A Magazinov - arXiv preprint arXiv:2107.06751, 2021 - arxiv.org
Probabilistic text generators have been used to produce fake scientific papers for more than
a decade. Such nonsensical papers are easily detected by both human and machine. Now …

Duplicate and fake publications in the scientific literature: how many SCIgen papers in computer science?

C Labbé, D Labbé - Scientometrics, 2013 - Springer
Two kinds of bibliographic tools are used to retrieve scientific publications and make them
available online. For one kind, access is free as they store information made publicly …

Comparing the topological properties of real and artificially generated scientific manuscripts

DR Amancio - Scientometrics, 2015 - Springer
Recent years have witnessed the increase of competition in science. While promoting the
quality of research in many cases, an intense competition among scientists can also trigger …

Prevalence of nonsensical algorithmically generated papers in the scientific literature

G Cabanac, C Labbé - Journal of the Association for …, 2021 - Wiley Online Library
In 2014 leading publishers withdrew more than 120 nonsensical publications automatically
generated with the SCIgen program. Casual observations suggested that similar …

Fast nonparametric estimation of class proportions in the positive-unlabeled classification setting

D Zeiberg, S Jain, P Radivojac - … of the AAAI Conference on Artificial …, 2020 - ojs.aaai.org
Estimating class proportions has emerged as an important direction in positive-unlabeled
learning. Well-estimated class priors are key to accurate approximation of posterior …

Literary detective work on the computer

MP Oakes - 2014 - torrossa.com
Computer stylometry is the computer analysis of writing style. This enables inferences to be
made, especially about the sometimes disputed provenance of texts, but also about the …

Towards Improved Scientific Knowledge Proliferation: Leveraging Large Language Models on the Traditional Scientific Writing Workflow

T Procko, A Davidoff, T Elvira… - Available at SSRN …, 2023 - papers.ssrn.com
Technological advances in Natural Language Processing have brought forth
language models capable of advanced response delivery. For humans, inputting natural …

Measuring conference quality by mining program committee characteristics

Z Zhuang, E Elmacioglu, D Lee, CL Giles - … of the 7th ACM/IEEE-CS joint …, 2007 - dl.acm.org
Bibliometrics are important measures for venue quality in digital libraries. Impacts of venues
are usually the major consideration for subscription decision-making, and for ranking and …

Finite-Sample Bounds for Two-Distribution Hypothesis Tests

C Hom, W Yik, GD Montañez - 2023 IEEE 10th International …, 2023 - ieeexplore.ieee.org
With the rapid growth of large language models, big data, and malicious online attacks, it
has become increasingly important to have tools for anomaly detection that can distinguish …

Robust trait-specific essay scoring using neural networks and density estimators

K Taghipour - 2017 - search.proquest.com
We have proposed a novel approach to automated essay scoring based on recurrent and
convolutional neural networks. Unlike existing systems, our approach does not rely on …