CrowdSpeech and VoxDIY: Benchmark datasets for crowdsourced audio transcription

N Pavlichenko, I Stelmakh, D Ustalov - arXiv preprint arXiv:2107.01091, 2021 - arxiv.org
Domain-specific data is the crux of the successful transfer of machine learning systems from
benchmarks to real life. In simple problems such as image classification, crowdsourcing has …

Federated IoT interaction vulnerability analysis

G Wang, H Guo, A Li, X Liu… - 2023 IEEE 39th …, 2023 - ieeexplore.ieee.org
IoT devices provide users with great convenience in smart homes. However, the
interdependent behaviors across devices may yield unexpected interactions. To analyze the …

Demystifying Artificial Intelligence for Data Preparation

C Chai, N Tang, J Fan, Y Luo - … of the 2023 International Conference on …, 2023 - dl.acm.org
Data preparation--the process of discovering, integrating, transforming, cleaning, and
annotating data--is one of the oldest, hardest, yet inevitable data management problems …

Quality of sentiment analysis tools: The reasons of inconsistency

WM Kouadri, M Ouziri, S Benbernou… - Proceedings of the …, 2020 - dl.acm.org
In this paper, we present a comprehensive study that evaluates six state-of-the-art sentiment
analysis tools on five public datasets, based on the quality of predictive results in the …

Coca: Cost-effective collaborative annotation system by combining experts and amateurs

J Lei, Z Zhang, L Zhang, XY Li - 2022 IEEE 38th International …, 2022 - ieeexplore.ieee.org
Data annotation has been a key boost for the artificial intelligence. However, difficult tasks
such as fine-grained classification need lots of labeled data to train a feasible model. On the …

Type diversity maximization aware coursewares crowdcollection with limited budget in MOOCs

L Guo, Y Jin, G Liu, F Hao, M Ren, V Loia - Information Sciences, 2023 - Elsevier
Massive open online courses (MOOCs) require coursewares with different types of course
resources recommended to learners based on their learning situations to meet personalized …

REGROW: Reimagining global crowdsourcing for better human-AI collaboration

A Alorwu, S Savage, N van Berkel, D Ustalov… - CHI Conference on …, 2022 - dl.acm.org
Crowdworkers silently enable much of today's AI-based products, with several online
platforms offering a myriad of data labelling and content moderation tasks through …

Lessons Learned from a Citizen Science Project for Natural Language Processing

JC Klie, JU Lee, K Stowe, GG Şahin… - arXiv preprint arXiv …, 2023 - arxiv.org
Many Natural Language Processing (NLP) systems use annotated corpora for training and
evaluation. However, labeled data is often costly to obtain and scaling annotation projects is …

Efficient Online Crowdsourcing with Complex Annotations

R Meir, VA Nguyen, X Chen, J Ramakrishnan… - Proceedings of the …, 2024 - ojs.aaai.org
Crowdsourcing platforms use various truth discovery algorithms to aggregate annotations
from multiple labelers. In an online setting, however, the main challenge is to decide …

MACRO: Incentivizing Multi-leader Game-based Pareto-efficient Crowdsourcing for Video Analytics

Y Chen, S Zhang, Z Zhou, X Wang, Y Liang… - Proc. of the 40th IEEE …, 2024 - cis.temple.edu
In recent years, many crowdsourcing platforms have emerged, using the resources of
recruited workers to perform diverse outsourcing tasks, where the video analytics attracts …