ParCzech 3.0: A large Czech speech corpus with rich metadata

T Erjavec, M Ogrodniczuk, P Osenova… - Language resources …, 2023 - Springer

This paper presents the ParlaMint corpora containing transcriptions of the sessions of the 17
European national parliaments with half a billion words. The corpora are uniformly encoded …

Cited by 113 · Related articles · All 14 versions

ParlaMint II: The show must go on

M Ogrodniczuk, P Osenova, T Erjavec… - Proceedings of the …, 2022 - aclanthology.org

In ParlaMint I, a CLARIN-ERIC supported project in pandemic times, a set of
comparable and uniformly annotated multilingual corpora for 17 national parliaments were …

Cited by 11 · Related articles · All 7 versions

ParlaMint II: advancing comparable parliamentary corpora across Europe

T Erjavec, M Kopp, N Ljubešić, T Kuzman… - Language Resources …, 2024 - Springer

The paper presents the results of the ParlaMint II project, which comprise comparable
corpora of parliamentary debates of 29 European countries and autonomous regions …

The ParlaSpeech collection of automatically generated speech and text datasets from parliamentary proceedings

N Ljubešić, P Rupnik, D Koržinek - International Conference on Speech …, 2024 - Springer

Recent significant improvements in speech and language technologies come both from
self-supervised approaches over raw language data as well as various types of explicit …

Cited by 1 · Related articles · All 4 versions

Annotating Attribution in Czech News Server Articles

B Hladká, J Mírovský, M Kopp… - Proceedings of the …, 2022 - aclanthology.org

This paper focuses on detection of sources in the Czech articles published on a news server
of Czech public radio. In particular, we search for attribution in sentences and we recognize …

Cited by 2 · Related articles · All 4 versions

Adding the Basque parliament corpus to ParlaMint project

J Alkorta, MI Quintian - … of the Workshop ParlaCLARIN III within the …, 2022 - aclanthology.org

The aim of this work is to describe the collection created with transcript of the Basque
parliamentary speeches. This corpus follows the constraints of the ParlaMint project. The …

Cited by 4 · Related articles · All 12 versions

Speech-Informed Inverse Text Normalization

V Stankov - 2024 - dspace.cuni.cz

In the domain of Automatic Speech Recognition (ASR), Inverse Text Normalization (ITN) is
applied after the speech recognition step to transform recognized verbalized text into written …

Text-to-Speech Personalization

M Luner - 2022 - excel.fit.vutbr.cz

This work aims to develop a model that can convert Czech input text into speech that closely
resembles a target speaker. The chosen approach involves training a base text-to-speech …

Cited by 1 · Related articles

GREC: Multi-domain Speech Recognition for the Greek

GG Rouvalis - 2023 - pergamos.lib.uoa.gr

One of the leading challenges in Automatic Speech Recognition (ASR) is the development
of robust systems that can perform well under multiple settings. In this work we construct and …

Workflow and Metadata Challenges in the ParlaMint Project: Insights from Building the ParlaMint-UA Corpus

A Kryvenko, M Kopp - CLARIN Annual Conference Proceedings, 2023 - helda.helsinki.fi

This paper focuses on the challenges of refining the workflow for collecting and adding
metadata to the ParlaMint corpora designed for research in the social sciences and …

Cited by 3 · Related articles · All 3 versions