The ParlaMint corpora of parliamentary proceedings
This paper presents the ParlaMint corpora containing transcriptions of the sessions of the 17
European national parliaments with half a billion words. The corpora are uniformly encoded …
ParlaMint II: The show must go on
Abstract In ParlaMint I, a CLARIN-ERIC supported project in pandemic times, a set of
comparable and uniformly annotated multilingual corpora for 17 national parliaments were …
ParlaMint II: advancing comparable parliamentary corpora across Europe
The paper presents the results of the ParlaMint II project, which comprise comparable
corpora of parliamentary debates of 29 European countries and autonomous regions …
The ParlaSpeech collection of automatically generated speech and text datasets from parliamentary proceedings
Recent significant improvements in speech and language technologies come both from self-
supervised approaches over raw language data as well as various types of explicit …
Annotating Attribution in Czech News Server Articles
This paper focuses on detection of sources in the Czech articles published on a news server
of Czech public radio. In particular, we search for attribution in sentences and we recognize …
Adding the Basque parliament corpus to ParlaMint project
J Alkorta, MI Quintian - … of the Workshop ParlaCLARIN III within the …, 2022 - aclanthology.org
The aim of this work is to describe the collection created with transcripts of the Basque
parliamentary speeches. This corpus follows the constraints of the ParlaMint project. The …
Speech-Informed Inverse Text Normalization
V Stankov - 2024 - dspace.cuni.cz
In the domain of Automatic Speech Recognition (ASR), Inverse Text Normalization (ITN) is
applied after the speech recognition step to transform recognized verbalized text into written …
Text-to-Speech Personalization
M Luner - 2022 - excel.fit.vutbr.cz
This work aims to develop a model that can convert Czech input text into speech that closely
resembles a target speaker. The chosen approach involves training a base text-to-speech …
GREC: Multi-domain Speech Recognition for the Greek
GG Rouvalis - 2023 - pergamos.lib.uoa.gr
One of the leading challenges in Automatic Speech Recognition (ASR) is the development
of robust systems that can perform well under multiple settings. In this work we construct and …
Workflow and Metadata Challenges in the ParlaMint Project: Insights from Building the ParlaMint-UA Corpus
A Kryvenko, M Kopp - CLARIN Annual Conference Proceedings, 2023 - helda.helsinki.fi
This paper focuses on the challenges of refining the workflow for collecting and adding
metadata to the ParlaMint corpora designed for research in the social sciences and …