The ParlaMint corpora of parliamentary proceedings

T Erjavec, M Ogrodniczuk, P Osenova… - Language resources …, 2023 - Springer
This paper presents the ParlaMint corpora containing transcriptions of the sessions of the 17
European national parliaments with half a billion words. The corpora are uniformly encoded …

ParlaMint II: The show must go on

M Ogrodniczuk, P Osenova, T Erjavec… - Proceedings of the …, 2022 - aclanthology.org
Abstract In ParlaMint I, a CLARIN-ERIC supported project in pandemic times, a set of
comparable and uniformly annotated multilingual corpora for 17 national parliaments were …

ParlaMint II: advancing comparable parliamentary corpora across Europe

T Erjavec, M Kopp, N Ljubešić, T Kuzman… - Language Resources …, 2024 - Springer
The paper presents the results of the ParlaMint II project, which comprise comparable
corpora of parliamentary debates of 29 European countries and autonomous regions …

The ParlaSpeech collection of automatically generated speech and text datasets from parliamentary proceedings

N Ljubešić, P Rupnik, D Koržinek - International Conference on Speech …, 2024 - Springer
Recent significant improvements in speech and language technologies come both from
self-supervised approaches over raw language data as well as various types of explicit …

Annotating Attribution in Czech News Server Articles

B Hladká, J Mírovský, M Kopp… - Proceedings of the …, 2022 - aclanthology.org
This paper focuses on detection of sources in the Czech articles published on a news server
of Czech public radio. In particular, we search for attribution in sentences and we recognize …

Adding the Basque parliament corpus to ParlaMint project

J Alkorta, MI Quintian - … of the Workshop ParlaCLARIN III within the …, 2022 - aclanthology.org
The aim of this work is to describe the collection created with transcript of the Basque
parliamentary speeches. This corpus follows the constraints of the ParlaMint project. The …

Speech-Informed Inverse Text Normalization

V Stankov - 2024 - dspace.cuni.cz
In the domain of Automatic Speech Recognition (ASR), Inverse Text Normalization (ITN) is
applied after the speech recognition step to transform recognized verbalized text into written …

Text-to-Speech Personalization

M Luner - 2022 - excel.fit.vutbr.cz
This work aims to develop a model that can convert Czech input text into speech that closely
resembles a target speaker. The chosen approach involves training a base text-to-speech …

GREC: Multi-domain Speech Recognition for the Greek

GG Rouvalis - 2023 - pergamos.lib.uoa.gr
One of the leading challenges in Automatic Speech Recognition (ASR) is the development
of robust systems that can perform well under multiple settings. In this work we construct and …

Workflow and Metadata Challenges in the ParlaMint Project: Insights from Building the ParlaMint-UA Corpus

A Kryvenko, M Kopp - CLARIN Annual Conference Proceedings, 2023 - helda.helsinki.fi
This paper focuses on the challenges of refining the workflow for collecting and adding
metadata to the ParlaMint corpora designed for research in the social sciences and …