JSI and WüNLP at the DIALECT-COPA Shared Task: In-Context Learning From Just a Few Dialectal Examples Gets You Quite Far
The paper presents the JSI and WüNLP systems submitted to the DIALECT-COPA shared
task on causal commonsense reasoning in dialectal texts. Jointly, we compare LLM-based …
Do LLMs learn a true syntactic universal?
J Hale, M Stanojević - Proceedings of the 2024 Conference on …, 2024 - aclanthology.org
Do large multilingual language models learn language universals? We consider a
candidate universal much-discussed in the linguistics literature, the Final-over-Final …
ParlaMint II: advancing comparable parliamentary corpora across Europe
The paper presents the results of the ParlaMint II project, which comprise comparable
corpora of parliamentary debates of 29 European countries and autonomous regions …
The ParlaSpeech Collection of Automatically Generated Speech and Text Datasets from Parliamentary Proceedings
Recent significant improvements in speech and language technologies come both from self-
supervised approaches over raw language data as well as various types of explicit …
CLASSLA-web: Comparable Web Corpora of South Slavic Languages Enriched with Linguistic and Genre Annotation
N Ljubešić, T Kuzman - arXiv preprint arXiv:2403.12721, 2024 - arxiv.org
This paper presents a collection of highly comparable web corpora of Slovenian, Croatian,
Bosnian, Montenegrin, Serbian, Macedonian, and Bulgarian, covering thereby the whole …
Slovenian parliamentary corpus siParl
Parliamentary debates represent an essential part of democratic discourse and provide
insights into various socio-demographic and linguistic phenomena; parliamentary corpora …
CLASSLA-Express: a Train of CLARIN.SI Workshops on Language Resources and Tools with Easily Expanding Route
This paper introduces the CLASSLA-Express workshop series as an innovative approach to
disseminating linguistic resources and infrastructure provided by the CLASSLA Knowledge …
Classification of Lyric Poetry Written in Serbian
V Kadić, S Milanović, V Batanović - 2024 32nd …, 2024 - ieeexplore.ieee.org
In terms of natural language processing, Serbian belongs to low-resource languages, with a
small number of available datasets and tools. In this paper, we present a novel poem …
Dependency parser for Bulgarian
A Atanasov - Proceedings of the Sixth International Conference …, 2024 - aclanthology.org
This paper delves into the implementation of a Biaffine Attention Model, a sophisticated
neural network architecture employed for dependency parsing tasks. Proposed by Dozat …
Gos 2: A New Reference Corpus of Spoken Slovenian
This paper introduces a new version of the Gos reference corpus of spoken Slovenian,
which was recently extended to more than double the original size (300 hours, 2.4 million …