Anonymity at Risk? Assessing Re-Identification Capabilities of Large Language Models
Anonymity of both natural and legal persons in court rulings is a critical aspect of privacy
protection in the European Union and Switzerland. With the advent of LLMs, concerns about …
Modular Adaptation of Multilingual Encoders to Written Swiss German Dialect
Creating neural text encoders for written Swiss German is challenging due to a dearth of
training data combined with dialectal variation. In this paper, we build on several existing …
Bridging the Gap: Transfer Learning from English PLMs to Malaysian English
Malaysian English is a low resource creole language, where it carries the elements of
Malay, Chinese, and Tamil languages, in addition to Standard English. Named Entity …
Fine-tuning the SwissBERT Encoder Model for Embedding Sentences and Documents
J Grosjean, J Vamvas - arXiv preprint arXiv:2405.07513, 2024 - arxiv.org
Encoder models trained for the embedding of sentences or short documents have proven
useful for tasks such as semantic search and topic modeling. In this paper, we present a …
Swissdox@LiRI – a large database of media articles made accessible to researchers
J Graën, I Mustac, N Rajovic, J Schaber… - CLARIN annual …, 2023 - zora.uzh.ch
This article presents our efforts to make a large collection of Swiss newspaper articles
available for research purposes. We describe the resource, detail the concept of financing …
Introducing embed2discover: A tool for semi-automated, dictionary-based content-analysis
L Brandenberger, O Bakhteev, JM Fernandez… - files.osf.io
We introduce embed2discover, a new tool for dictionary-based content analysis. The tool
combines state-of-the-art machine learning and language model methodologies with …
Swissdox@LiRI – a large database of media articles made accessible to researchers
J Schaber, J Graën, I Mustač, N Rajović, G Schneider… - clarin.eu
The 'Schweizer Mediendatenbank AG'(SMD) is a nonprofit joint venture of three big Swiss
media groups with the purpose of collecting print and online publications, as well as TV …