A survey of resource-efficient LLM and multimodal foundation models

M Xu, W Yin, D Cai, R Yi, D Xu, Q Wang, B Wu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large foundation models, including large language models (LLMs), vision transformers
(ViTs), diffusion models, and LLM-based multimodal models, are revolutionizing the entire machine …

ChatGPT vs. Bard: a comparative study

I Ahmed, A Roy, M Kajol, U Hasan, PP Datta… - Authorea …, 2023 - authorea.com
The rapid progress in conversational AI has given rise to advanced language models
capable of generating human-like texts. Among these models, ChatGPT and Bard …

Towards General Industrial Intelligence: A Survey on IIoT-Enhanced Continual Large Models

J Chen, J He, F Chen, Z Lv, J Tang, W Li, Z Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Currently, most applications in the Industrial Internet of Things (IIoT) still rely on CNN-based
neural networks. Although Transformer-based large models (LMs), including language …

Tokenizer Choice For LLM Training: Negligible or Crucial?

M Ali, M Fromm, K Thellmann, R Rutmann… - arXiv preprint arXiv …, 2023 - arxiv.org
The recent success of LLMs has been predominantly driven by curating the training dataset
composition, scaling of model architectures and dataset sizes, and advancements in …
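
As a purely illustrative companion to this entry: "tokenizer choice" in such studies typically means which algorithm and vocabulary are trained on which data. Below is a minimal sketch of training a BPE tokenizer with the Hugging Face tokenizers library; the toy corpus, vocabulary size, and special tokens are our own placeholders, not settings from the paper.

```python
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

# Placeholder corpus; real studies train on large, curated data mixes.
corpus = [
    "Large language models depend on how text is segmented.",
    "Tokenizer choice changes vocabulary coverage and sequence length.",
]

# Byte-pair encoding (BPE), one of the algorithms such comparisons cover.
tokenizer = Tokenizer(models.BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = pre_tokenizers.Whitespace()

trainer = trainers.BpeTrainer(
    vocab_size=1000,                      # assumed; tiny for the demo
    special_tokens=["[UNK]", "[PAD]"],
)
tokenizer.train_from_iterator(corpus, trainer=trainer)

print(tokenizer.encode("Tokenizer choice matters.").tokens)
```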

Reproducibility, Replicability, and Insights into Dense Multi-Representation Retrieval Models: from ColBERT to Col⋆

X Wang, C Macdonald, N Tonellotto… - Proceedings of the 46th …, 2023 - dl.acm.org
Dense multi-representation retrieval models, exemplified by ColBERT, estimate the
relevance between a query and a document based on the similarity of their contextualised …
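
For context on the mechanism this entry describes: ColBERT scores a query-document pair by "late interaction", taking each query token embedding's maximum similarity over the document's token embeddings and summing those maxima. A minimal PyTorch sketch follows; the tensor shapes and the assumption of L2-normalised embeddings are ours.

```python
import torch

def maxsim_score(q_emb: torch.Tensor, d_emb: torch.Tensor) -> float:
    """ColBERT-style late interaction (MaxSim).

    q_emb: [num_query_tokens, dim], d_emb: [num_doc_tokens, dim];
    both assumed L2-normalised, so dot products are cosine similarities.
    """
    sim = q_emb @ d_emb.T                 # [num_query_tokens, num_doc_tokens]
    # For each query token, keep its best-matching document token, then sum.
    return sim.max(dim=1).values.sum().item()

# Toy usage with random normalised embeddings.
q = torch.nn.functional.normalize(torch.randn(32, 128), dim=-1)
d = torch.nn.functional.normalize(torch.randn(180, 128), dim=-1)
print(maxsim_score(q, d))
```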

Hints on the data for language modeling of synthetic languages with transformers

R Zevallos, N Bel - Proceedings of the 61st Annual Meeting of the …, 2023 - aclanthology.org
Language Models (LM) are becoming more and more useful for providing
representations upon which to train Natural Language Processing applications. However …

BioBERTurk: Exploring Turkish Biomedical Language Model Development Strategies in Low-Resource Setting

H Türkmen, O Dikenelli, C Eraslan, MC Çallı… - Journal of Healthcare …, 2023 - Springer
Pretrained language models augmented with in-domain corpora show impressive results in
biomedical and clinical Natural Language Processing (NLP) tasks in English. However …
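
The general recipe behind such work, continued masked-language-model pretraining on in-domain text, can be sketched in a few lines with the Hugging Face transformers library. Everything below is an assumption for illustration: the BERTurk checkpoint as base model, the two made-up Turkish sentences, and all hyperparameters; none of it is the paper's actual setup.

```python
from datasets import Dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

# Hypothetical in-domain sentences; the paper's clinical corpora are not reproduced here.
corpus = Dataset.from_dict({"text": [
    "Akciğer grafisinde belirgin patoloji saptanmadı.",
    "Hastanın klinik bulguları stabil seyretmektedir.",
]})

base = "dbmdz/bert-base-turkish-cased"   # BERTurk, a common Turkish base model
tok = AutoTokenizer.from_pretrained(base)
model = AutoModelForMaskedLM.from_pretrained(base)

ds = corpus.map(lambda b: tok(b["text"], truncation=True, max_length=128),
                batched=True, remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tokenizer=tok, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="mlm-demo",
                           per_device_train_batch_size=2,
                           num_train_epochs=1),
    train_dataset=ds,
    data_collator=collator,
)
trainer.train()  # continued pretraining; downstream fine-tuning comes afterwards
```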

Performance Evaluation of Tokenizers in Large Language Models for the Assamese Language

S Tamang, DJ Bora - arXiv preprint arXiv:2410.03718, 2024 - arxiv.org
The training of a tokenizer plays an important role in the performance of deep learning models.
This research aims to understand the performance of tokenizers in five state-of-the-art …
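
Evaluations of this kind commonly report fertility, the average number of subword tokens produced per whitespace-separated word (lower generally indicates a better fit to the language). A minimal sketch follows; the model list and the sample sentence are our own placeholders, since the five LLMs compared in the paper are truncated in the snippet above.

```python
from transformers import AutoTokenizer

# Hypothetical models; substitute the tokenizers actually under evaluation.
model_names = ["gpt2", "bert-base-multilingual-cased", "xlm-roberta-base"]

# Placeholder sentence; a real evaluation would use an Assamese corpus.
text = "এইটো এটা উদাহৰণ বাক্য।"
words = text.split()

for name in model_names:
    tok = AutoTokenizer.from_pretrained(name)
    n_tokens = len(tok.encode(text, add_special_tokens=False))
    # Fertility: subword tokens per word for this tokenizer on this text.
    print(f"{name}: fertility = {n_tokens / len(words):.2f}")
```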

ARC-NLP at CheckThat!-2022: Contradiction for Harmful Tweet Detection.

C Toraman, O Ozcelik, F Sahinuç, U Sahin - CLEF (Working Notes), 2022 - ceur-ws.org
The target task of our team in the CLEF2022 CheckThat! Lab challenge is Task-1C, harmful
tweet detection. We propose a novel approach, called ARC-NLP-contra, which is a …

Harnessing the power of BERT in the Turkish clinical domain: pretraining approaches for limited data scenarios

H Türkmen, O Dikenelli, C Eraslan, MC Çallı… - arXiv preprint arXiv …, 2023 - arxiv.org
In recent years, major advancements in natural language processing (NLP) have been
driven by the emergence of large language models (LLMs), which have significantly …