DistiLLM: Towards streamlined distillation for large language models
Knowledge distillation (KD) is widely used for compressing a teacher model to a smaller
student model, reducing its inference cost and memory footprint while preserving model …
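For context, the abstract above refers to the standard teacher-student distillation setup; a minimal sketch of a generic (Hinton-style) KD objective in PyTorch is shown below. The function name, temperature T, and mixing weight alpha are illustrative choices, and this is ordinary classification-style KD rather than DistiLLM's specific streamlined objective.

```python
# Generic knowledge-distillation loss sketch (not DistiLLM's method).
# Assumes `student_logits` and `teacher_logits` have the same shape and
# `labels` holds ground-truth class indices.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: student matches the teacher's temperature-smoothed distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradients are comparable across temperatures
    # Hard targets: ordinary cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```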
BBox-Adapter: Lightweight Adapting for Black-Box Large Language Models
Adapting state-of-the-art Large Language Models (LLMs) like GPT-4 and Gemini for specific
tasks is challenging. Due to the opacity in their parameters, embeddings, and even output …
Efficient fine-tuning large language models for knowledge-aware response planning
Large Language Models (LLMs) have shown impressive emergent language
capabilities, especially in applications with high ambiguity, such as language reasoning and …
CollectiveSFT: Scaling Large Language Models for Chinese Medical Benchmark with Collective Instructions in Healthcare
The rapid progress in Large Language Models (LLMs) has prompted the creation of
numerous benchmarks to evaluate their capabilities. This study focuses on the …