Assessing the ineffectiveness of synthetic reinforcement learning feedback in fine-tuning large language models

S Whitmore, C Harrington, E Pritchard - 2024 - osf.io
The rapid evolution of artificial intelligence has brought significant advancements in various
applications, yet fine-tuning models to align outputs with user needs and ethical standards …

Hardware accelerator design for sparse DNN inference and training: A tutorial

W Mao, M Wang, X Xie, X Wu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Deep neural networks (DNNs) are widely used in many fields, such as artificial intelligence
generated content (AIGC) and robotics. To efficiently support these tasks, the model pruning …
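The pruning such accelerators exploit typically starts from simple magnitude-based sparsification. A minimal, hypothetical NumPy sketch of that step (not code from the tutorial itself):

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out (at least) the k smallest-magnitude weights, where
    k = sparsity * weights.size."""
    k = int(weights.size * sparsity)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold.
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

w = np.random.randn(4, 4).astype(np.float32)
w_sparse = magnitude_prune(w, sparsity=0.75)
print(f"nonzero fraction: {np.count_nonzero(w_sparse) / w_sparse.size:.2f}")
```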

Q-GaLore: Quantized GaLore with INT4 projection and layer-adaptive low-rank gradients

Z Zhang, A Jaiswal, L Yin, S Liu, J Zhao, Y Tian… - arXiv preprint arXiv …, 2024 - arxiv.org
Training Large Language Models (LLMs) is memory-intensive due to the large number of
parameters and associated optimization states. GaLore, a recent method, reduces memory …
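For context, GaLore's memory saving comes from keeping optimizer state in a low-rank subspace of the gradient. A rough illustrative sketch of that projection idea, assuming a plain SGD update rather than GaLore's actual optimizer or Q-GaLore's INT4-quantized projection:

```python
import numpy as np

def galore_step(weight, grad, rank=4, lr=1e-2):
    # Project the full gradient onto its top-`rank` left singular vectors.
    u, _, _ = np.linalg.svd(grad, full_matrices=False)
    p = u[:, :rank]                # projection matrix (m x r)
    low_rank_grad = p.T @ grad     # compact gradient (r x n)
    # Optimizer state (e.g. Adam moments) would be kept at this r x n
    # size instead of m x n; here we just project back and apply SGD.
    return weight - lr * (p @ low_rank_grad)

w = np.random.randn(64, 32)
g = np.random.randn(64, 32)
w = galore_step(w, g)
```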

A Comprehensive Performance Study of Large Language Models on Novel AI Accelerators

M Emani, S Foreman, V Sastry, Z Xie, S Raskar… - arXiv preprint arXiv …, 2023 - arxiv.org
Artificial intelligence (AI) methods have become critical tools in scientific applications, helping to accelerate discovery. Large language models (LLMs) are being considered as a …

Efficient Deep Learning Infrastructures for Embedded Computing Systems: A Comprehensive Survey and Future Envision

X Luo, D Liu, H Kong, S Huai, H Chen… - ACM Transactions on …, 2024 - dl.acm.org
Deep neural networks (DNNs) have recently achieved impressive success across a wide
range of real-world vision and language processing tasks, spanning from image …

Enhancing zero-shot crypto sentiment with fine-tuned language model and prompt engineering

RSM Wahidur, I Tashdeed, M Kaur, HN Lee - IEEE Access, 2024 - ieeexplore.ieee.org
Blockchain technology has revolutionized the financial landscape, witnessing widespread
adoption of cryptocurrencies due to their decentralized and transparent nature. As …
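A hypothetical zero-shot prompt of the kind such studies evaluate; the authors' exact template and model are not shown in this snippet:

```python
def build_sentiment_prompt(post: str) -> str:
    # Illustrative zero-shot classification prompt for crypto posts.
    return (
        "Classify the sentiment of the following cryptocurrency post as "
        "Positive, Negative, or Neutral. Answer with one word.\n\n"
        f"Post: {post}\nSentiment:"
    )

print(build_sentiment_prompt("BTC just broke its all-time high!"))
```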

Sparsity-Accelerated Training for Large Language Models

D Ma, L Chen, P Wang, H Xu, H Li, L Sun, S Zhu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) have demonstrated proficiency across various natural
language processing (NLP) tasks but often require additional training, such as continual pre …
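A minimal sketch of the general idea behind sparsity-accelerated training: restrict computation to an active subset of hidden neurons so forward/backward cost scales with the number of active units. The random selection rule here is a placeholder, not the paper's criterion:

```python
import numpy as np

def sparse_forward(x, w, active_fraction=0.25, rng=np.random.default_rng(0)):
    hidden = w.shape[1]
    # Placeholder neuron selection; the paper's method chooses neurons
    # by a learned/importance-based rule rather than at random.
    active = rng.choice(hidden, size=int(hidden * active_fraction), replace=False)
    h = np.zeros((x.shape[0], hidden))
    h[:, active] = x @ w[:, active]   # compute only the active columns
    return h, active

x = np.random.randn(8, 16)
w = np.random.randn(16, 64)
h, active = sparse_forward(x, w)
```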

SLoPe: Double-Pruned Sparse Plus Lazy Low-Rank Adapter Pretraining of LLMs

M Mozaffari, A Yazdanbakhsh, Z Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
We propose SLoPe, a Double-Pruned Sparse Plus Lazy Low-rank Adapter Pretraining
method for LLMs that improves the accuracy of sparse LLMs while accelerating their …
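A minimal sketch of a sparse-plus-low-rank layer in the spirit of SLoPe's formulation: a pruned weight matrix carries most of the capacity while a small low-rank adapter recovers accuracy lost to pruning. The shapes, the crude pruning rule, and the zero-initialized adapter are illustrative assumptions:

```python
import numpy as np

def sparse_plus_lowrank_forward(x, w_sparse, a, b):
    # y = x @ (W_sparse + B A)^T, computed without materializing B A.
    return x @ w_sparse.T + (x @ a.T) @ b.T

d_in, d_out, rank = 32, 64, 4
w = np.random.randn(d_out, d_in)
w_sparse = w * (np.abs(w) > np.median(np.abs(w)))  # crude 50% magnitude pruning
a = np.random.randn(rank, d_in)                    # adapter down-projection
b = np.zeros((d_out, rank))                        # adapter up-projection, zero init
y = sparse_plus_lowrank_forward(np.random.randn(8, d_in), w_sparse, a, b)
```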

Deciphering Compatibility Relationships with Textual Descriptions via Extraction and Explanation

Y Wang, Z He, Z He, H Xu, J McAuley - Proceedings of the AAAI …, 2024 - ojs.aaai.org
Understanding and accurately explaining compatibility relationships between fashion items
is a challenging problem in the burgeoning domain of AI-driven outfit recommendations …

ProMoAI: Process Modeling with Generative AI

H Kourani, A Berti, D Schuster… - arXiv preprint arXiv …, 2024 - arxiv.org
ProMoAI is a novel tool that leverages Large Language Models (LLMs) to automatically
generate process models from textual descriptions, incorporating advanced prompt …