Guilherme Penedo
ML Research Engineer at 🤗 HuggingFace
Verified email at huggingface.co
Title
Cited by
Year
The RefinedWeb dataset for Falcon LLM: outperforming curated corpora with web data, and web data only
G Penedo, Q Malartic, D Hesslow, R Cojocaru, A Cappelli, H Alobeidli, ...
arXiv preprint arXiv:2306.01116, 2023
Cited by: 546 | Year: 2023
Falcon-40B: an open large language model with state-of-the-art performance
E Almazrouei, H Alobeidli, A Alshamsi, A Cappelli, R Cojocaru, M Debbah, ...
Cited by: 199 | Year: 2023
The falcon series of open language models
E Almazrouei, H Alobeidli, A Alshamsi, A Cappelli, R Cojocaru, M Debbah, ...
arXiv preprint arXiv:2311.16867, 2023
Cited by: 197 | Year: 2023
The falcon series of language models: Towards open frontier models
E Almazrouei, H Alobeidli, A Alshamsi, A Cappelli, R Cojocaru, ...
Hugging Face repository, 2023
Cited by: 31 | Year: 2023
The RefinedWeb dataset for Falcon LLM: Outperforming curated corpora with web data only
G Penedo, Q Malartic, D Hesslow, R Cojocaru, H Alobeidli, A Cappelli, ...
Advances in Neural Information Processing Systems 36, 79155-79172, 2023
Cited by: 30 | Year: 2023
The refinedweb dataset for falcon llm: Outperforming curated corpora with web data, and web data only. arXiv 2023
G Penedo, Q Malartic, D Hesslow, R Cojocaru, A Cappelli, H Alobeidli, ...
arXiv preprint arXiv:2306.01116
Cited by: 24
Falcon-40B: an open large language model with state-of-the-art performance. 2023
E Almazrouei, H Alobeidli, A Alshamsi, A Cappelli, R Cojocaru, M Debbah, ...
URL https://falconllm.tii.ae, 2023
Cited by: 10 | Year: 2023
The refinedweb dataset for falcon llm: Outperforming curated corpora with web data only
G Penedo, Q Malartic, D Hesslow, R Cojocaru, H Alobeidli, A Cappelli, ...
Advances in Neural Information Processing Systems 36, 2024
Cited by: 8 | Year: 2024
The Falcon Series of Open Language Models.(2023)
E Almazrouei, H Alobeidli, A Alshamsi, A Cappelli, R Cojocaru, M Debbah, ...
arXiv preprint arXiv:2311.16867, 2023
Cited by: 8 | Year: 2023
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only.” arXiv
G Penedo, Q Malartic, D Hesslow, R Cojocaru, A Cappelli, H Alobeidli, ...
arXiv preprint arXiv:2306.01116, 2023
Cited by: 7 | Year: 2023
AlGhafa Evaluation Benchmark for Arabic Language Models
E Almazrouei, R Cojocaru, M Baldo, Q Malartic, H Alobeidli, D Mazzotta, ...
Proceedings of ArabicNLP 2023, 244-275, 2023
Cited by: 6 | Year: 2023
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
G Penedo, H Kydlíček, A Lozhkov, M Mitchell, C Raffel, L Von Werra, ...
arXiv preprint arXiv:2406.17557, 2024
Cited by: 1 | Year: 2024
Artery in Microgravity (AIM): Assembly, integration, and testing for a student payload for the ISS
L García Mozos, D Saroya, Y Roelvink, N Santos D'Amore, S Gabetti, ...
4th Symposium on Space Educational Activities, 2022
2022