#gpt #language-model #training #transformer #machine-learning #neural-network #gpu

bin+lib femto-gpt

A minimal Rust library for training GPT language models

2 unstable releases

0.2.0 May 28, 2024
0.1.0 Jun 12, 2023

#62 in Machine learning

82 downloads per month

MIT license

470KB
4.5K SLoC

🤖 femtoGPT

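femtoGPT is a pure Rust implementation of a minimal Generative Pretrained Transformer.

It can be used for both inference and training of GPT-style language models, on CPUs and GPUs!

(HEY! I'm also writing a book, which will soon discuss the implementation of an LLM in detail! Check it out here: The Super Programmer)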

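Intro

Everything is implemented from scratch, including the tensor processing logic and the training/inference code of a minimal GPT architecture.

The architecture is very similar, almost identical, to the one in Andrej Karpathy's nanoGPT video lecture.

femtoGPT is a great starting point for anyone who is fascinated by LLMs and wants to understand in depth how these models work.

femtoGPT uses nothing but random generation libraries (rand/rand-distr), data-serialization libraries (serde/bincode, for saving/loading already trained models) and a parallel computing library (rayon).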

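femtoGPT used to be EXTREMELY SLOW and is now relatively fast on CPU 😉, even though most of the primitive operations (e.g. matrix multiplication) are implemented in the simplest way possible.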
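
To give a feel for what "the simplest way possible" can mean for matrix multiplication, here is a minimal sketch of the textbook triple loop over row-major buffers. The function name `matmul_naive` and the flat `&[f32]` layout are assumptions made for illustration; this is not femtoGPT's actual tensor code.

```rust
/// Multiply an (m x k) matrix `a` by a (k x n) matrix `b`.
/// Both inputs and the result are stored row-major in flat buffers.
/// Hypothetical helper for illustration, not femtoGPT's API.
fn matmul_naive(a: &[f32], b: &[f32], m: usize, k: usize, n: usize) -> Vec<f32> {
    assert_eq!(a.len(), m * k, "a must hold m*k elements");
    assert_eq!(b.len(), k * n, "b must hold k*n elements");
    let mut out = vec![0.0f32; m * n];
    for i in 0..m {
        for j in 0..n {
            // Dot product of row i of `a` with column j of `b`.
            let mut acc = 0.0f32;
            for p in 0..k {
                acc += a[i * k + p] * b[p * n + j];
            }
            out[i * n + j] = acc;
        }
    }
    out
}

fn main() {
    // [[1, 2], [3, 4]] times [[5, 6], [7, 8]]
    let a: [f32; 4] = [1.0, 2.0, 3.0, 4.0];
    let b: [f32; 4] = [5.0, 6.0, 7.0, 8.0];
    let c = matmul_naive(&a, &b, 2, 2, 2);
    println!("{:?}", c);
}
```

A loop like this does no blocking or vectorization, which is why naive implementations are easy to read but leave a lot of performance on the table.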

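Correctness of the gradients is checked using the gradient-check method, though it is still very possible that some layers are implemented wrongly.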
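
For readers who have not seen gradient checking before, the idea can be sketched as follows: compare a hand-written (analytic) gradient against a central-difference estimate and make sure they agree. The toy function `loss`, its gradient `loss_grad`, and the helpers below are assumptions for illustration only; femtoGPT's own checks run against its real layers.

```rust
/// Estimate the gradient of `f` at `x` with central differences:
/// df/dx_i ≈ (f(x + eps*e_i) - f(x - eps*e_i)) / (2*eps).
fn numerical_gradient<F: Fn(&[f64]) -> f64>(f: F, x: &[f64], eps: f64) -> Vec<f64> {
    let mut probe = x.to_vec();
    let mut grad = Vec::with_capacity(x.len());
    for i in 0..x.len() {
        let original = probe[i];
        probe[i] = original + eps;
        let plus = f(&probe);
        probe[i] = original - eps;
        let minus = f(&probe);
        probe[i] = original; // restore before moving to the next coordinate
        grad.push((plus - minus) / (2.0 * eps));
    }
    grad
}

/// Largest absolute difference between two gradients.
fn max_abs_diff(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).fold(0.0, f64::max)
}

/// Toy stand-in for a layer's loss: f(x) = sum of x_i^2.
fn loss(x: &[f64]) -> f64 {
    x.iter().map(|v| v * v).sum()
}

/// Hand-derived gradient of `loss`: df/dx_i = 2 * x_i.
fn loss_grad(x: &[f64]) -> Vec<f64> {
    x.iter().map(|v| 2.0 * v).collect()
}

fn main() {
    let x = [0.5, -1.25, 2.0];
    let numeric = numerical_gradient(loss, &x, 1e-5);
    let analytic = loss_grad(&x);
    let err = max_abs_diff(&numeric, &analytic);
    println!("max abs error: {err:e}");
    assert!(err < 1e-6, "analytic gradient disagrees with numerical estimate");
}
```

A check like this only covers the inputs it is run on, which is one reason a passing gradient check does not rule out every bug in a layer.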

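(Discord server for discussions around the project!)

Usage

Make sure you have the Rust toolchain installed on your system, so that you can compile and run the project: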

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

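If you want to train on a GPU, first make sure the GPU drivers are correctly installed on your system and that their OpenCL runtimes are available.

On Debian systems, you can set up the OpenCL runtime by installing the package ocl-icd-opencl-dev: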

sudo apt install ocl-icd-opencl-dev

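GOOD NEWS! Since femtoGPT's GPU implementation is based on OpenCL, it can run on both NVIDIA and AMD cards, and you won't need to install the heavyweight CUDA toolkit on your system. The OpenCL runtime is enough!

Now you just need to put the text you want to train your GPT model on inside dataset.txt. Make sure it has a small number of unique characters! (E.g. the current dataset uses only 65 different unique characters!)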
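
If you are unsure how many unique characters your text has, a small standalone check like the one below can help. It is a sketch that assumes a character-level vocabulary, where every distinct character becomes one token; the helper names `char_vocab` and `encode` are made up for this example and are not part of femtoGPT.

```rust
use std::collections::BTreeSet;

/// Collect the distinct characters of `text` in sorted order.
fn char_vocab(text: &str) -> Vec<char> {
    text.chars().collect::<BTreeSet<char>>().into_iter().collect()
}

/// Map each character to its index in the sorted vocabulary.
/// Returns None if the text contains a character the vocabulary lacks.
fn encode(text: &str, vocab: &[char]) -> Option<Vec<usize>> {
    text.chars().map(|c| vocab.binary_search(&c).ok()).collect()
}

fn main() {
    // "dataset.txt" is the file name used above; fall back to a tiny
    // sample so the sketch still runs when the file is missing.
    let text = std::fs::read_to_string("dataset.txt")
        .unwrap_or_else(|_| String::from("hello world"));
    let vocab = char_vocab(&text);
    println!("{} unique characters", vocab.len());
    if let Some(ids) = encode("hello", &vocab) {
        println!("\"hello\" as token ids: {:?}", ids);
    }
}
```

With one token per character, the vocabulary size determines the width of the model's embedding table and output layer, so a text with few distinct characters keeps both small.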

Then you'll need to run:

cargo run --release

It will start training the model and will put the training data in the train_data directory. You can stop the training and continue later!
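
Stopping and resuming works because the trained parameters are written to disk and read back on the next run. The sketch below shows that round trip using only the standard library. Note the assumptions: femtoGPT itself uses serde/bincode for this, and the raw little-endian format, the file name, and the helpers `save_weights` and `load_weights` are all hypothetical.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Write the weights as raw little-endian f32 values.
/// Hypothetical format for illustration, not femtoGPT's on-disk layout.
fn save_weights(path: &Path, weights: &[f32]) -> io::Result<()> {
    let bytes: Vec<u8> = weights.iter().flat_map(|w| w.to_le_bytes()).collect();
    fs::write(path, bytes)
}

/// Read weights back from a file written by `save_weights`.
fn load_weights(path: &Path) -> io::Result<Vec<f32>> {
    let bytes = fs::read(path)?;
    Ok(bytes
        .chunks_exact(4)
        .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect())
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("femto_sketch_weights.bin");
    // Resume from a previous run if a checkpoint exists, else start fresh.
    let mut weights = load_weights(&path).unwrap_or_else(|_| vec![0.0; 4]);
    // Stand-in for one training step.
    for w in weights.iter_mut() {
        *w += 0.1;
    }
    save_weights(&path, &weights)?;
    println!("saved {} weights to {}", weights.len(), path.display());
    Ok(())
}
```

Running the sketch twice shows the second run picking up the values the first run saved.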

Output samples

After hours of training on the Shakespeare dataset, on a 300k-parameter model, this has been the output:

LIS:
Tore hend shater sorerds tougeng an herdofed seng he borind,
Ound ourere sthe, a sou so tousthe ashtherd, m se a man stousshan here hat mend serthe fo witownderstesther s ars at atheno sel theas,
thisth t are sorind bour win soutinds mater horengher

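This is embarrassingly bad, but looking at the bright side, the model seems to be able to generate words that are easy to pronounce.

I'm currently training a 10M-parameter model to further examine the correctness of my implementation.

UPDATE 5th June 2023:

This is a new output, after more hours of training on a model of similar scale: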

What like but wore pad wo me che nogns yous dares,
As supt it nind bupart 'the reed:
And hils not es

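Apparently the model has started to learn some words and punctuation rules!

UPDATE 9th June 2023:

The model was able to reach a loss value of ~1.4.

Here is an example output: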

Adistition gone; true; schistoes for mine souls!
Before your home, bariechts should be
Carlam on that's a worf quirer of him so.
What look'd lack away more
To him foot; one hour fortious of saves:
Son;
'Tis all Earl mmistling me.

HARSARTIO:
Why, idless of my mocks fan that were percious.
Having I will thou should and the muour ne'er shor
To purple, when duke thy are out him.
But will bid you doth remember nature.
Even OF hencomey, carniffeit, I joy
Warming; my deed, but it on of mortard,

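UPDATE 10th June 2023, sparks of AGI? LOL

A 5-layer model was trained for about 10 hours on a 32-core CPU, on a dataset gathered from Reddit and tokenized with the sentencepiece tokenizer with a vocab size of 500:

Prompt "I think":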

I think it s not a scientific method of atheism
I think it s not a fairly simple possible to have a
I think that s a possible point, but I m not sure if you re not
I think that s a great idea, but I m not sure how these are you talking
I think it s a simple thing to do, but I m not sure how thes
I think that s a good thing, but I don t think that s what I mean, I think I
I think that s the only way to be able to use it to be in the same way, but
I think that the police are relatively relatively relative
I think that s a simple moron is a simple character
I think that s the only way to go to the second side of the pol

Prompt "Life is":

Life is not a good idea, but it s not a perfection of the opp
Life is a good thing to say that I don t know what you re talking about, but I don
Life is not the same as a religion, but I m not sure if you re a
Life is a perfectly good job of arguing that you are alm
Life is a perfectly good job of the opposite of the f
Life is a fundamentalist, and then I m not sure how the h
Life is not a good idea, and it s not a perfectly good job, but I
Life is not the same as atheists, but that s the only way to be ac
Life is a bit of a single one of these industry is a f
Life is a good idea to get the opposite of the police offic

Prompt "So sad that":

So sad that you can tell you what? I think I ve been using it on the scre
So sad that I don t know about it, but I don t think I m not afraid to
So sad that I m not sure if you re not arguing with the fact that you
So sad that I was involved in the future, and I have a few we
So sad that s what I said, I m sure you are almost everything you
So sad that you can do it, and I don t think that the fact that it s a po
So sad that I m not sure if you re arguing with the fact that they are
So sad that s the one too much time, but I m not sure if you re arg
So sad that you are sadly supposed to be a big deal in the world
So sad that I don t know about this, but I m not sure how you can do it, but

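UPDATE 29th June 2023

After implementing the GPU trainer, we were able to train larger models. Here are some samples from an 8-layer, 8-head model with an embedding dimension of 128, trained on the TinyStories dataset with a vocab size of 1000: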

Once upon a time, there was a little girl named Lily.
She loved to play with her toys and she had a lot of fun.
One day, Lily saw a big chicky playing with her toys.
She asked her mom, "Can I play with her toys?" Her mom said,
"Sure, Lily. But we have to clean the pales. Let's suet some candy, Lily."
Lily nodded and went to her mom. They played with the mots and staugning her toys.

Once upon a time, there was a little girl named Lily.
She loved to play outside and explore. One day, she found a jung on the ground.
She picked it up and tecked it. She ran around and saw it. She was very sad.
She asked her mom for her mom. Her mom said, "Lily, I'm going to find it!" Lily said.
She ran to the slock and took her to the teplace. She went to the park and found a molla.

There was a boy named Tim. Tim loved to play with his toys.
One day, Tim's mom came to the park. Tim saw a big, red ball and wanted to play with it.
Tim wanted to play with the ball. Tim was very excited. He wanted to play with the ball.
But the ball was too fast. Tim wanted to play with the ball. But the ball was too fast.
Tim tried to catch it, but it was too fast. Tim was sad. He tried to run away,
but he did not want to play. Tim was sad. He did not want to play with the ball.

Dependencies

~2.8–4MB
~83K SLoC