---
datasets:
- HuggingFaceFW/fineweb
base_model:
- openai-community/gpt2
---
# NanoGPT Speedrun
Following https://github.com/KellerJordan/modded-nanogpt for fun and learning.
## Run Info
**baseline/**
- Run on Lightning cloud using a single L40S GPU
- Batch size set to 32
- VRAM usage: 26.95 GB (25698 MiB reported by `nvidia-smi`)
- 4 seconds per step, 3200 steps total
- Checkpoint saved every 320 steps
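From the numbers above, a quick back-of-envelope estimate of the run's wall-clock time and checkpoint count:

```python
# Back-of-envelope estimate from the run info above:
# 4 s/step, 3200 steps total, checkpoint every 320 steps.
sec_per_step = 4
total_steps = 3200
ckpt_interval = 320

total_sec = sec_per_step * total_steps
print(f"total wall-clock: {total_sec / 3600:.2f} h")   # ~3.56 h
print(f"checkpoints saved: {total_steps // ckpt_interval}")  # 10
```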
## Training loss
To experimentally check the neural scaling law:

![baseline/analysis/loss_plot2.png](baseline/analysis/loss_plot2.png)

(Fitted line: `log y = -0.11 * log x + 0.9`, where x is the step number (1 to 3200) and y is the training loss.)
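A fit like this can be reproduced by linear regression in log-log space. A minimal sketch, using synthetic loss values generated from the fitted power law itself (the actual per-step loss data is in the repo's analysis folder, not shown here):

```python
import numpy as np

# Synthetic loss curve following the fitted power law:
# log10(y) = -0.11 * log10(x) + 0.9  <=>  y = 10**0.9 * x**(-0.11)
steps = np.arange(1, 3201)
loss = 10**0.9 * steps ** (-0.11)

# Recover slope and intercept with a least-squares fit in log-log space
slope, intercept = np.polyfit(np.log10(steps), np.log10(loss), 1)
print(round(slope, 2), round(intercept, 2))  # -0.11 0.9
```

On real (noisy) loss values the same `np.polyfit` call gives the least-squares estimate of the scaling exponent.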
## Demo
Available at https://huggingface.co/spaces/lemonteaa/nanogpt-speedrun-demo
(WIP)