---
datasets:
- HuggingFaceFW/fineweb
base_model:
- openai-community/gpt2
---

# NanoGPT Speedrun

Following https://github.com/KellerJordan/modded-nanogpt for fun and learning.

## Run Info

**baseline/**

- Run on Lightning AI cloud, using one L40S GPU
- Batch size set to 32
- VRAM usage: 26.95 GB (25698 MiB reported by `nvidia-smi`)
- ~4 seconds per step, 3200 steps total
- Checkpoint saved every 320 steps (these parameters are collected in the sketch below)
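
The run parameters above map onto a training loop roughly as follows. This is a minimal sketch, not the actual modded-nanogpt code; `train_step` and `save_checkpoint` are hypothetical stand-ins.

```python
BATCH_SIZE = 32     # sequences per optimizer step
TOTAL_STEPS = 3200  # at ~4 s/step, about 3.5 hours on one L40S
CKPT_EVERY = 320    # yields 10 checkpoints over the run


def train_step(batch_size: int) -> float:
    """Hypothetical stand-in for one forward/backward/optimizer step."""
    return 0.0  # would return the training loss


def save_checkpoint(step: int) -> None:
    """Hypothetical stand-in for writing model/optimizer state to disk."""


for step in range(1, TOTAL_STEPS + 1):
    loss = train_step(BATCH_SIZE)
    if step % CKPT_EVERY == 0:
        save_checkpoint(step)
```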

## Training loss

To empirically check the neural scaling law (training loss decaying roughly as a power law in the number of steps):

![baseline/analysis/loss_plot2.png](baseline/analysis/loss_plot2.png)

(Fitted line: `log y = -0.11 * log x + 0.9`, where `x` is the training step (0 to 3200) and `y` is the training loss.)
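
A straight line in log-log space corresponds to a power law, here roughly `y = e^0.9 * x^-0.11`. A minimal sketch of how such a fit can be reproduced with `np.polyfit` on log-log data; the loss curve below is synthetic, so substitute the actual logged `(step, loss)` pairs:

```python
import numpy as np

# Synthetic stand-in for the logged loss curve; replace with the real
# (step, loss) pairs. Step 0 is skipped since log(0) is undefined.
steps = np.arange(1, 3201)
losses = np.exp(0.9) * steps ** -0.11

# Fit log y = a * log x + b; the slope a is the power-law exponent.
a, b = np.polyfit(np.log(steps), np.log(losses), deg=1)
print(f"log y = {a:.2f} * log x + {b:.2f}")  # -> log y = -0.11 * log x + 0.90
```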

## Demo

Available at https://huggingface.co/spaces/lemonteaa/nanogpt-speedrun-demo

(WIP; a sketch of a minimal version is below.)
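
A Space like this can be a small Gradio app along the following lines. This is a sketch only, not the actual Space code: it assumes the trained checkpoint has been exported to a Hugging Face GPT-2-style model directory at the hypothetical path `./baseline/hf_model`.

```python
import gradio as gr
from transformers import pipeline

# Hypothetical path: assumes the NanoGPT checkpoint was exported to a
# Hugging Face GPT-2-style model directory.
generator = pipeline("text-generation", model="./baseline/hf_model")


def generate(prompt: str, max_new_tokens: float) -> str:
    out = generator(prompt, max_new_tokens=int(max_new_tokens))
    return out[0]["generated_text"]


gr.Interface(
    fn=generate,
    inputs=[
        gr.Textbox(label="Prompt"),
        gr.Slider(1, 256, value=64, step=1, label="Max new tokens"),
    ],
    outputs=gr.Textbox(label="Completion"),
    title="NanoGPT Speedrun Demo",
).launch()
```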