A Generative Pretrained Transformer (GPT) model with 124.4 million parameters, trained from scratch on a custom dataset of 200 short stories by Anton Chekhov. The model configuration matches GPT-2 small:
```python
CONFIG = {
    "vocab_size": 50257,     # GPT-2 BPE tokenizer vocabulary size
    "context_length": 1024,  # maximum sequence length
    "emb_dim": 768,          # embedding / hidden dimension
    "n_heads": 12,           # attention heads per block
    "n_layers": 12,          # transformer blocks
    "drop_rate": 0.1,        # dropout probability
    "qkv_bias": False        # no bias terms in the QKV projections
}
```
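
With the output head weight-tied to the token embedding, this configuration works out to roughly 124.4 million trainable parameters. Below is a minimal sanity-check sketch, assuming the standard GPT-2 block layout (pre-norm blocks with a 4x MLP expansion); the `count_params` helper is illustrative and not part of this repository:

```python
def count_params(cfg: dict) -> int:
    """Parameter count for a GPT-2-style model, assuming a tied output head."""
    d, n_layers = cfg["emb_dim"], cfg["n_layers"]
    tok_emb = cfg["vocab_size"] * d          # token embedding (shared with output head)
    pos_emb = cfg["context_length"] * d      # learned positional embedding
    qkv = d * 3 * d + (3 * d if cfg["qkv_bias"] else 0)  # QKV projections
    attn_out = d * d + d                     # attention output projection + bias
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)          # two linear layers, 4x expansion
    ln = 2 * (2 * d)                         # two LayerNorms per block (scale + shift)
    block = qkv + attn_out + mlp + ln
    final_ln = 2 * d                         # LayerNorm before the output head
    return tok_emb + pos_emb + n_layers * block + final_ln

CONFIG = {
    "vocab_size": 50257, "context_length": 1024, "emb_dim": 768,
    "n_heads": 12, "n_layers": 12, "drop_rate": 0.1, "qkv_bias": False,
}
print(f"{count_params(CONFIG):,}")  # 124,412,160 -> ~124.4M
```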