suayptalha committed on
Commit eb4e607 · verified · 1 Parent(s): b817b15

Update README.md

Files changed (1): README.md (+65 −1)
README.md CHANGED

---
license: apache-2.0
library_name: transformers
datasets:
- roneneldan/TinyStories
---

# MinGRU Sentiment Analysis

![minGRU](minGRU.jpg)

The first Hugging Face integration of minGRU models from the paper "[**Were RNNs All We Needed?**](https://arxiv.org/abs/2410.01201)".

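For background, minGRU simplifies the GRU by letting its gates depend only on the current input, which removes the sequential dependency during training (the paper computes the recurrence with a parallel scan). Below is a minimal, illustrative sketch of the update rule only, not the code used by this model:

```py
import torch
import torch.nn as nn

class MinGRUCell(nn.Module):
    """Illustrative sequential form of the minGRU update rule."""
    def __init__(self, dim):
        super().__init__()
        self.to_z = nn.Linear(dim, dim)        # update gate depends only on x_t
        self.to_h_tilde = nn.Linear(dim, dim)  # candidate state depends only on x_t

    def forward(self, x_t, h_prev):
        z = torch.sigmoid(self.to_z(x_t))      # z_t = sigmoid(W_z x_t)
        h_tilde = self.to_h_tilde(x_t)         # h~_t = W_h x_t
        return (1 - z) * h_prev + z * h_tilde  # h_t = (1 - z_t) * h_{t-1} + z_t * h~_t

# Toy usage: one recurrence step for a batch of 2 with hidden size 8.
cell = MinGRUCell(8)
h_t = cell(torch.randn(2, 8), torch.zeros(2, 8))
```
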
This model uses the GPT-2 tokenizer and was trained on the roneneldan/TinyStories dataset.

**Note: This is an experimental model. Don't forget to train the model before use!**

Make sure you install the "[**minGRU-pytorch**](https://github.com/lucidrains/minGRU-pytorch)" library by running `pip install minGRU-pytorch`.

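As a quick check after installing, you can run the layer on random inputs. This is a hedged sketch that assumes the package exposes a `minGRU` module constructed from a feature dimension and applied to `(batch, seq_len, dim)` tensors; consult the minGRU-pytorch README for the exact API.

```py
import torch
from minGRU_pytorch import minGRU  # assumed import path; see the library's README

min_gru = minGRU(512)         # assumed constructor: feature dimension only
x = torch.randn(1, 128, 512)  # (batch, seq_len, dim)
out = min_gru(x)              # expected to keep the same shape
print(out.shape)
```
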
For the modeling and configuration code, see [**minGRU-hf**](https://github.com/suayptalha/minGRU-hf/tree/main).

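Because the modeling and configuration code lives outside `transformers`, loading the checkpoint from the Hub would typically go through `trust_remote_code`. A hedged sketch, assuming this repository registers its architecture that way (the repo id below is a placeholder; replace it with this model's actual id):

```py
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "suayptalha/minGRU-model"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)
```
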
# Training:

Training code:

```py
import torch
from torch.utils.data import DataLoader
from tqdm import tqdm
from transformers import get_scheduler


def train_model(model, tokenizer, train_data, output_dir, epochs=3, batch_size=16, learning_rate=5e-5, block_size=128):
    # TinyStoriesDataset is assumed to be defined elsewhere; it should turn the
    # raw stories into fixed-length blocks of `block_size` token ids.
    train_dataset = TinyStoriesDataset(train_data, tokenizer, block_size)
    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

    optimizer = torch.optim.AdamW(model.parameters(), lr=learning_rate)
    scheduler = get_scheduler("linear", optimizer=optimizer, num_warmup_steps=0, num_training_steps=len(train_loader) * epochs)

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)

    model.train()
    for epoch in range(epochs):
        print(f"Epoch {epoch + 1}/{epochs}")
        epoch_loss = 0
        progress_bar = tqdm(train_loader, desc="Training")
        for batch in progress_bar:
            batch = batch.to(device)

            # Causal language modeling: the inputs double as the labels.
            outputs = model(batch, labels=batch)
            loss = outputs.loss

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            scheduler.step()

            epoch_loss += loss.item()
            progress_bar.set_postfix(loss=loss.item())

        print(f"Epoch {epoch + 1} Loss: {epoch_loss / len(train_loader)}")

    model.save_pretrained(output_dir)
    tokenizer.save_pretrained(output_dir)
```

You can use this code snippet for fine-tuning!

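As a sketch of how the pieces might be wired together, assuming `TinyStoriesDataset` is defined as above, the GPT-2 tokenizer is used, and the model is loaded as described earlier (the repo id is again a placeholder):

```py
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

# Placeholder repo id; the model could also be instantiated directly from the
# modeling code in the minGRU-hf repository.
model = AutoModelForCausalLM.from_pretrained("suayptalha/minGRU-model", trust_remote_code=True)

# A small slice of TinyStories for a quick fine-tuning run.
train_data = load_dataset("roneneldan/TinyStories", split="train[:1%]")["text"]

train_model(model, tokenizer, train_data, output_dir="./minGRU-finetuned",
            epochs=1, batch_size=8, block_size=128)
```
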
# Credits:

https://arxiv.org/abs/2410.01201

I am thankful to Leo Feng, Frederick Tung, Mohamed Osama Ahmed, Yoshua Bengio, and Hossein Hajimirsadeghi for their paper.