suayptalha/minGRULM-base

Text Generation

suayptalha committed 28 days ago

Commit eb4e607 · verified · 1 Parent(s): b817b15

Update README.md

Files changed (1)

README.md +65 -1


@@ -1,4 +1,68 @@
 ---
 license: apache-2.0
 library_name: transformers
----

+datasets:
+- roneneldan/TinyStories
+---
+# minGRULM-base
+![minGRU](minGRU.jpg)
+First Hugging Face integration of minGRU models from the paper "[**Were RNNs All We Needed?**](https://arxiv.org/abs/2410.01201)".
+This model uses the GPT-2 tokenizer and was trained on the roneneldan/TinyStories dataset.
+**Note: This is an experimental model. Don't forget to train the model before use!**
+Make sure you install the [**minGRU-pytorch**](https://github.com/lucidrains/minGRU-pytorch) library by running `pip install minGRU-pytorch`.
+For the modeling and configuration code, see [**minGRU-hf**](https://github.com/suayptalha/minGRU-hf/tree/main).
+# Training:
+Training code:
+```py
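+import torch
+from torch.utils.data import DataLoader
+from tqdm import tqdm
+from transformers import get_scheduler
+
+# Note: TinyStoriesDataset is not defined in this README; it must be provided separately.
+def train_model(model, tokenizer, train_data, output_dir, epochs=3, batch_size=16, learning_rate=5e-5, block_size=128):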
+    train_dataset = TinyStoriesDataset(train_data, tokenizer, block_size)
+    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
+    optimizer = torch.optim.AdamW(model.parameters(), lr=learning_rate)
+    scheduler = get_scheduler("linear", optimizer=optimizer, num_warmup_steps=0, num_training_steps=len(train_loader) * epochs)
+    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+    model.to(device)
+    model.train()
+    for epoch in range(epochs):
+        print(f"Epoch {epoch + 1}/{epochs}")
+        epoch_loss = 0
+        progress_bar = tqdm(train_loader, desc="Training")
+        for batch in progress_bar:
+            batch = batch.to(device)
+            outputs = model(batch, labels=batch)
+            loss = outputs.loss
+            optimizer.zero_grad()
+            loss.backward()
+            optimizer.step()
+            scheduler.step()
+            epoch_loss += loss.item()
+            progress_bar.set_postfix(loss=loss.item())
+        print(f"Epoch {epoch + 1} Loss: {epoch_loss / len(train_loader)}")
+    model.save_pretrained(output_dir)
+    tokenizer.save_pretrained(output_dir)
+```
+You can use this code snippet for fine-tuning!
+# Credits:
+https://arxiv.org/abs/2410.01201
+I am thankful to Leo Feng, Frederick Tung, Mohamed Osama Ahmed, Yoshua Bengio and Hossein Hajimirsadeghi for their paper.
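
The training function in this README constructs a `TinyStoriesDataset(train_data, tokenizer, block_size)`, but the class itself is not shown. Below is a minimal sketch of what such a dataset might look like. It is not the author's implementation. It assumes `train_data` is an iterable of story strings and that the tokenizer exposes `encode` and `eos_token_id` (as the GPT-2 tokenizer does). The `blocks` attribute and the choice to drop the trailing partial block are illustrative assumptions.

```python
class TinyStoriesDataset:
    """Hypothetical map-style dataset yielding fixed-length blocks of token ids.

    Assumption: `texts` is an iterable of strings and `tokenizer` provides
    `encode(text) -> list[int]` and an integer `eos_token_id`.
    """

    def __init__(self, texts, tokenizer, block_size=128):
        ids = []
        for text in texts:
            ids.extend(tokenizer.encode(text))
            # Separate stories with the end-of-sequence token.
            ids.append(tokenizer.eos_token_id)
        # Keep only full blocks; the trailing partial block is dropped.
        n_blocks = len(ids) // block_size
        self.blocks = [
            ids[i * block_size:(i + 1) * block_size] for i in range(n_blocks)
        ]

    def __len__(self):
        return len(self.blocks)

    def __getitem__(self, idx):
        # Imported lazily so the chunking logic can be inspected without PyTorch.
        import torch
        return torch.tensor(self.blocks[idx], dtype=torch.long)
```

Because every item is a tensor of the same length, a `DataLoader` can stack items into a `(batch_size, block_size)` batch, which matches the `model(batch, labels=batch)` call in the training loop.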