---
license: apache-2.0
library_name: transformers
datasets:
- roneneldan/TinyStories
language:
- en
tags:
- custom_code
- minGRU
- hf_integration
---
|
|
|
# MinGRU Sentiment Analysis |
|
|
|
![minGRU](minGRU.jpg) |
|
|
|
First Hugging Face integration of minGRU models from the paper "[**Were RNNs All We Needed?**](https://arxiv.org/abs/2410.01201)". |
|
|
|
This model uses the GPT-2 tokenizer and was trained on the roneneldan/TinyStories dataset.
|
|
|
**Note: This is an experimental model. Don't forget to train the model before use!**
|
|
|
Make sure you have installed the "[**minGRU-pytorch**](https://github.com/lucidrains/minGRU-pytorch)" library by running `pip install minGRU-pytorch`.
|
|
|
For the modeling and configuration code, see [**minGRU-hf**](https://github.com/suayptalha/minGRU-hf/tree/main).
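
Because this model ships custom modeling code (note the `custom_code` tag), it must be loaded with `trust_remote_code=True`. A minimal loading sketch; the repo id below is a placeholder for this repository's actual id, and it assumes the custom code registers with `AutoModelForCausalLM`:

```py
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "suayptalha/minGRU-Sentiment-Analysis"  # placeholder: substitute this repository's actual id

tokenizer = AutoTokenizer.from_pretrained(repo_id)  # GPT-2 tokenizer, per the note above
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)
```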
|
|
|
# Training: |
|
|
|
Training code: |
|
|
|
```py
import torch
from torch.utils.data import DataLoader
from tqdm import tqdm
from transformers import get_scheduler

def train_model(model, tokenizer, train_data, output_dir, epochs=3, batch_size=16, learning_rate=5e-5, block_size=128):
    train_dataset = TinyStoriesDataset(train_data, tokenizer, block_size)
    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

    optimizer = torch.optim.AdamW(model.parameters(), lr=learning_rate)
    scheduler = get_scheduler(
        "linear", optimizer=optimizer, num_warmup_steps=0,
        num_training_steps=len(train_loader) * epochs,
    )

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)

    model.train()
    for epoch in range(epochs):
        print(f"Epoch {epoch + 1}/{epochs}")
        epoch_loss = 0
        progress_bar = tqdm(train_loader, desc="Training")
        for batch in progress_bar:
            batch = batch.to(device)

            # Causal LM objective: the inputs double as the labels
            outputs = model(batch, labels=batch)
            loss = outputs.loss

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            scheduler.step()

            epoch_loss += loss.item()
            progress_bar.set_postfix(loss=loss.item())

        print(f"Epoch {epoch + 1} Loss: {epoch_loss / len(train_loader)}")

    model.save_pretrained(output_dir, safe_serialization=False)
    tokenizer.save_pretrained(output_dir)
```
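
The snippet above assumes a `TinyStoriesDataset` class that is not shown here. Below is a minimal sketch of such a dataset, assuming each row of roneneldan/TinyStories exposes a `text` field, and padding short stories with the EOS token (the GPT-2 tokenizer has no pad token):

```py
import torch
from torch.utils.data import Dataset

class TinyStoriesDataset(Dataset):
    """Tokenizes each story into a fixed-length block of token ids."""

    def __init__(self, data, tokenizer, block_size=128):
        self.examples = []
        for story in data:
            ids = tokenizer(story["text"], truncation=True, max_length=block_size)["input_ids"]
            # Pad short stories with EOS up to block_size
            ids += [tokenizer.eos_token_id] * (block_size - len(ids))
            self.examples.append(torch.tensor(ids, dtype=torch.long))

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, idx):
        return self.examples[idx]
```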
|
|
|
You can use these code snippets for fine-tuning!
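
After training, you can run a quick sanity check with greedy decoding. A short sketch, assuming the custom model implements the standard `generate` API:

```py
model.eval()

prompt = "Once upon a time"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)

with torch.no_grad():
    output_ids = model.generate(input_ids, max_new_tokens=50)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```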
|
|
|
# Credits: |
|
|
|
https://arxiv.org/abs/2410.01201 |
|
|
|
I am thankful to Leo Feng, Frederick Tung, Mohamed Osama Ahmed, Yoshua Bengio and Hossein Hajimirsadeghi for their paper.