---
license: apache-2.0
library_name: transformers
datasets:
- roneneldan/TinyStories
language:
- en
tags:
- custom_code
- minGRU
- hf_integration
---

# MinGRU Sentiment Analysis

![minGRU](minGRU.jpg)

First Hugging Face integration of minGRU models from the paper "[**Were RNNs All We Needed?**](https://arxiv.org/abs/2410.01201)".

This model uses the GPT-2 tokenizer and was trained on the roneneldan/TinyStories dataset.

**Note: This is an experimental model. Don't forget to train the model before use!**

Make sure the "[**minGRU-pytorch**](https://github.com/lucidrains/minGRU-pytorch)" library is installed by running `pip install minGRU-pytorch`.

For the modeling and configuration code, see [**minGRU-hf**](https://github.com/suayptalha/minGRU-hf/tree/main).

# Training:

Training code:

```py
import torch
from torch.utils.data import DataLoader
from tqdm import tqdm
from transformers import get_scheduler

# Note: TinyStoriesDataset is not defined in this snippet; it must be provided separately.

def train_model(model, tokenizer, train_data, output_dir, epochs=3, batch_size=16, learning_rate=5e-5, block_size=128):
    train_dataset = TinyStoriesDataset(train_data, tokenizer, block_size)
    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

    optimizer = torch.optim.AdamW(model.parameters(), lr=learning_rate)
    # Linear decay over all training steps, with no warmup.
    scheduler = get_scheduler("linear", optimizer=optimizer, num_warmup_steps=0, num_training_steps=len(train_loader) * epochs)

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)

    model.train()
    for epoch in range(epochs):
        print(f"Epoch {epoch + 1}/{epochs}")
        epoch_loss = 0
        progress_bar = tqdm(train_loader, desc="Training")
        for batch in progress_bar:
            batch = batch.to(device)

            # The input tokens are also passed as the labels.
            outputs = model(batch, labels=batch)
            loss = outputs.loss

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            scheduler.step()

            epoch_loss += loss.item()
            progress_bar.set_postfix(loss=loss.item())

        print(f"Epoch {epoch + 1} Loss: {epoch_loss / len(train_loader)}")

    # Save the trained weights and the tokenizer once training has finished.
    model.save_pretrained(output_dir, safe_serialization = False)
    tokenizer.save_pretrained(output_dir)
```

You can use this code snippet for fine-tuning!
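The training code builds a `TinyStoriesDataset(train_data, tokenizer, block_size)` that is not shown in this card. As a rough illustration of what the `block_size` argument could mean, here is a minimal sketch of a concatenate-and-chunk step. The helper `make_blocks` and the toy encoder `toy_encode` are hypothetical names introduced only for this example, and the actual dataset class may prepare its data differently.

```python
from typing import Callable, Iterable, List


def make_blocks(
    texts: Iterable[str],
    encode: Callable[[str], List[int]],
    block_size: int = 128,
) -> List[List[int]]:
    """Concatenate the token ids of all texts and cut them into equal blocks.

    `encode` is any function that maps a string to a list of token ids
    (an assumption of this sketch). The trailing remainder that does not
    fill a whole block is dropped.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    ids: List[int] = []
    for text in texts:
        ids.extend(encode(text))
    n_blocks = len(ids) // block_size
    return [ids[i * block_size:(i + 1) * block_size] for i in range(n_blocks)]


# Toy encoder for illustration only: one id per character.
def toy_encode(text: str) -> List[int]:
    return [ord(ch) for ch in text]


# Ten characters in total, so two full blocks of four ids; the last two ids are dropped.
blocks = make_blocks(["abcd", "efghij"], toy_encode, block_size=4)
```

In a real run, `encode` could be the GPT-2 tokenizer's `encode` method, and a dataset wrapper would presumably return each block as a `torch.long` tensor so that `batch.to(device)` and `model(batch, labels=batch)` in the training loop receive a tensor of shape `(batch_size, block_size)`.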
# Credits:

https://arxiv.org/abs/2410.01201

I am thankful to Leo Feng, Frederick Tung, Mohamed Osama Ahmed, Yoshua Bengio and Hossein Hajimirsadeghi for their paper.