---
base_model: openai-community/gpt2-large
library_name: peft
---
# Model Card for Model ID

### Summary

This repository contains a PEFT LoRA adapter for text summarization, fine-tuned from GPT-2 (large). It was fine-tuned on the filtered version of the TL;DR training set, which is available from OpenAI's summarize-from-feedback repository: [https://github.com/openai/summarize-from-feedback](https://github.com/openai/summarize-from-feedback).
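Because only the adapter weights are published, inference needs the base model loaded first and the adapter attached on top. The following is a minimal sketch of that flow, not a documented usage recipe: `ADAPTER_ID` is a hypothetical placeholder for this repository's ID, and the `TL;DR:` prompt layout in `build_prompt` is an assumption, since this card does not state the template used during training.

```python
BASE_MODEL = "openai-community/gpt2-large"  # from this card's metadata
ADAPTER_ID = "your-namespace/your-adapter-repo"  # hypothetical placeholder


def build_prompt(post: str) -> str:
    """Assumed prompt layout; the training template is not documented here."""
    return f"{post.strip()}\n\nTL;DR:"


def summarize(post: str, adapter_id: str = ADAPTER_ID, max_new_tokens: int = 48) -> str:
    """Load the base model, attach the LoRA adapter, and generate a summary."""
    # Heavy imports are kept local so the helper above works without them.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    base = AutoModelForCausalLM.from_pretrained(BASE_MODEL)
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()

    inputs = tokenizer(build_prompt(post), return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            pad_token_id=tokenizer.eos_token_id,  # GPT-2 defines no pad token
        )
    # Keep only the tokens generated after the prompt.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

Calling `summarize("...")` downloads the base model on first use, so expect several gigabytes of traffic and memory.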

### Model Description

- **Developed by:** Course Organizers
- **Finetuned from model:** openai-community/gpt2-large

### Training Details

This model was trained with the `SFTTrainer` class from Hugging Face's TRL library.
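For orientation, the wiring of such a run might look roughly like the sketch below. It is not the training script used for this model: the function name, `output_dir`, and the `"text"` dataset column are illustrative assumptions, and the keyword names follow the TRL version pinned in this card (newer TRL releases may name some arguments differently).

```python
BASE_MODEL = "openai-community/gpt2-large"  # from this card's metadata
TEXT_FIELD = "text"  # assumed name of the dataset column holding the full example


def build_trainer(train_dataset, peft_config, output_dir: str = "sft-output"):
    """Return an SFTTrainer that fine-tunes the base model with a LoRA adapter."""
    # Imports are local so the module can be inspected without TRL installed.
    from transformers import AutoTokenizer
    from trl import SFTConfig, SFTTrainer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    tokenizer.pad_token = tokenizer.eos_token  # GPT-2 defines no pad token

    args = SFTConfig(output_dir=output_dir, dataset_text_field=TEXT_FIELD)
    return SFTTrainer(
        model=BASE_MODEL,          # a model ID string is loaded by the trainer
        args=args,
        train_dataset=train_dataset,
        tokenizer=tokenizer,
        peft_config=peft_config,   # e.g. a peft.LoraConfig instance
    )
```

Calling `build_trainer(dataset, lora_config).train()` would start the run; the hyperparameters actually used are the ones listed in this card, not the defaults shown here.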

### Training Data

The model was trained on the filtered version of the TL;DR training set, which can be downloaded from: [https://openaipublic.blob.core.windows.net/summarize-from-feedback/datasets/tldr_3_filtered/train.jsonl](https://openaipublic.blob.core.windows.net/summarize-from-feedback/datasets/tldr_3_filtered/train.jsonl).
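The file is in JSON Lines format, one JSON object per line. A minimal reading sketch follows; the field names `post` and `summary` are assumptions about the record layout and should be checked against the downloaded file, and the inline sample record is made up for illustration.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def to_pair(record: dict, text_key: str = "post", summary_key: str = "summary") -> Tuple[str, str]:
    """Extract a (text, summary) pair; the key names are assumed, not confirmed."""
    return record[text_key].strip(), record[summary_key].strip()


# Illustrative in-memory sample standing in for the real file.
sample = ['{"post": "Long story here. ", "summary": " Short version."}', ""]
pairs = [to_pair(r) for r in iter_records(sample)]
```

With a local copy of the file, `iter_records(open("train.jsonl", encoding="utf-8"))` streams the records without loading everything into memory.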

#### Training Hyperparameters

The following hyperparameters were used during training:

 - learning_rate: 1e-05
 - train_batch_size: 8
 - eval_batch_size: 8
 - seed: 2024
 - distributed_type: multi-GPU
 - num_devices: 8
 - gradient_accumulation_steps: 1
 - total_train_batch_size: 64
 - total_eval_batch_size: 64
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - num_epochs: 1
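The total batch sizes above follow from the per-device size, the device count, and the accumulation steps. The short sketch below reproduces that arithmetic; the helper names and the example count passed to `steps_per_epoch` are illustrative, since this card does not report the dataset size.

```python
import math


def effective_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Examples consumed per optimizer step across all devices."""
    return per_device * num_devices * grad_accum_steps


def steps_per_epoch(num_examples: int, total_batch_size: int) -> int:
    """Optimizer steps needed to see every example once (last batch may be partial)."""
    return math.ceil(num_examples / total_batch_size)


total_train = effective_batch_size(per_device=8, num_devices=8, grad_accum_steps=1)
total_eval = effective_batch_size(per_device=8, num_devices=8)  # no accumulation at eval

print(total_train, total_eval)
```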

LoRA parameters:

 - lora_r: 256
 - lora_alpha: 64
 - lora_dropout: 0.1
 - lora_target_modules: None (when not specified, the target modules are chosen according to the model architecture)
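These values can be expressed as a PEFT `LoraConfig`, sketched below. The `task_type="CAUSAL_LM"` argument is an assumption (it is not listed in this card), and the scaling helper assumes the standard LoRA convention in which the low-rank update is multiplied by `lora_alpha / r`.

```python
LORA_KWARGS = {
    "r": 256,
    "lora_alpha": 64,
    "lora_dropout": 0.1,
    "target_modules": None,  # let PEFT pick defaults for the architecture
}


def lora_scaling(r: int, lora_alpha: float) -> float:
    """Factor applied to the low-rank update under standard LoRA scaling."""
    return lora_alpha / r


def build_lora_config():
    """Build the config object; peft is imported lazily."""
    from peft import LoraConfig

    # task_type is assumed, since the card does not list it.
    return LoraConfig(task_type="CAUSAL_LM", **LORA_KWARGS)


scaling = lora_scaling(LORA_KWARGS["r"], LORA_KWARGS["lora_alpha"])
```

With a rank larger than alpha, the scaling factor is below one, so the adapter update is damped relative to its raw magnitude.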

### Framework Versions

 - accelerate==0.26.1
 - datasets==2.16.1
 - transformers==4.45.2
 - trl==0.11.2
 - peft==0.8.2
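To check whether a local environment matches these pins, a small standard-library sketch such as the following can help. The `check_versions` helper and its `get_version` parameter are illustrative names, not part of any of the listed libraries.

```python
from importlib.metadata import PackageNotFoundError, version

PINNED = {
    "accelerate": "0.26.1",
    "datasets": "2.16.1",
    "transformers": "4.45.2",
    "trl": "0.11.2",
    "peft": "0.8.2",
}


def check_versions(pinned=PINNED, get_version=version):
    """Return {package: (wanted, found)} for every package that does not match."""
    mismatches = {}
    for name, wanted in pinned.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            found = None  # package is not installed
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in check_versions().items():
        print(f"{name}: expected {wanted}, found {found}")
```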

### Compute Infrastructure and Hardware

Slurm cluster with 8 x NVIDIA H100 GPUs.