# gqa

This model is a fine-tuned version of [unc-nlp/lxmert-base-uncased](https://huggingface.co/unc-nlp/lxmert-base-uncased) on the [Graphcore/gqa-lxmert](https://huggingface.co/datasets/Graphcore/gqa-lxmert) dataset.
It achieves the following results on the evaluation set:
- Loss: 1.9326
- Accuracy: 0.5934
## Training and evaluation data

[Graphcore/gqa-lxmert](https://huggingface.co/datasets/Graphcore/gqa-lxmert) dataset

## Training procedure
Trained on 16 Graphcore Mk2 IPUs using [optimum-graphcore](https://github.com/huggingface/optimum-graphcore).

Command line:

```
python examples/question-answering/run_vqa.py \
  --model_name_or_path unc-nlp/lxmert-base-uncased \
  --ipu_config_name Graphcore/lxmert-base-ipu \
  --dataset_name Graphcore/gqa-lxmert \
  --do_train \
  --do_eval \
  --max_seq_length 512 \
  --per_device_train_batch_size 1 \
  --num_train_epochs 4 \
  --dataloader_num_workers 64 \
  --logging_steps 5 \
  --learning_rate 1e-5 \
  --lr_scheduler_type linear \
  --loss_scaling 16384 \
  --weight_decay 0.01 \
  --warmup_ratio 0.1 \
  --output_dir /tmp/gqa/ \
  --dataloader_drop_last \
  --replace_qa_head \
  --pod_type pod16
```
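As a rough illustration of the `--lr_scheduler_type linear` / `--warmup_ratio 0.1` combination above: the learning rate ramps linearly from 0 to 1e-5 over the first 10% of steps, then decays linearly back to 0. A standalone sketch (not taken from `optimum-graphcore`; the 1000-step total is a hypothetical value):

```python
def linear_schedule_with_warmup(step, total_steps, base_lr=1e-5, warmup_ratio=0.1):
    """Linear warmup followed by linear decay, as selected by
    --lr_scheduler_type linear with --warmup_ratio 0.1."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr back to 0 over the remaining steps.
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

# Learning rate over a hypothetical 1000-step run; peaks at step 100.
lrs = [linear_schedule_with_warmup(s, 1000) for s in range(1001)]
```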
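The `--loss_scaling 16384` flag multiplies the loss before the backward pass so that small half-precision gradients do not underflow, then the gradients are divided back out before the optimizer step. A minimal generic sketch of static loss scaling in plain PyTorch (an illustration only, not the `optimum-graphcore` implementation):

```python
import torch

scale = 16384.0  # matches --loss_scaling 16384
w = torch.tensor([2.0], requires_grad=True)

loss = (w * 3.0).sum()      # d(loss)/dw = 3
(loss * scale).backward()   # gradients are computed at the scaled magnitude
w.grad.div_(scale)          # unscale in place before stepping the optimizer
```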
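At inference time the checkpoint is used through the standard `transformers` LXMERT classes, which expect token ids plus region-level visual features. The sketch below uses a small randomly initialised config purely to show the expected input shapes; with the real checkpoint you would call `LxmertForQuestionAnswering.from_pretrained(...)` instead, and the answer-head size here (10 labels) is a placeholder, not the GQA label count:

```python
import torch
from transformers import LxmertConfig, LxmertForQuestionAnswering

# Tiny randomly initialised model, just to illustrate the input format.
config = LxmertConfig(
    hidden_size=64,
    num_attention_heads=4,
    l_layers=2,
    x_layers=1,
    r_layers=1,
    num_qa_labels=10,  # placeholder; the real head size depends on the GQA label set
)
model = LxmertForQuestionAnswering(config).eval()

input_ids = torch.randint(0, config.vocab_size, (1, 8))    # tokenized question
visual_feats = torch.randn(1, 36, config.visual_feat_dim)  # region features (e.g. Faster R-CNN)
visual_pos = torch.rand(1, 36, 4)                          # normalised box coordinates

with torch.no_grad():
    out = model(input_ids=input_ids, visual_feats=visual_feats, visual_pos=visual_pos)
# out.question_answering_score has shape (batch_size, num_qa_labels)
```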
### Training hyperparameters

The following hyperparameters were used during training: