# gqa

This model is a fine-tuned version of [unc-nlp/lxmert-base-uncased](https://huggingface.co/unc-nlp/lxmert-base-uncased) on the [Graphcore/gqa-lxmert](https://huggingface.co/datasets/Graphcore/gqa-lxmert) dataset.
It achieves the following results on the evaluation set:
- Loss: 1.9326
- Accuracy: 0.5934
## Training and evaluation data

[Graphcore/gqa-lxmert](https://huggingface.co/datasets/Graphcore/gqa-lxmert) dataset

## Training procedure
Trained on 16 Graphcore Mk2 IPUs using [optimum-graphcore](https://github.com/huggingface/optimum-graphcore).

Command line:

```
python examples/question-answering/run_vqa.py \
  --model_name_or_path unc-nlp/lxmert-base-uncased \
  --ipu_config_name Graphcore/lxmert-base-ipu \
  --dataset_name Graphcore/gqa-lxmert \
  --do_train \
  --do_eval \
  --max_seq_length 512 \
  --per_device_train_batch_size 1 \
  --num_train_epochs 4 \
  --dataloader_num_workers 64 \
  --logging_steps 5 \
  --learning_rate 1e-5 \
  --lr_scheduler_type linear \
  --loss_scaling 16384 \
  --weight_decay 0.01 \
  --warmup_ratio 0.1 \
  --output_dir /tmp/gqa/ \
  --dataloader_drop_last \
  --replace_qa_head \
  --pod_type pod16
```
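As a rough illustration of the `--lr_scheduler_type linear` / `--warmup_ratio 0.1` combination above: the learning rate ramps linearly from 0 to 1e-5 over the first 10% of steps, then decays linearly back to 0. A standalone sketch (not taken from `optimum-graphcore`; the 1000-step total is a hypothetical value):

```python
def linear_schedule_with_warmup(step, total_steps, base_lr=1e-5, warmup_ratio=0.1):
    """Linear warmup followed by linear decay, as selected by
    --lr_scheduler_type linear with --warmup_ratio 0.1."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr back to 0 over the remaining steps.
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

# Learning rate over a hypothetical 1000-step run; peaks at step 100.
lrs = [linear_schedule_with_warmup(s, 1000) for s in range(1001)]
```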
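The `--loss_scaling 16384` flag multiplies the loss before the backward pass so that small half-precision gradients do not underflow, then the gradients are divided back out before the optimizer step. A minimal generic sketch of static loss scaling in plain PyTorch (an illustration only, not the `optimum-graphcore` implementation):

```python
import torch

scale = 16384.0  # matches --loss_scaling 16384
w = torch.tensor([2.0], requires_grad=True)

loss = (w * 3.0).sum()      # d(loss)/dw = 3
(loss * scale).backward()   # gradients are computed at the scaled magnitude
w.grad.div_(scale)          # unscale in place before stepping the optimizer
```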
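At inference time the checkpoint is used through the standard `transformers` LXMERT classes, which expect token ids plus region-level visual features. The sketch below uses a small randomly initialised config purely to show the expected input shapes; with the real checkpoint you would call `LxmertForQuestionAnswering.from_pretrained(...)` instead, and the answer-head size here (10 labels) is a placeholder, not the GQA label count:

```python
import torch
from transformers import LxmertConfig, LxmertForQuestionAnswering

# Tiny randomly initialised model, just to illustrate the input format.
config = LxmertConfig(
    hidden_size=64,
    num_attention_heads=4,
    l_layers=2,
    x_layers=1,
    r_layers=1,
    num_qa_labels=10,  # placeholder; the real head size depends on the GQA label set
)
model = LxmertForQuestionAnswering(config).eval()

input_ids = torch.randint(0, config.vocab_size, (1, 8))    # tokenized question
visual_feats = torch.randn(1, 36, config.visual_feat_dim)  # region features (e.g. Faster R-CNN)
visual_pos = torch.rand(1, 36, 4)                          # normalised box coordinates

with torch.no_grad():
    out = model(input_ids=input_ids, visual_feats=visual_feats, visual_pos=visual_pos)
# out.question_answering_score has shape (batch_size, num_qa_labels)
```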
### Training hyperparameters

The following hyperparameters were used during training: