zouharvi committed
Commit c28304b · 1 Parent(s): ef1dabe

add hparams

Files changed (2)
  1. README.md +4 -2
  2. hparams.yaml +28 -0
README.md CHANGED

```diff
@@ -100,9 +100,11 @@ base_model:
 - FacebookAI/xlm-roberta-large
 ---
 
+# PreCOMET-diversity
+
 This is a source-only COMET model used for efficient evaluation subset selection.
-It is not compatible with the upstream [github.com/Unbabel/COMET/](https://github.com/Unbabel/COMET/) and to run it you have to install [github.com/zouharvi/comet-src](https://github.com/zouharvi/comet-src).
+It is not compatible with the upstream [github.com/Unbabel/COMET/](https://github.com/Unbabel/COMET/) and to run it you have to install [github.com/zouharvi/PreCOMET](https://github.com/zouharvi/PreCOMET).
 
 The primary use of this model is from the [subset2evaluate](https://github.com/zouharvi/subset2evaluate) package.
 
-Further instructions and notes TODO.
+Further description TODO.
```
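The README above says the checkpoint only loads through the PreCOMET fork, not upstream COMET. As a minimal sketch of what loading it might look like, assuming the fork keeps upstream COMET's `download_model`/`load_from_checkpoint`/`predict` API and that the checkpoint is published on the Hub under the name `zouharvi/PreCOMET-diversity` (the model name is inferred from the new README heading, not stated in this commit):

```python
# Hedged sketch: assumes the PreCOMET fork keeps the upstream COMET API
# (download_model / load_from_checkpoint / predict) and that the checkpoint
# is published as "zouharvi/PreCOMET-diversity" (name inferred from the
# README heading in this commit).
from comet import download_model, load_from_checkpoint

model_path = download_model("zouharvi/PreCOMET-diversity")
model = load_from_checkpoint(model_path)

# Source-only model: each item needs only "src", no "mt" or "ref".
data = [{"src": "This is a source sentence to score."}]
scores = model.predict(data, batch_size=16, gpus=0)
print(scores)
```

Requires `pip install git+https://github.com/zouharvi/PreCOMET` first, per the README note that the upstream package is incompatible.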
hparams.yaml ADDED

```diff
@@ -0,0 +1,28 @@
+nr_frozen_epochs: 0.3
+keep_embeddings_frozen: true
+optimizer: AdamW
+warmup_steps: 0
+encoder_learning_rate: 1.0e-06
+learning_rate: 1.5e-05
+layerwise_decay: 0.95
+encoder_model: XLM-RoBERTa
+pretrained_model: xlm-roberta-large
+pool: avg
+layer: mix
+layer_transformation: sparsemax
+layer_norm: false
+loss: mse
+dropout: 0.1
+batch_size: 16
+train_data:
+- data/csv/train_div.csv
+validation_data:
+- data/csv/dev_div.csv
+class_identifier: hypothesisless_regression_metric
+load_pretrained_weights: true
+local_files_only: false
+hidden_sizes:
+- 2048
+- 1024
+activations: Tanh
+final_activation: null
```
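The `layerwise_decay: 0.95` entry configures COMET-style layer-wise learning-rate decay: layers nearer the top of the encoder train at rates closer to `encoder_learning_rate`, and each layer further down is scaled by another factor of 0.95. A small stdlib sketch of that schedule under stated assumptions (the exact top-down scheme is an assumption about COMET's internals, and the 24-layer count is the standard depth of `xlm-roberta-large`):

```python
# Sketch of layer-wise learning-rate decay as configured above:
# the top encoder layer trains at encoder_learning_rate, and every
# layer below it is scaled by a further factor of layerwise_decay.
# (Assumed top-down scheme; xlm-roberta-large has 24 transformer layers.)

def layerwise_lrs(base_lr: float, decay: float, num_layers: int) -> list[float]:
    """Per-layer learning rates, ordered from bottom layer to top layer."""
    return [base_lr * decay ** (num_layers - 1 - i) for i in range(num_layers)]

lrs = layerwise_lrs(base_lr=1.0e-06, decay=0.95, num_layers=24)
# The top layer keeps the full encoder learning rate; the bottom layer
# is scaled by 0.95**23, i.e. trains roughly 3x slower.
assert lrs[-1] == 1.0e-06
```

This is why `encoder_learning_rate` (1e-6) is far below the head's `learning_rate` (1.5e-5): the pretrained encoder is fine-tuned gently while the new regression head trains faster.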