h1

This model is a fine-tuned version of distilgpt2 on the hearthstone dataset. Code is available in the GitHub repo. It achieves the following results on the evaluation set:

  • Loss: 0.0890
  • Exact Match: 0.1970
  • Bleu: 0.9737
  • Codebleu: 0.9172
  • Ngram Match Score: 0.8984
  • Weighted Ngram Match Score: 0.8985
  • Syntax Match Score: 0.9293
  • Dataflow Match Score: 0.9429
  • Chrf: 97.5313
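
The CodeBLEU value above appears to be an equally weighted combination of the four component scores listed with it. The sketch below checks that arithmetic. The 0.25 weights are an assumption (they are the commonly used CodeBLEU default and are not stated in this card), and the variable names are illustrative only.

```python
# Component scores copied from the evaluation results listed above.
components = {
    "ngram_match": 0.8984,
    "weighted_ngram_match": 0.8985,
    "syntax_match": 0.9293,
    "dataflow_match": 0.9429,
}

# Assumption: each component contributes with weight 0.25.
weights = {name: 0.25 for name in components}

# Weighted sum of the four components.
codebleu = sum(weights[name] * score for name, score in components.items())
print(f"{codebleu:.6f}")
```

Under that assumption the result agrees with the reported Codebleu of 0.9172 to within rounding of the four inputs.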

Model description

DistilGPT2 fine-tuned on the HearthStone dataset, with the Python code targets preprocessed into dumped ASTs. Example:

#gold labels
Module([ClassDef('Innervate', [Name('SpellCard', Load())], [], [FunctionDef('__init__', arguments([], [arg('self', None, None)], None, [], [], None, []), [Expr(Call(Attribute(Call(Name('super', Load()), [], []), '__init__', Load()), [Constant('Innervate', None), Constant(0, None), Attribute(Name('CHARACTER_CLASS', Load()), 'DRUID', Load()), Attribute(Name('CARD_RARITY', Load()), 'FREE', Load())], []))], [], None, None), FunctionDef('use', arguments([], [arg('self', None, None), arg('player', None, None), arg('game', None, None)], None, [], [], None, []), [Expr(Call(Attribute(Call(Name('super', Load()), [], []), 'use', Load()), [Name('player', Load()), Name('game', Load())], [])), If(Compare(Attribute(Name('player', Load()),'mana', Load()), [Lt()], [Constant(8, None)]), [AugAssign(Attribute(Name('player', Load()),'mana', Store()), Add(), Constant(2, None))], [Assign([Attribute(Name('player', Load()),'mana', Store())], Constant(10, None), None)])], [], None, None)], [])], [])
#wrong prediction (example of error after training)
Module([ClassDef('Innervate', [Name('SpellCard', Load())], [], [FunctionDef('__init__', arguments([], [arg('self', None, None)], None, [], [], None, []), [Expr(Call(Attribute(Call(Name('super', Load()), [], []), '__init__', Load()), [Constant('Innervate', None), Constant(0, None), Attribute(Name('CHARACTER_CLASS', Load()), 'DRUID', Load()), Attribute(Name('CARD_RARITY', Load()), 'FREE', Load())], []))], [], None, None), FunctionDef('use', arguments([], [arg('self', None, None), arg('player', None, None), arg('game', None, None)], None, [], [], None, []), [Expr(Call(Attribute(Call(Name('super', Load()), [], []), 'use', Load()), [Name('player', Load()), Name('game', Load())], [])), For(Compare(Attribute(Name('player', Load()),'maxa', Load()), [Lt()], [Constant(10, None)]), [AugAssign(Attribute(Name('player', Load()),'mana', Store()), Add(), Constant(2, None))], Exign([Name(Name('player', Load()),'mana', Store())], Constant(None, None), None)],], [], None, None)], [])], [])

Intended uses & limitations

HearthStone card code synthesis.

Training and evaluation data

See the train/validation/test split of the hearthstone dataset.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 17
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • num_epochs: 200
  • mixed_precision_training: Native AMP
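
To illustrate what `lr_scheduler_type: cosine` implies for the learning rate, here is a small pure-Python sketch of a half-cosine decay. It assumes no warmup steps (none are listed above) and a total of about 26,800 optimizer steps, estimated from the training log (step 1600 falls at epoch 11.94, so roughly 134 steps per epoch, times 200 epochs). Both values are assumptions, and `cosine_lr` is a hypothetical helper, not a library function.

```python
import math

BASE_LR = 2e-05       # learning_rate from the list above
TOTAL_STEPS = 26_800  # assumption: ~134 steps per epoch x 200 epochs


def cosine_lr(step: int, base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate after `step` optimizer steps, half-cosine decay, no warmup."""
    # Fraction of training completed, clamped to [0, 1].
    progress = min(max(step / total_steps, 0.0), 1.0)
    # Decays smoothly from base_lr at progress 0 to 0 at progress 1.
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 6_700, 13_400, 20_100, 26_800):
    print(step, f"{cosine_lr(step):.2e}")
```

Under these assumptions the rate starts at the full 2e-05, passes through half that value at the midpoint of training, and approaches zero at the final step, which is consistent with the very small changes in training loss over the last logged epochs.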

Training results

| Training Loss | Epoch | Step | Validation Loss | Exact Match | Bleu | Codebleu | Ngram Match Score | Weighted Ngram Match Score | Syntax Match Score | Dataflow Match Score | Chrf |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.3871 | 11.94 | 1600 | 0.1043 | 0.0152 | 0.9499 | 0.8549 | 0.8089 | 0.8089 | 0.8653 | 0.9366 | 95.4674 |
| 0.0752 | 23.88 | 3200 | 0.0784 | 0.1212 | 0.9640 | 0.8874 | 0.8525 | 0.8526 | 0.8929 | 0.9516 | 96.7978 |
| 0.0448 | 35.82 | 4800 | 0.0717 | 0.1364 | 0.9693 | 0.9077 | 0.8782 | 0.8782 | 0.9069 | 0.9674 | 97.2100 |
| 0.0308 | 47.76 | 6400 | 0.0752 | 0.1364 | 0.9702 | 0.9061 | 0.8808 | 0.8810 | 0.9070 | 0.9554 | 97.1896 |
| 0.0223 | 59.7 | 8000 | 0.0762 | 0.1364 | 0.9724 | 0.9050 | 0.8877 | 0.8881 | 0.9093 | 0.9348 | 97.4616 |
| 0.0166 | 71.64 | 9600 | 0.0762 | 0.1667 | 0.9733 | 0.9140 | 0.8948 | 0.8951 | 0.9197 | 0.9461 | 97.4945 |
| 0.0128 | 83.58 | 11200 | 0.0793 | 0.1515 | 0.9728 | 0.9085 | 0.8911 | 0.8918 | 0.9189 | 0.9321 | 97.4152 |
| 0.0104 | 95.52 | 12800 | 0.0822 | 0.1667 | 0.9732 | 0.9165 | 0.8946 | 0.8950 | 0.9222 | 0.9541 | 97.4887 |
| 0.0084 | 107.46 | 14400 | 0.0832 | 0.1667 | 0.9737 | 0.9167 | 0.8970 | 0.8972 | 0.9254 | 0.9471 | 97.5326 |
| 0.007 | 119.4 | 16000 | 0.0837 | 0.1818 | 0.9743 | 0.9160 | 0.8983 | 0.8986 | 0.9238 | 0.9434 | 97.6638 |
| 0.0058 | 131.34 | 17600 | 0.0858 | 0.1818 | 0.9739 | 0.9200 | 0.8977 | 0.8977 | 0.9267 | 0.9579 | 97.5583 |
| 0.005 | 143.28 | 19200 | 0.0878 | 0.1818 | 0.9743 | 0.9180 | 0.8993 | 0.9001 | 0.9301 | 0.9426 | 97.5819 |
| 0.0044 | 155.22 | 20800 | 0.0877 | 0.1667 | 0.9736 | 0.9156 | 0.8957 | 0.8960 | 0.9278 | 0.9429 | 97.5109 |
| 0.0042 | 167.16 | 22400 | 0.0890 | 0.1970 | 0.9736 | 0.9171 | 0.8984 | 0.8984 | 0.9293 | 0.9424 | 97.5617 |
| 0.0038 | 179.1 | 24000 | 0.0891 | 0.2121 | 0.9738 | 0.9174 | 0.8991 | 0.8991 | 0.9285 | 0.9429 | 97.5452 |
| 0.0037 | 191.04 | 25600 | 0.0890 | 0.1970 | 0.9737 | 0.9172 | 0.8984 | 0.8985 | 0.9293 | 0.9429 | 97.5313 |

Framework versions

  • Transformers 4.24.0
  • Pytorch 1.13.0
  • Datasets 2.6.1
  • Tokenizers 0.13.1
