falcon-7b-sharded-bf16-finetuned-html-code-generation

This model is a fine-tuned version of ybelkada/falcon-7b-sharded-bf16 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7322

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 2
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.03
  • training_steps: 320
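
The training script itself is not included in this card. As a rough sketch, the hyperparameters above could be expressed with transformers.TrainingArguments as follows; output_dir and the bf16 flag are illustrative assumptions, not values taken from this card.

```python
# Minimal sketch, assuming a standard transformers.Trainer-based PEFT fine-tune.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="falcon-7b-sharded-bf16-finetuned-html-code-generation",  # assumed
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,   # gives the total train batch size of 4
    max_steps=320,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    bf16=True,                       # assumed, to match the bf16 base model
)
```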

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|---------------|--------|------|-----------------|
| No log        | 0.1794 | 20   | 1.8071          |
| No log        | 0.3587 | 40   | 1.4823          |
| No log        | 0.5381 | 60   | 1.3637          |
| No log        | 0.7175 | 80   | 1.2700          |
| No log        | 0.8969 | 100  | 1.2054          |
| No log        | 1.0762 | 120  | 1.1352          |
| No log        | 1.2556 | 140  | 1.1297          |
| No log        | 1.4350 | 160  | 1.0126          |
| No log        | 1.6143 | 180  | 0.9738          |
| No log        | 1.7937 | 200  | 0.9058          |
| No log        | 1.9731 | 220  | 0.8581          |
| No log        | 2.1525 | 240  | 0.7948          |
| No log        | 2.3318 | 260  | 0.7601          |
| No log        | 2.5112 | 280  | 0.7397          |
| No log        | 2.6906 | 300  | 0.7332          |
| No log        | 2.8700 | 320  | 0.7322          |

Framework versions

  • PEFT 0.12.1.dev0
  • Transformers 4.43.3
  • Pytorch 2.3.1+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
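
This repository contains a PEFT adapter for ybelkada/falcon-7b-sharded-bf16, so the base model is loaded first and the adapter weights are attached on top. A minimal inference sketch follows; the prompt, generation settings, and device_map choice are illustrative assumptions rather than values from this card.

```python
# Minimal sketch: load the base model, attach this adapter, and generate.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "ybelkada/falcon-7b-sharded-bf16"
adapter_id = "kasperius/falcon-7b-sharded-bf16-finetuned-html-code-generation"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # assumes accelerate is installed and a GPU is available
)
model = PeftModel.from_pretrained(base_model, adapter_id)

prompt = "Write an HTML page with a centered heading that says Hello."  # example prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```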