---
license: other
license_name: tongyi-qianwen
license_link: https://huggingface.co/Qwen/Qwen1.5-7B-Chat/blob/main/LICENSE
language:
  - en
pipeline_tag: text-generation
tags:
  - code
datasets:
  - reciprocate/dpo_ultra-capybara-code_filtered-best
---

# Coder1.8-ORPO-TEST

## Model Description

A test model for the ORPO fine-tuning method, trained on ~20k code examples for 1 epoch on 2 x A40 GPUs with 4-bit QLoRA (LoRA rank = LoRA alpha = 16).
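
The model should load like any other Qwen1.5-based causal LM. Below is a minimal inference sketch; the repo ID is assumed from the card title, and it assumes the tokenizer ships Qwen's chat template:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo ID assumed from the card title; adjust if the actual path differs.
model_id = "raincandy-u/Coder1.8-ORPO-TEST"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Build a chat-formatted prompt and generate a completion.
messages = [
    {"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```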

## Disclaimer

This is a test model and may generate incorrect responses. Use at your own risk.

## Training Details

- Base: Qwen1.5-1.8B
- Training Data: ~20k code examples
- Epochs: 1
- Method: ORPO (see the sketch below)
- Hardware: 2 x A40
- Quantization: 4-bit QLoRA
- LoRA Rank/Alpha: 16
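
For reference, here is a minimal sketch of a comparable setup using TRL's ORPOTrainer (versions around 0.8, where it accepts a tokenizer argument) together with a PEFT LoRA config. Only the values listed above come from this card; the learning rate, batch sizes, sequence lengths, target modules, and dataset split/columns are assumptions, not the author's exact recipe:

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import ORPOConfig, ORPOTrainer

base_id = "Qwen/Qwen1.5-1.8B"

# Load the base model in 4-bit for QLoRA training.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# LoRA rank = alpha = 16 as stated above; target modules are an assumption.
peft_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# Preference dataset; ORPOTrainer expects prompt/chosen/rejected columns.
dataset = load_dataset(
    "reciprocate/dpo_ultra-capybara-code_filtered-best", split="train"
)

orpo_args = ORPOConfig(
    output_dir="coder1.8-orpo-test",
    num_train_epochs=1,
    per_device_train_batch_size=2,   # assumption
    gradient_accumulation_steps=8,   # assumption
    learning_rate=5e-6,              # assumption
    max_length=2048,                 # assumption
    max_prompt_length=1024,          # assumption
    bf16=True,
)

trainer = ORPOTrainer(
    model=model,
    args=orpo_args,
    train_dataset=dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
)
trainer.train()
```

Because ORPO folds the preference objective into the supervised loss, no separate reference model is needed, which keeps memory requirements low enough for QLoRA training on two A40s.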

## Limitations

The limited training data (~20k examples, 1 epoch) and 4-bit quantization may impact output quality.

## Join the Discussion

Have questions or feedback? Join our Discord server here.