Fine-tuning roadmap

#18
opened by RonanMcGovern

Which fine-tuning library is likely to be the first to support DeepSeek-V3?

Transformers never had DeepSeek-V2 integrated.

MoE might take work to support, along with multi-head latent attention (MLA) and multi-token prediction (MTP). Then there's also supporting the FP8 checkpoint as a base model on which to train LoRAs…
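For concreteness, here is a minimal sketch of what LoRA training on top of it could look like once library support lands. Everything here is an assumption: Transformers has no native DeepSeek-V3 class, so this leans on `trust_remote_code`, the FP8 weights are assumed to have been dequantized to bf16 upstream (PEFT does not train directly on FP8), and the MLA projection names in `target_modules` would need to be checked against the actual remote-code implementation.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "deepseek-ai/DeepSeek-V3"  # official repo; shipped weights are FP8

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumes an FP8 -> bf16 conversion was done upstream
    trust_remote_code=True,
    device_map="auto",
)

# Attach LoRA adapters to the attention projections. These module names are
# an assumption based on the DeepSeek MLA design, not a confirmed API.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "kv_a_proj_with_mqa", "kv_b_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

Note this sketch says nothing about the harder parts: routing gradients through the MoE experts (likely needing expert parallelism at this scale) or handling the MTP heads during training.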

Thanks, and thanks for the model.

I have the same question: how do we fine-tune DeepSeek-V3? Could a guide be provided?

+1 for a fine-tuning script
