Base model request
Hi!
Could you please do this for the base model?
Or, if you don't feel it's worth doing yourself, could you say specifically which convert_checkpoint.py you used, so I could (try to) do it? I'm having trouble replicating this.
Thanks!
The team just released a fix for some of the bugs I saw when converting to bf16 and back: (https://github.com/deepseek-ai/DeepSeek-V3/commit/8f1c9488b53068992f9525fab03b1868e6f7c8c1). I also left out the 3 other tp4 rank files for the int4 inference pipeline. With this fix and my fp8 module patch I'll redo the upload and add the base model.
*Yeah, I had to modify quite a few things to get this to work (nothing worked out of the box; some modules, related to the commit above, were left in fp8 format). I'd imagine things will get cleaned up as we get further from the initial release.
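For anyone else hitting the leftover-fp8 issue: the fp8 weights in the released checkpoint carry per-block inverse scales, so converting a module back to a higher precision means expanding each block's scale over its tile and multiplying it back in. Here's a toy numpy sketch of that blockwise dequantization step. The block size, names, and shapes are illustrative only (the real checkpoint uses much larger blocks and fp8 tensors), not the actual conversion script:

```python
import numpy as np

BLOCK = 2  # illustrative; the real checkpoint uses much larger square blocks

def dequantize(qweight, scale_inv, block=BLOCK):
    """Expand per-block inverse scales over each tile and multiply back.

    Each (block x block) tile of the quantized weight shares one
    scale_inv entry; slicing handles shapes not divisible by `block`.
    """
    rows = np.repeat(scale_inv, block, axis=0)[: qweight.shape[0]]
    full = np.repeat(rows, block, axis=1)[:, : qweight.shape[1]]
    return qweight.astype(np.float32) * full

# Round-trip demo: blockwise "quantize" (divide by scales), then dequantize.
w = np.arange(16, dtype=np.float32).reshape(4, 4) / 4.0
scale_inv = np.array([[0.5, 0.25], [1.0, 2.0]], dtype=np.float32)  # one per 2x2 tile
q = w / np.repeat(np.repeat(scale_inv, BLOCK, 0), BLOCK, 1)
w2 = dequantize(q, scale_inv)
assert np.allclose(w, w2)
```

The same expand-and-multiply idea is what the fp8→bf16 path in the repo's conversion fix performs per weight tensor, just with actual fp8 storage.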
Thanks! I've been hoping to run the base model but don't quite have enough to run it as-is; I probably could at 6-bit and lower.
And it seems like no one else is quantizing it! I dunno why, it looks interesting.
Any updates? Could you share which convert_checkpoint.py you used, and any missing steps?
Thanks again!