Qwen-7B-Chat-Int4 / configuration_qwen.py

Commit History

Add ApplyRoPE and RMSNorm kernels written in OpenAI Triton.
af64202

Shangming Cai committed on

softmax_in_fp32
682f4da

yangapku committed on

update kvcache
0e3568a

yangapku committed on

update model
ff5200f

yangapku committed on