Qwen-7B-Chat-Int4 / configuration_qwen.py

Commit History

Add ApplyRoPE and RMSNorm kernels written in OpenAI Triton.
1b59f63

wangzihan99 committed on

softmax_in_fp32
682f4da

yangapku committed on

update kvcache
0e3568a

yangapku committed on

update model
ff5200f

yangapku committed on