Issues with config file
#1, opened by Authentic1957
In config.json, rms_norm is set to true, but the pre-trained checkpoint contains bias parameters such as backbone.layers.1.norm.bias, so rms_norm needs to be set to false. Also, MambaConfig in config_mamba.py has no pad_id field, so the "pad_id": 0 entry needs to be removed from config.json.
Note: these changes were made against mamba-ssm==1.2.0. With them applied, the pre-trained model loads and runs without any problem.
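For anyone who prefers to script the two config.json edits rather than make them by hand, here is a minimal sketch using only the Python standard library. The helper name patch_config and the file location are my own assumptions, not part of the repository; the only things taken from the post above are the two key changes (rms_norm set to false, pad_id removed).

```python
import json
from pathlib import Path


def patch_config(path):
    """Apply the two config.json edits from this post and return the new config.

    `patch_config` is a hypothetical helper name, not something shipped
    with the model repository or with mamba-ssm.
    """
    path = Path(path)
    config = json.loads(path.read_text(encoding="utf-8"))

    # The checkpoint contains norm bias tensors (e.g. backbone.layers.1.norm.bias),
    # so, as described in the post, rms_norm has to be false.
    config["rms_norm"] = False

    # MambaConfig in mamba-ssm==1.2.0 has no pad_id field; drop the key if present.
    config.pop("pad_id", None)

    # Write the patched config back in place.
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling patch_config("config.json") from the directory that holds the downloaded model files (the path is an assumption about your local layout) rewrites the file in place, so keep a backup copy if you want to compare against the original.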
Also, the input_ids line in the example code may be better written as:
input_ids = torch.from_numpy(text_byte[None, :].copy()).long().cuda()
It is a great paper! Thanks to all the authors for this wonderful research.
Thanks a lot! That really helps.