Maybe SFT for better chat ability, or RLHF, or function calling soon?
Hi, this model answered my tricky question correctly, which no other 34B model has managed (the others assume 1010 A.D. is in the future).
BUT I do not like its output formatting; at least, Yi-34B does not follow my "step by step" reasoning instructions.
I do like its tone, but I still find it is not as human-preferred as other RLHF-ed models.
BTW, the testing env is
Latest textgen-webui
Latest exllamav2
TheBloke/Yi-34B-GPTQ
Also looking forward to seeing future progress on function calling abilities.
Dudes, it is just the most essential part for recently released models to catch up with GPTs.
This is a base model, so I am not sure why you are expecting the behavior of chat-instruct models. The 01.ai team said that they are working on a chat fine-tune, which might give it that assistant-like vibe. Having base pre-trained models that are not RLHF-ed is essential to allow later customization such as RLHF. The Yi model's architecture makes it a GPT; OpenAI doesn't have a monopoly on that word.
Now there are actually some SFT chat models out there; this one is from https://huggingface.co/TheBloke/Nous-Capybara-34B-GGUF
with textgen-webui
MODEL=Nous-Capybara-Yi-34B-200K-GPTQ
python server.py --model $MODEL \
--loader exllamav2 \
--max_seq_len 8192
The English ability is way, way ahead of any open-source model I have seen (forgive my ignorance!); it is so prudent and highly intelligent. Though there is a weird ending token </s>,
but that is probably due to the prompt template not being fully supported.
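Until the template or tokenizer config is sorted out, a stray end-of-sequence marker can be trimmed from the reply after generation. This is only a rough sketch in plain Python; the function name `strip_stray_eos` and the default token list are my own assumptions for illustration and are not part of textgen-webui or exllamav2.

```python
def strip_stray_eos(reply: str, stray_tokens=("</s>", "<|endoftext|>")) -> str:
    """Remove end-of-sequence markers that leaked into the visible reply.

    Hypothetical helper: the token list is an assumption based on the
    markers mentioned in this thread.
    """
    cleaned = reply.rstrip()
    changed = True
    # Loop so that several trailing markers in a row are all removed.
    while changed:
        changed = False
        for token in stray_tokens:
            if cleaned.endswith(token):
                cleaned = cleaned[: -len(token)].rstrip()
                changed = True
    return cleaned


print(strip_stray_eos("The year 1010 A.D. is in the past.</s>"))
```

It only touches the end of the string, so a literal </s> that appears in the middle of an answer (for example when discussing tokenizers) is left alone.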
I have the issue with the </s> token being printed at the end of the reply when running my own QLoRA fine-tune. It's because the dataset is made for Llama, where this is the default EOS token, but the model is trained on Yi, where the EOS token is <|endoftext|>. I bet that's the same issue as with Nous Capybara. I haven't tried this fine-tune yet. This model is a good base for fine-tuning.
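If the mismatch really is in the dataset, one possible fix is to rewrite the EOS marker in the training text before fine-tuning. Below is a minimal sketch, assuming the samples are plain strings that already contain the literal Llama marker; `retarget_eos` and the two constants are hypothetical names, and the token strings are the ones mentioned above, so check them against the actual tokenizer config before relying on this.

```python
LLAMA_EOS = "</s>"        # EOS marker the dataset was written for (per this thread)
YI_EOS = "<|endoftext|>"  # EOS marker reported for Yi (per this thread)


def retarget_eos(samples, old_eos=LLAMA_EOS, new_eos=YI_EOS):
    """Return a copy of the samples with every old EOS marker replaced.

    Hypothetical helper: it does a plain string replacement and assumes
    the marker is stored as literal text in each sample.
    """
    return [text.replace(old_eos, new_eos) for text in samples]


dataset = ["USER: hi\nASSISTANT: hello</s>"]
print(retarget_eos(dataset))
```

With the marker matching the base model's tokenizer, it should be encoded as the real EOS token instead of being learned as ordinary text.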
it's because the dataset is made for llama, where this is the default EOS token, but it's trained on Yi where EOS token is <|endoftext|>. I bet that's the same issue as with Nous Capybara.
I also think this is the most likely reason.