Why is this model the exact same as Llama 2 Chat 13B?
I was asked to test your model, so I tested TheBloke's q5_K_M quantized version of h2ogpt-4096-llama2-13B-chat-GGML, and I've noticed its responses are 100% identical to those of Llama 2 Chat 13B (same q5_K_M quant)!
Since I'm using deterministic settings, the same input produces the same output as long as all other variables are identical. But this is supposedly a different model; I even checked its checksum to make sure it wasn't just a renamed version of the original Llama 2 Chat model.
The same applies no matter if I use Llama 2 Chat's prompt format or your H2O format as mentioned on TheBloke's model card.
You didn't just take Meta's Llama 2 Chat model and rename it h2oGPT without making any changes, did you? You write "h2oGPT fine-tuned model based on Meta's Llama 2 13B Chat.", but shouldn't fine-tunes be based on the base model?
Just wondering if your model is a 1:1 copy of Meta's original? Or if there was a mistake/mix-up when the model was quantized/uploaded? (I did check my files again and again, making sure I didn't mix them up locally!)
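For anyone who wants to repeat the file check, here is a minimal sketch of a checksum comparison in Python. The function names are my own, and the file names in the usage note are hypothetical placeholders, not the actual download names.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks so that
    multi-gigabyte model files are never loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"" (EOF)
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file_contents(path_a, path_b):
    """True if both files hash to the same SHA-256 digest."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

Usage would look like `same_file_contents("h2ogpt-model.bin", "llama-2-chat-model.bin")` (hypothetical paths). Note that a differing checksum only shows the two files are not byte-identical; it does not by itself show that the weights differ.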
Yes, it's exactly the same as https://huggingface.co/meta-llama/Llama-2-13b-chat-hf or https://huggingface.co/TheBloke/Llama-2-13B-Chat-fp16. We're just making it easier for potential users of h2oGPT (what's demoed on http://gpt.h2o.ai) to get access to the models. The same Meta license still applies.
Yes, we are fine-tuning the non-chat base models. I'll improve the description. Thanks!