---
license: llama2
---

This repository contains the verifier model (`/llama7b-2-ep2-n100-scahead-mse-lm-token`) and the generator model (`/llama7b-2-ep2`) for GSM8K, fine-tuned from Llama2-7B.

See the Mistral-7B version in [OVM-Mistral-7b](https://huggingface.co/FreedomIntelligence/OVM-Mistral-7b).

See the paper [Outcome-supervised Verifiers for Planning in Mathematical Reasoning](https://arxiv.org/pdf/2311.09724.pdf) and the code on [GitHub](https://github.com/FreedomIntelligence/OVM).
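
Since the two checkpoints sit in separate folders, it can be handy to fetch only one of them. The following is a minimal sketch, not an official loading recipe: it assumes the folder names above are top-level directories of this repository and that `huggingface_hub` is installed. The helper names `subfolder_patterns` and `download_subfolder` are hypothetical, and `repo_id` is left as a parameter because this card does not state the repository id.

```python
import os
from typing import List

# Folder names as given in this model card (leading slash dropped).
VERIFIER_DIR = "llama7b-2-ep2-n100-scahead-mse-lm-token"
GENERATOR_DIR = "llama7b-2-ep2"


def subfolder_patterns(subfolder: str) -> List[str]:
    """Build the glob pattern list that selects every file under one folder."""
    name = subfolder.strip("/")
    return [f"{name}/*"]


def download_subfolder(repo_id: str, subfolder: str) -> str:
    """Download only `subfolder` of `repo_id` and return its local path.

    Assumption: `subfolder` is a top-level directory in the repository.
    """
    # Imported lazily so the pattern helper works without the dependency.
    from huggingface_hub import snapshot_download

    snapshot_path = snapshot_download(
        repo_id=repo_id,
        allow_patterns=subfolder_patterns(subfolder),
    )
    return os.path.join(snapshot_path, subfolder.strip("/"))
```

Calling `download_subfolder(repo_id, VERIFIER_DIR)` with this repository's Hub id would return a local directory for the verifier checkpoint. For how to load and run the models, refer to the GitHub code linked above, since the verifier may need the model classes defined there.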