MedSSS-8B-Policy

Introduction

MedSSS-Policy is the policy model designed for slow-thinking medical reasoning. It conducts explicit step-wise reasoning and finalizes the answer at the end of the response.

For more information, visit our GitHub repository: https://github.com/pixas/MedSSS.

Usage

We release the policy model as a LoRA adapter, which reduces the memory needed to use it. Because the adapter is built on Meta-Llama-3.1-8B-Instruct, you first need to prepare the base model on your platform. You can deploy it with tools such as vLLM or SGLang (a vLLM sketch is given after the snippet below), or run inference directly with Transformers:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model and attach the MedSSS LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct", torch_dtype="auto", device_map="auto"
)
model = PeftModel.from_pretrained(base_model, "pixas/MedSSS_Policy", torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("pixas/MedSSS_Policy")

# Build a chat prompt and generate a step-wise reasoning response
input_text = "How to stop a cough?"
messages = [{"role": "user", "content": input_text}]
inputs = tokenizer(
    tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True),
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=2048)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
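If you would rather serve the adapter with vLLM, the following is a minimal sketch using vLLM's offline Python API rather than a command from the MedSSS repository; the adapter name "medsss_policy" and the greedy sampling settings are illustrative assumptions. It downloads the LoRA weights locally and passes them per request via LoRARequest.

from huggingface_hub import snapshot_download
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Download the LoRA adapter weights to a local path for vLLM
adapter_path = snapshot_download("pixas/MedSSS_Policy")
tokenizer = AutoTokenizer.from_pretrained("pixas/MedSSS_Policy")

# Base model with LoRA support enabled
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct", enable_lora=True)
params = SamplingParams(temperature=0.0, max_tokens=2048)

# Apply the chat template, then generate with the adapter attached per request
messages = [{"role": "user", "content": "How to stop a cough?"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = llm.generate([prompt], params, lora_request=LoRARequest("medsss_policy", 1, adapter_path))
print(outputs[0].outputs[0].text)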

MedSSS-Policy adopts a step-wise reasoning approach, with outputs formatted as:

Step 0: Let's break down this problem step by step.
Step 1: ...
[several steps]
Step N: [last reasoning step]\n\nThe answer is {answer}
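Because every response ends with the "The answer is {answer}" marker, the final answer can be recovered with a little post-processing. The helper below is an illustrative sketch, not part of the released code; the function name extract_answer is our own.

import re

def extract_answer(response: str):
    """Return the text following the final 'The answer is' marker, or None if absent."""
    match = re.search(r"The answer is\s*(.+)\s*$", response, flags=re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None

# Example: pull the answer out of a step-wise response
print(extract_answer("Step 0: ...\n\nThe answer is rest and hydration."))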