Medical-Llama3-8B-4bit: Fine-Tuned Llama3 for Medical Q&A

A medical fine-tune of Llama-3-8B, quantized to 4 bits. It was trained on common open-source datasets and shows improvements on multilingual tasks. Standard 4-bit quantization was applied after fine-tuning, which reduces the compute time and memory required to run the model. The architecture is unchanged from Llama 3.
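
To make the memory saving concrete, here is a back-of-the-envelope sketch comparing weight storage at 16-bit and 4-bit precision. It assumes a round 8 billion parameters and counts the weights only; the helper name `approx_weight_memory_gib` is illustrative and not part of any library. Real footprints are typically somewhat higher, because some tensors stay in higher precision and quantization stores extra per-block metadata.

```python
def approx_weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 1024**3


NUM_PARAMS = 8e9  # assumption: round figure for an "8B" model

for bits in (16, 4):
    gib = approx_weight_memory_gib(NUM_PARAMS, bits)
    print(f"{bits}-bit weights: ~{gib:.1f} GiB")
# prints:
# 16-bit weights: ~14.9 GiB
# 4-bit weights: ~3.7 GiB
```

Under these assumptions the 4-bit weights need about a quarter of the 16-bit storage, which is what lets the model fit on a single consumer GPU.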

This repository provides a fine-tuned version of the Llama3 8B model, designed to answer medical questions informatively. It was trained on the AI Medical Chatbot dataset (ruslanmv/ai-medical-chatbot).

Model & Development

  • Developed by: ruslanmv
  • License: Apache-2.0
  • Finetuned from model: meta-llama/Meta-Llama-3-8B

Key Features

  • Medical Focus: Optimized to address health-related inquiries.
  • Knowledge Base: Trained on a comprehensive medical chatbot dataset.
  • Text Generation: Generates informative and potentially helpful responses.

Installation

This model is accessible through the Hugging Face Transformers library. Install the required packages using pip:

pip install git+https://github.com/huggingface/accelerate.git
pip install git+https://github.com/huggingface/transformers.git
pip install bitsandbytes
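
After installing, it can help to confirm that the packages are visible to your Python environment before loading the model. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical, and `torch` is included on the assumption that it is already installed, since the usage example imports it.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Packages used by the usage example (torch is an assumption, see above)
for name, ver in installed_versions(
    ["accelerate", "transformers", "bitsandbytes", "torch"]
).items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```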

Usage Example

Here's a Python code snippet demonstrating how to interact with the llama3-8B-medical model and generate answers to your medical questions:


from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

# Load tokenizer and model
model_id = "ruslanmv/llama3-8B-medical"

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

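# Pass the 4-bit settings via `quantization_config`; `config` expects a model config object
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=quantization_config)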

def create_prompt(user_query):
  B_INST, E_INST = "<s>[INST]", "[/INST]"
  B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"
  DEFAULT_SYSTEM_PROMPT = """\
  You are an AI Medical Chatbot Assistant, provide comprehensive and informative responses to your inquiries.
  If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information."""
  SYSTEM_PROMPT = B_SYS + DEFAULT_SYSTEM_PROMPT + E_SYS
  instruction = f"User asks: {user_query}\n"
  prompt = B_INST + SYSTEM_PROMPT + instruction + E_INST
  return prompt.strip()

def generate_text(model, tokenizer, user_query,
                  max_length=200,
                  temperature=0.8,
                  num_return_sequences=1):
    # Wrap the raw question in the instruction template
    prompt = create_prompt(user_query)
    # Tokenize the prompt and move input_ids to the same device as the model
    input_ids = tokenizer.encode(prompt, return_tensors="pt").to(device)
    # Generate text; max_length counts prompt tokens plus generated tokens
    output = model.generate(
        input_ids=input_ids,
        max_length=max_length,
        temperature=temperature,
        num_return_sequences=num_return_sequences,
        pad_token_id=tokenizer.eos_token_id,  # Set pad token to end of sequence token
        do_sample=True
    )
    # Keep only the newly generated tokens (everything after the prompt)
    new_tokens = output[0][input_ids.shape[-1]:]
    # Decode the generated tokens, dropping special tokens
    generated_text = tokenizer.decode(new_tokens, skip_special_tokens=True).strip()

    return generated_text


# Example usage
# - Context: First describe your problem.
# - Question: Then ask your question.
user_query = "I'm a 35-year-old male experiencing symptoms like fatigue, increased sensitivity to cold, and dry, itchy skin. Could these be indicative of hypothyroidism?"
generated_text = generate_text(model, tokenizer, user_query)
print(generated_text)

A typical answer looks like this:

Yes, it is possible. Hypothyroidism can present symptoms like increased sensitivity to cold, dry skin, and fatigue. These symptoms are characteristic of hypothyroidism. I recommend consulting with a healthcare provider. 2. Hypothyroidism can present symptoms like fever, increased sensitivity to cold, dry skin, and fatigue. These symptoms are characteristic of hypothyroidism. 
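
Sampled output such as the answer above can repeat a sentence it has already produced. As a minimal post-processing sketch (not part of the model or the Transformers library), the hypothetical helper `drop_repeated_sentences` below keeps only the first occurrence of each sentence. It assumes sentences end with `.`, `!`, or `?` followed by whitespace, so it only catches exact repeats, not paraphrases.

```python
import re


def drop_repeated_sentences(text: str) -> str:
    """Keep the first occurrence of each sentence, preserving the original order."""
    # Split after sentence-ending punctuation that is followed by whitespace
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    seen = set()
    kept = []
    for sentence in sentences:
        key = sentence.strip().lower()  # compare case-insensitively
        if key and key not in seen:
            seen.add(key)
            kept.append(sentence.strip())
    return " ".join(kept)


sample = (
    "These symptoms are characteristic of hypothyroidism. "
    "I recommend consulting with a healthcare provider. "
    "These symptoms are characteristic of hypothyroidism."
)
print(drop_repeated_sentences(sample))
```

You could apply it to the string returned by `generate_text` before printing it.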

Important Note

This model is intended for informational purposes only and should not be used as a substitute for professional medical advice. Always consult with a qualified healthcare provider for any medical concerns.

License

This model is distributed under the Apache License 2.0 (see LICENSE file for details).

Contributing

We welcome contributions to this repository! If you have improvements or suggestions, feel free to create a pull request.

Disclaimer

While we strive to provide informative responses, the accuracy of the model's outputs cannot be guaranteed. It is crucial to consult a doctor or other healthcare professional for definitive medical advice.
