THAI-BLIP-2

Fine-tuned from blip2-opt-2.7b-coco for image captioning, using Thai captions for MSCOCO 2017.

How to use:

from transformers import Blip2ForConditionalGeneration, Blip2Processor
from PIL import Image
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

processor = Blip2Processor.from_pretrained("kkatiz/THAI-BLIP-2")
model = Blip2ForConditionalGeneration.from_pretrained("kkatiz/THAI-BLIP-2", device_map=device, torch_dtype=torch.bfloat16)

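# Replace the placeholder with the path to your image.
# Converting to RGB also handles grayscale and RGBA files.
img = Image.open("Your image...").convert("RGB")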
inputs = processor(images=img, return_tensors="pt").to(device, torch.bfloat16)

# Adjust `max_length` to control how long the generated caption can be
generated_ids = model.generate(**inputs, max_length=20)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)
print(generated_text)
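
To caption several images, the single-image steps above can be wrapped in a small batching helper. The following is only a sketch: `caption_images` and its parameters (`batch_size`, `max_new_tokens`, and so on) are hypothetical names that are not part of this model card, and it assumes `processor` and `model` were loaded as shown above. It uses `max_new_tokens` rather than `max_length`, which is an assumption about what is convenient here, not a recommendation from the model authors.

```python
from typing import Any, List, Sequence


def caption_images(
    images: Sequence[Any],
    processor: Any,
    model: Any,
    device: Any = "cpu",
    dtype: Any = None,
    batch_size: int = 4,
    max_new_tokens: int = 20,
) -> List[str]:
    """Caption images in chunks of `batch_size` (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    captions: List[str] = []
    for start in range(0, len(images), batch_size):
        chunk = list(images[start:start + batch_size])
        # The processor accepts a list of PIL images and stacks them into one batch.
        inputs = processor(images=chunk, return_tensors="pt")
        # Move tensors to the target device; cast only if a dtype was given.
        inputs = inputs.to(device, dtype) if dtype is not None else inputs.to(device)
        generated_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
        texts = processor.batch_decode(generated_ids, skip_special_tokens=True)
        # Decoded captions may carry surrounding whitespace, so strip it.
        captions.extend(text.strip() for text in texts)
    return captions
```

With the objects from the snippet above, a call might look like `caption_images([img], processor, model, device=device, dtype=torch.bfloat16)`. A smaller `batch_size` lowers peak memory use at the cost of more `generate` calls.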