TryingHard committed
Commit 967416c · verified · Parent(s): 0fba527

Update README.md

Files changed (1): README.md (+15 -13)
README.md CHANGED
@@ -31,8 +31,11 @@ Built upon Ovis1.5, **Ovis1.6** further enhances high-resolution image processin
| Ovis MLLMs | ViT | LLM | Model Weights | Demo |
|:------------------|:-----------:|:------------------:|:---------------------------------------------------------------:|:----------------------------------------------------------------:|
| Ovis1.6-Gemma2-9B | Siglip-400M | Gemma2-9B-It | [Huggingface](https://huggingface.co/AIDC-AI/Ovis1.6-Gemma2-9B) | [Space](https://huggingface.co/spaces/AIDC-AI/Ovis1.6-Gemma2-9B) |
+ | Ovis1.6-Llama3.2-3B | Siglip-400M | Llama-3.2-3B-Instruct | [Huggingface](https://huggingface.co/AIDC-AI/Ovis1.6-Llama3.2-3B) | [Space](https://huggingface.co/spaces/AIDC-AI/Ovis1.6-Llama3.2-3B) |
+ | Ovis1.6-Gemma2-9B-GPTQ-Int4 | Siglip-400M | Gemma2-9B-It | [Huggingface](https://huggingface.co/AIDC-AI/Ovis1.6-Gemma2-9B-GPTQ-Int4) | - |
+ | Ovis1.6-Llama3.2-3B-GPTQ-Int4 | Siglip-400M | Llama-3.2-3B-Instruct | [Huggingface](https://huggingface.co/AIDC-AI/Ovis1.6-Llama3.2-3B-GPTQ-Int4) | - |

- ## Quantized Model: GPTQ-Int4
+ ## Quantized Model
We quantized Ovis1.6 with AutoGPTQ. Follow these steps to run it.

### Installation
@@ -45,29 +48,28 @@ pip install numpy==1.24.3 transformers==4.44.2 pillow==10.3.0 gekko pandas
```
2. Build AutoGPTQ: We customized AutoGPTQ to support Ovis model quantization. You need to build from source to install the customized version.
```bash
- git clone https://github.com/kq-chen/AutoGPTQ.git
+ git clone https://github.com/AIDC-AI/AutoGPTQ.git
cd AutoGPTQ
pip install -vvv --no-build-isolation -e .
```
Check [this](https://github.com/AutoGPTQ/AutoGPTQ/issues/194) first if you are building inside a Docker container.

### Usage
- Below is a code snippet to run Ovis1.6-Gemma2-9B-GPTQ-Int4 with multimodal inputs. For additional usage instructions, including inference wrapper and Gradio UI, please refer to [Ovis GitHub](https://github.com/AIDC-AI/Ovis?tab=readme-ov-file#inference).
+ Below is a code snippet to run **Ovis1.6-Gemma2-9B-GPTQ-Int4** with multimodal inputs. For additional usage instructions, including inference wrapper and Gradio UI, please refer to [Ovis GitHub](https://github.com/AIDC-AI/Ovis?tab=readme-ov-file#inference).
```python
import torch
from PIL import Image
from transformers import GenerationConfig
- from auto_gptq.modeling import OvisGPTQForCausalLM
+ from auto_gptq.modeling import OvisGemma2GPTQForCausalLM

# load model
load_device = "cuda:0" # customize load device
- model = OvisGPTQForCausalLM.from_pretrained(
-     "TryingHard/Ovis1.6-Gemma2-9B-GPTQ-Int4",
+ model = OvisGemma2GPTQForCausalLM.from_pretrained(
+     "AIDC-AI/Ovis1.6-Gemma2-9B-GPTQ-Int4",
    device=load_device,
-     multimodal_max_length=8192,
    trust_remote_code=True
)
- model.model.generation_config = GenerationConfig.from_pretrained("TryingHard/Ovis1.6-Gemma2-9B-GPTQ-Int4")
+ model.model.generation_config = GenerationConfig.from_pretrained("AIDC-AI/Ovis1.6-Gemma2-9B-GPTQ-Int4")
text_tokenizer = model.get_text_tokenizer()
visual_tokenizer = model.get_visual_tokenizer()
 
@@ -156,14 +158,14 @@ for i in range(len(batch_input_ids)):


## Quantize Your Own Ovis Model with AutoGPTQ
- We provide a demonstration code snippet for you to quantize your own fine-tuned Ovis model. Before running the code, you need to **follow the ABOVE installation steps** to obtain an environment for quantization.
+ We provide a demonstration code snippet for you to quantize your own fine-tuned **Ovis1.6-Gemma2-9B** model. Before running the code, you need to **follow the ABOVE installation steps** to obtain an environment for quantization.
```python
from typing import Dict, Sequence, Union, List
import copy
import logging

from auto_gptq import BaseQuantizeConfig
- from auto_gptq.modeling import OvisGPTQForCausalLM
+ from auto_gptq.modeling import OvisGemma2GPTQForCausalLM
import torch
from torch.utils.data import Dataset, DataLoader
from PIL import Image
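The header of the next hunk shows the snippet constructing `quantize_config = BaseQuantizeConfig(...)`, but its arguments lie outside the diff. For reference, a typical 4-bit setup with AutoGPTQ's `BaseQuantizeConfig` looks like the sketch below; the specific values are illustrative assumptions, not the ones used in this README.

```python
# Illustrative 4-bit GPTQ settings; the README's actual values are not
# visible in this diff.
quantize_config = BaseQuantizeConfig(
    bits=4,             # int4 weights, matching the GPTQ-Int4 checkpoints above
    group_size=128,     # per-group quantization granularity
    damp_percent=0.01,  # Hessian damping for numerical stability
    desc_act=False,     # skip activation-order quantization for faster kernels
)
```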
@@ -187,13 +189,13 @@ quantize_config = BaseQuantizeConfig(


# Load model
- model = OvisGPTQForCausalLM.from_pretrained(
+ model = OvisGemma2GPTQForCausalLM.from_pretrained(
    model_path,
    quantize_config,
    torch_dtype=torch.bfloat16,
    multimodal_max_length=8192,
    trust_remote_code=True
- )
+ ).cuda()
print(f"Model Loaded!")

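Between this hunk and the next, the snippet builds the calibration `train_loader` (the elided code ends at `print(f"Dataloader Loaded!")`). That code is not visible in the diff; the sketch below shows one plausible shape for a multimodal calibration dataset, and every name, field, and helper in it (including `preprocess_inputs` and `calib_samples`) is an illustrative assumption, not the README's implementation.

```python
# Purely illustrative calibration pipeline; the README's real implementation
# is elided from this diff. Each batch must carry the tensors the Ovis
# forward pass consumes during quantization.
class CalibrationDataset(Dataset):
    def __init__(self, samples: Sequence[Dict]):
        # e.g. [{"image": "a.jpg", "text": "Describe this image."}, ...]
        self.samples = samples

    def __len__(self) -> int:
        return len(self.samples)

    def __getitem__(self, idx: int) -> Dict:
        sample = self.samples[idx]
        image = Image.open(sample["image"]).convert("RGB")
        query = f'<image>\n{sample["text"]}'
        # assumed Ovis helper, as in the inference example above
        _, input_ids, pixel_values = model.preprocess_inputs(query, [image])
        attention_mask = torch.ne(input_ids, model.get_text_tokenizer().pad_token_id)
        return dict(
            input_ids=input_ids,
            attention_mask=attention_mask,
            pixel_values=pixel_values,
        )

# calib_samples: your own list of image/text pairs (hypothetical name);
# batch_size=1 sidesteps padding across variable-length samples
train_loader = DataLoader(CalibrationDataset(calib_samples), batch_size=1, shuffle=False)
```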
@@ -325,7 +327,7 @@ print(f"Dataloader Loaded!")


# start quantizing
- model.quantize(train_loader, cache_examples_on_gpu=False, samples_dtype=torch.bfloat16) # do not change samples_dtype
+ model.quantize(train_loader, cache_examples_on_gpu=False)
print(f"Model Quantized! Now Saving...")

model.save_quantized(quantize_save_path, use_safetensors=True)
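The snippet ends by saving the quantized weights. Assuming the customized fork loads quantized checkpoints the same way the Usage section above loads the published GPTQ-Int4 checkpoint, your own checkpoint could then be loaded by pointing `from_pretrained` at the local save path instead of the Hub ID:

```python
# Load your own quantized checkpoint as in the Usage section, substituting
# the local save path for the Hub model ID (sketch; assumes the fork keeps
# the same loading interface).
model = OvisGemma2GPTQForCausalLM.from_pretrained(
    quantize_save_path,
    device="cuda:0",
    trust_remote_code=True,
)
```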
 