rokeya71 committed on
Commit 6e85e8f · verified · 1 Parent(s): 9907644

Update README.md

Files changed (1): README.md +42 -1
README.md CHANGED
@@ -6,4 +6,45 @@ pipeline_tag: feature-extraction
 tags:
 - rag
 - embedding
----
+---
+
+# ONNX Converted Version of IBM Granite Embedding Model
+
+This repository contains the ONNX-converted version of the Hugging Face model [IBM Granite Embedding 125M English](https://huggingface.co/ibm-granite/granite-embedding-125m-english).
+
+## Running the Model
+
+You can run the ONNX model with ONNX Runtime using the following code:
+
+```python
+import numpy as np
+import onnxruntime as ort
+from transformers import AutoTokenizer
+
+# Define paths
+model_path = "./onnx/model_uint8.onnx"  # Path to the quantized ONNX model file
+tokenizer_path = "./onnx/"  # Folder containing tokenizer.json and tokenizer_config.json
+
+# Load tokenizer
+tokenizer = AutoTokenizer.from_pretrained(tokenizer_path)
+
+# Load ONNX model using ONNX Runtime
+onnx_session = ort.InferenceSession(model_path)
+
+# Example text input
+text = "hi."
+
+# Tokenize input
+inputs = tokenizer(text, return_tensors="np", truncation=True, padding=True)
+
+# Prepare inputs for the ONNX model (ONNX Runtime expects int64 tensors)
+onnx_inputs = {key: value.astype(np.int64) for key, value in inputs.items()}
+
+# Run inference
+outputs = onnx_session.run(None, onnx_inputs)
+
+# Extract embeddings via mean pooling over non-padding tokens
+last_hidden_state = outputs[0]  # Assuming the first output is the last hidden state
+mask = inputs["attention_mask"][..., np.newaxis]  # (batch, seq_len, 1)
+pooled_embedding = (last_hidden_state * mask).sum(axis=1) / mask.sum(axis=1)
+
+print(f"Embedding: {pooled_embedding}")
+```
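For the RAG use case this model is tagged with, pooled embeddings are typically compared with cosine similarity to rank documents against a query. A minimal, self-contained sketch using NumPy (the small vectors below are hypothetical stand-ins, not real model outputs — in practice you would pass `pooled_embedding` rows from the code above):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity between two 1-D embedding vectors
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical stand-in vectors; real Granite embeddings are 768-dimensional
query = np.array([1.0, 0.0, 1.0])
doc = np.array([1.0, 0.0, 1.0])
other = np.array([0.0, 1.0, 0.0])

print(cosine_similarity(query, doc))    # identical direction -> 1.0
print(cosine_similarity(query, other))  # orthogonal -> 0.0
```

Documents can then be sorted by descending similarity to the query embedding to pick retrieval candidates.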