rokeya71
/

granite-embedding-125m-english-onnx


rokeya71 committed 13 days ago

Commit

6e85e8f


1 Parent(s): 9907644

Update README.md

Files changed (1)

README.md +42 -1


@@ -6,4 +6,45 @@ pipeline_tag: feature-extraction
 tags:
 - rag
 - embedding
----

+---
+# ONNX Converted Version of IBM Granite Embedding Model
+This repository contains an ONNX conversion of the Hugging Face model [IBM Granite Embedding 125M English](https://huggingface.co/ibm-granite/granite-embedding-125m-english).
+## Running the Model
+You can run the ONNX model using the following code:
+```python
+import onnxruntime as ort
+from transformers import AutoTokenizer
+import numpy as np
+# Define paths
+model_path = "./onnx/model_uint8.onnx"  # Path to ONNX model file
+tokenizer_path = "./onnx/"  # Path to folder containing tokenizer.json and tokenizer_config.json
+# Load tokenizer
+tokenizer = AutoTokenizer.from_pretrained(tokenizer_path)
+# Load ONNX model using ONNX Runtime
+onnx_session = ort.InferenceSession(model_path)
+# Example text input
+text = "hi."
+# Tokenize input
+inputs = tokenizer(text, return_tensors="np", truncation=True, padding=True)
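+# Prepare input for ONNX model: keep only the inputs the graph declares and cast them to int64
+input_names = {node.name for node in onnx_session.get_inputs()}
+onnx_inputs = {key: value.astype(np.int64) for key, value in inputs.items() if key in input_names}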
+# Run inference
+outputs = onnx_session.run(None, onnx_inputs)
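+# Extract embeddings (e.g., using mean pooling)
+last_hidden_state = outputs[0]  # Assumes the first output is the last hidden state, shape (batch, seq_len, hidden)
+# Weight by the attention mask so padding tokens do not dilute the mean when several texts are batched
+mask = inputs["attention_mask"][..., np.newaxis].astype(last_hidden_state.dtype)
+pooled_embedding = (last_hidden_state * mask).sum(axis=1) / mask.sum(axis=1)  # Mean over real tokens only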
+print(f"Embedding: {pooled_embedding}")