---
license: apache-2.0
base_model:
- sentence-transformers/all-MiniLM-L6-v2
---

**This model is a Neuron-compiled version of https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2**

It was compiled with version 2.19.1 of the Neuron SDK. If your environment uses a different SDK version, you may need to run the compilation process again.

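To help decide whether recompiling is needed, it can be useful to see which Neuron-related packages are installed. The following is a minimal sketch that uses only the Python standard library. The package names in `NEURON_PACKAGES` are an assumption about a typical Optimum Neuron environment, and `installed_versions` is a hypothetical helper, not part of any library. Note that an SDK release number such as 2.19.1 is not necessarily the same as the version of any individual package.

```python
from importlib import metadata

# Assumed package names for a typical Optimum Neuron install; adjust as needed.
NEURON_PACKAGES = ("optimum-neuron", "neuronx-cc", "torch-neuronx")


def installed_versions(packages=NEURON_PACKAGES):
    """Map each package name to its installed version, or None if it is absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Print one line per package so the environment can be compared by eye.
for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```
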
See https://huggingface.co/docs/optimum-neuron/en/inference_tutorials/sentence_transformers for more details.

For information on how to run on SageMaker: https://huggingface.co/docs/optimum-neuron/en/inference_tutorials/sentence_transformers

To run:

```
from optimum.neuron import NeuronModelForSentenceTransformers
from transformers import AutoTokenizer

model_id = "jburtoft/all-MiniLM-L6-v2-neuron"

# If you had to compile the model yourself, point to the local output directory instead
#model_id = "all-MiniLM-L6-v2-neuron"

model = NeuronModelForSentenceTransformers.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Run inference
prompt = "I like to eat apples"
encoded_input = tokenizer(prompt, return_tensors='pt')
outputs = model(**encoded_input)

token_embeddings = outputs.token_embeddings
sentence_embedding = outputs.sentence_embedding

print(f"token embeddings: {token_embeddings.shape}") # torch.Size([1, 7, 384])
print(f"sentence_embedding: {sentence_embedding.shape}") # torch.Size([1, 384])
```

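A common next step with sentence embeddings is comparing two sentences by cosine similarity. The sketch below is a plain-Python illustration of that calculation, not part of the model or of `optimum.neuron`. `cosine_similarity` is a hypothetical helper, and the short vectors are toy stand-ins for the 384-dimensional embeddings the model returns.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors given as lists of floats."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 4-dimensional vectors standing in for real sentence embeddings.
apples = [0.2, 0.7, 0.1, 0.4]
pears = [0.25, 0.65, 0.05, 0.45]
print(f"similarity: {cosine_similarity(apples, pears):.3f}")
```

With real model output, something like `sentence_embedding[0].tolist()` should yield a list that can be passed to this helper. Because the model was compiled with a batch size of 1, each sentence would presumably be encoded in its own call.
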
To compile:

```
optimum-cli export neuron -m sentence-transformers/all-MiniLM-L6-v2 --sequence_length 512 --batch_size 1 --task feature-extraction all-MiniLM-L6-v2-neuron
```
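
The same compilation can likely be driven from Python as well. The sketch below assumes that `NeuronModelForSentenceTransformers.from_pretrained` accepts `export=True` together with static input shapes, as described in the Optimum Neuron tutorial linked above. `compile_and_save` and `INPUT_SHAPES` are hypothetical names chosen for illustration, and the function only works on a machine with the Neuron SDK installed.

```python
# Static input shapes, mirroring the --batch_size and --sequence_length CLI flags.
INPUT_SHAPES = {"batch_size": 1, "sequence_length": 512}


def compile_and_save(
    model_id="sentence-transformers/all-MiniLM-L6-v2",
    output_dir="all-MiniLM-L6-v2-neuron",
):
    """Compile the model for Neuron and save it alongside its tokenizer."""
    # Imported lazily so this file can still be loaded without the Neuron SDK.
    from optimum.neuron import NeuronModelForSentenceTransformers
    from transformers import AutoTokenizer

    # export=True requests compilation instead of loading precompiled artifacts.
    model = NeuronModelForSentenceTransformers.from_pretrained(
        model_id, export=True, **INPUT_SHAPES
    )
    model.save_pretrained(output_dir)

    # Save the tokenizer too, so AutoTokenizer.from_pretrained(output_dir) works.
    AutoTokenizer.from_pretrained(model_id).save_pretrained(output_dir)
    return output_dir
```

Calling `compile_and_save()` on a Neuron instance should produce a local `all-MiniLM-L6-v2-neuron` directory that can be used as `model_id` in the inference example.
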