No model found error
Hi, I'm trying to use the model offline. I downloaded both models, restructured models--jinaai--jina-embeddings-v3 and models--jinaai--xlm-roberta-flash-implementation, and added a config.json with model_type.
It works with transformers, but gives an error with sentence_transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("/home//.cache/huggingface/models--jinaai--jina-embeddings-v3", trust_remote_code=True)
import json
# Path to your config.json file
config_path = "/home//.cache/huggingface/models--jinaai--jina-embeddings-v3/config.json"
# Load the config file
with open(config_path, "r") as config_file:
    config = json.load(config_file)
# Check if the 'task_instructions' key exists
task_instructions = config.get('task_instructions', {})
# Print the loaded task instructions
print(task_instructions)
task = "retrieval.query"
embeddings = model.encode(
    ["What is the weather like in Berlin today?"],
    task=task,
    prompt_name=task,
)
No sentence-transformers model found with name /home//.cache/huggingface/models--jinaai--jina-embeddings-v3. Creating a new one with mean pooling.
{'retrieval.query': 'Represent the query for retrieving evidence documents: ', 'retrieval.passage': 'Represent the document for retrieval: ', 'separation': '', 'classification': '', 'text-matching': ''}
Traceback (most recent call last):
  File "/home//miniconda3/lib/python3.12/site-packages/sentence_transformers/SentenceTransformer.py", line 534, in encode
    prompt = self.prompts[prompt_name]
             ~~~~~~~~~~~~^^^^^^^^^^^^^
KeyError: 'retrieval.query'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home//mycode/jina_sentence_trnasformer.py", line 23, in <module>
    embeddings = model.encode(
                 ^^^^^^^^^^^^^
  File "/home//miniconda3/lib/python3.12/site-packages/sentence_transformers/SentenceTransformer.py", line 536, in encode
    raise ValueError(
ValueError: Prompt name 'retrieval.query' not found in the configured prompts dictionary with keys [].
Any guidance would be appreciated.
Hello!
Does your model directory (/home//.cache/huggingface/models--jinaai--jina-embeddings-v3) contain a config_sentence_transformers.json file? (https://huggingface.co/jinaai/jina-embeddings-v3/blob/main/config_sentence_transformers.json)
This file is used to load the prompts, i.e. the one with the key "retrieval.query" in your case. This prompt wasn't loaded, which caused your crash.
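A quick way to check this before loading the model is sketched below. It is only a minimal sketch: the helper names load_prompts and has_prompt are made up for illustration, and it assumes the prompts are stored under a top-level "prompts" key in config_sentence_transformers.json.

```python
import json
from pathlib import Path


def load_prompts(model_dir):
    """Return the prompts dict from config_sentence_transformers.json.

    Returns an empty dict if the file is missing or has no "prompts" key
    (the key name is an assumption about the file layout).
    """
    config_path = Path(model_dir) / "config_sentence_transformers.json"
    if not config_path.is_file():
        return {}
    with open(config_path, "r", encoding="utf-8") as f:
        config = json.load(f)
    return config.get("prompts", {})


def has_prompt(model_dir, prompt_name):
    """True if prompt_name is configured in the local model directory."""
    return prompt_name in load_prompts(model_dir)
```

Calling has_prompt(model_dir, "retrieval.query") on your local directory should return True once the file has been copied over.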
- Tom Aarsen
@tomaarsen Thanks for your quick reply! No, I didn't move that one. While I'm moving it, can you tell me which other files I should relocate so that I can use the full feature set?
.
├── config.json
├── jinaai
│   └── xlm-roberta-flash-implementation
│       ├── block.py
│       ├── configuration_xlm_roberta.py
│       ├── embedding.py
│       ├── mha.py
│       ├── mlp.py
│       ├── modeling_lora.py
│       ├── modeling_xlm_roberta.py
│       ├── rotary.py
│       ├── stochastic_depth.py
│       └── xlm_padding.py
├── model.safetensors
├── special_tokens_map.json
├── tokenizer.json
└── tokenizer_config.json
My recommendation is to:
- Clone this repository (https://huggingface.co/jinaai/jina-embeddings-v3/blob/main/config_sentence_transformers.json?clone=true) to get all files
- Copy all files from the implementation repository into the local directory, i.e. adjacent to the model.safetensors etc.
- Update the config.json (from the jina-embeddings-v3 clone) and change the "auto_map" values from "jinaai/xlm-roberta-flash-implementation--file.class" to just "file.class". Because the files are adjacent, the loader will be able to find them.
You can then load the model by passing the path of the directory that you cloned into. That way you have a single directory with everything in it (it does have some necessary subdirectories).
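The "auto_map" change can also be scripted. The following is a rough sketch rather than an official tool: the function names are hypothetical, and it assumes every remote reference uses the "repo--file.class" form described in the list above (entries may be strings, lists, or null).

```python
import json
from pathlib import Path


def strip_remote_prefix(value):
    """Turn 'repo--file.class' into 'file.class'; leave other values as-is."""
    if isinstance(value, str):
        return value.split("--", 1)[-1]
    if isinstance(value, (list, tuple)):
        return [strip_remote_prefix(v) for v in value]
    return value  # e.g. None entries are kept unchanged


def localize_auto_map(config):
    """Return a copy of the config dict with local-only auto_map values."""
    updated = dict(config)
    if "auto_map" in config:
        updated["auto_map"] = {
            key: strip_remote_prefix(val)
            for key, val in config["auto_map"].items()
        }
    return updated


def rewrite_config(config_path):
    """Rewrite config.json in place (make a backup first)."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    path.write_text(json.dumps(localize_auto_map(config), indent=2), encoding="utf-8")
```

Running rewrite_config on the config.json in the cloned directory should leave every other key untouched.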
Beyond that, this ensures that you have all files and that your performance should be the same as the remote model. But you're free to test with some arbitrary texts to make sure the embeddings are the same.
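One possible way to run that comparison is sketched here in plain Python. The helper names and the 0.9999 threshold are arbitrary choices for illustration, not something the library prescribes, and the embeddings are expected as lists of float lists.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def embeddings_match(local, remote, min_cosine=0.9999):
    """True if both embedding lists have the same shape and each pair of
    vectors is (nearly) identical in direction."""
    if len(local) != len(remote):
        return False
    return all(
        len(l) == len(r) and cosine_similarity(l, r) >= min_cosine
        for l, r in zip(local, remote)
    )
```

You could encode the same texts with the local and the remote model, convert both results with .tolist(), and pass them to embeddings_match.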
- Tom Aarsen
Hi Tom @tomaarsen, thank you so much. I rearranged the files as you suggested and it works fine.
If you don't mind, can I ask a few follow-up questions?
- I couldn't find the LoRA adapter files; they seem to be merged into the safetensors file. My plan is to fine-tune on top of the LoRA adapters. Do you have any suggestions?
- I'm a bit concerned that my fine-tuning could overfit, so can you suggest a minimum dataset size that doesn't hurt performance?
- Is Matryoshka embedding only enabled when I set truncate_dim=, or is it on by default?
Maybe the model authors could just patch it so that it's usable without shamanism?