I found that obtaining text similarity is very slow. Are there any faster methods?

#3
by caochengchen - opened

I found that obtaining text similarity is very slow. Are there any faster methods?
from sentence_transformers import SentenceTransformer

# Load the model (slow step: weights are read and initialized here).
model = SentenceTransformer("aari1995/German_Semantic_V3b", trust_remote_code=True)

sentences = [
    "Ein Mann übt Boxen",
    "Ein Affe praktiziert Kampfsportarten.",
    "Eine Person faltet ein Blatt Papier.",
    "Eine Frau geht mit ihrem Hund spazieren.",
]

# Encode the sentences and compute the pairwise similarity matrix.
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)

[4, 4]

Hi, if you keep the model in memory and do not run "model = SentenceTransformer(...)" every time you encode something, it will be a lot faster. Or what do you mean?
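A minimal sketch of that load-once pattern, reusing the model name from the snippet above; the embed helper is just an illustrative wrapper, not part of the sentence-transformers API:

from sentence_transformers import SentenceTransformer

# Load the model a single time at startup (this is the expensive step)
# and keep the object in memory for the lifetime of the process.
model = SentenceTransformer("aari1995/German_Semantic_V3b", trust_remote_code=True)

def embed(texts):
    # Reuse the already-loaded model; only the encoding runs per call.
    return model.encode(texts)

# Repeated calls now skip model initialization entirely.
emb_a = embed(["Ein Mann übt Boxen"])
emb_b = embed(["Ein Affe praktiziert Kampfsportarten."])
print(model.similarity(emb_a, emb_b))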

I have already downloaded the model and saved it locally, and I load it from the local path, for example: model = SentenceTransformer("text_model/German_Semantic_V3b", trust_remote_code=True). However, extracting text features is still very slow, mainly because loading the model takes a long time. Why is this happening?

If I don't call model = SentenceTransformer(...), how else can I obtain text embeddings? Could you provide a code example?
Also, do you have a code repository on GitHub?
