nomic-ai/modernbert-embed-base

Sentence Similarity

sentence-transformers

ONNX

zpn

Xenova (HF staff) committed 3 days ago

Commit bb0033c (1 parent: 5960f15)

Improve Transformers.js code snippet (#6)

- Improve Transformers.js code snippet (a6809227ef568c6f8a14a2226ba18e4fa42c776f)
- Upload README.md (0bab91d33ed6f45be1e858d2b0a15e50478cccd4)

Co-authored-by: Joshua <[email protected]>

Files changed (1)

README.md +27 -26

README.md CHANGED

@@ -5,6 +5,7 @@ tags:
 - feature-extraction
 - sentence-similarity
 - mteb
 model-index:
 - name: binarize_False
   results:
@@ -3083,38 +3084,38 @@ Note the small differences compared to the full 768-dimensional similarities.
 ### Transformers.js
-```javascript
-import { pipeline } from '@xenova/transformers';
-// Create a feature extraction pipeline
-const extractor = await pipeline('feature-extraction', 'nomic-ai/modernbert-embed-base', {
-    quantized: false, // Comment out this line to use the quantized version
-});
-// Compute sentence embeddings
-const texts = ['search_query: What is TSNE?', 'search_query: Who is Laurens van der Maaten?'];
-const embeddings = await extractor(texts, { pooling: 'mean', normalize: true });
-console.log(embeddings);
 ```
-<details><summary>Click to see Transformers.js usage with different quantizations</summary>
 ```javascript
-import { pipeline } from '@xenova/transformers';
 // Create a feature extraction pipeline
-const extractor = await pipeline('feature-extraction', 'nomic-ai/modernbert-embed-base', {
-    dtype: 'q4f16',
-});
-// Compute sentence embeddings
-const texts = ['search_query: What is TSNE?', 'search_query: Who is Laurens van der Maaten?'];
-const embeddings = await extractor(texts, { pooling: 'mean', normalize: true });
-console.log(embeddings);
-```
-</details>
 ## Training
@@ -3152,4 +3153,4 @@ If you find the model, dataset, or training code useful, please cite our work
       archivePrefix={arXiv},
       primaryClass={cs.CL}
 }
-```

 - feature-extraction
 - sentence-similarity
 - mteb
+- transformers.js
 model-index:
 - name: binarize_False
   results:
 ### Transformers.js
+If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
+```bash
+npm i @huggingface/transformers
 ```
+Then, you can compute embeddings as follows:
 ```javascript
+import { pipeline, matmul } from '@huggingface/transformers';
 // Create a feature extraction pipeline
+const extractor = await pipeline(
+  "feature-extraction",
+  "nomic-ai/modernbert-embed-base",
+  { dtype: "fp32" }, // Supported options: "fp32", "fp16", "q8", "q4", "q4f16"
+);
+// Embed queries and documents
+const query_embeddings = await extractor([
+    "search_query: What is TSNE?",
+    "search_query: Who is Laurens van der Maaten?",
+  ], { pooling: "mean", normalize: true },
+);
+const doc_embeddings = await extractor([
+    "search_document: TSNE is a dimensionality reduction algorithm created by Laurens van Der Maaten",
+  ], { pooling: "mean", normalize: true },
+);
+// Compute similarity scores
+const similarities = await matmul(query_embeddings, doc_embeddings.transpose(1, 0));
+console.log(similarities.tolist()); // [[0.721383273601532], [0.3259955644607544]]
+```
 ## Training
       archivePrefix={arXiv},
       primaryClass={cs.CL}
 }
+```
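
The updated snippet scores queries against documents by multiplying the normalized query embeddings with the transposed document embeddings. Because `normalize: true` scales each embedding to unit length, each entry of that product is a cosine similarity. The sketch below illustrates the same computation in plain JavaScript on toy 2-dimensional vectors. It is not part of the commit, the helper names (`l2Normalize`, `dot`, `similarityMatrix`) are hypothetical, and the toy vectors are made up rather than real model outputs.

```javascript
// Hypothetical helpers for illustration only; not part of Transformers.js.

// Scale a vector to unit length (L2 normalization).
function l2Normalize(vec) {
  const norm = Math.sqrt(vec.reduce((sum, x) => sum + x * x, 0));
  return vec.map((x) => x / norm);
}

// Dot product of two equal-length vectors.
function dot(a, b) {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

// Build a similarity matrix with one row per query and one column per document.
// With unit-length inputs, each dot product equals the cosine similarity.
function similarityMatrix(queries, docs) {
  const q = queries.map(l2Normalize);
  const d = docs.map(l2Normalize);
  return q.map((qv) => d.map((dv) => dot(qv, dv)));
}

// Toy vectors standing in for real embeddings (assumed values).
const toyQueries = [[3, 4], [1, 0]];
const toyDocs = [[6, 8]];
const toySimilarities = similarityMatrix(toyQueries, toyDocs);
console.log(toySimilarities);
```

The result has the same shape as the `similarities.tolist()` output in the snippet: two rows (one per query) and one column (the single document). The first toy query points in the same direction as the document, so its score is close to 1, while the second scores lower.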