Spaces:
Update dataset with title/texts & 1m -> 41m in markdown
app.py
CHANGED
```diff
@@ -9,7 +9,7 @@ import faiss
 from usearch.index import Index
 
 # Load titles and texts
-title_text_dataset = load_dataset("mixedbread-ai/wikipedia-
+title_text_dataset = load_dataset("mixedbread-ai/wikipedia-data-en-2023-11", split="train", num_proc=4).select_columns(["title", "text"])
 
 # Load the int8 and binary indices. Int8 is loaded as a view to save memory, as we never actually perform search with it.
 int8_view = Index.restore("wikipedia_int8_usearch_50m.index", view=True)
@@ -75,15 +75,15 @@ with gr.Blocks(title="Quantized Retrieval") as demo:
 gr.Markdown(
 """
 ## Quantized Retrieval - Binary Search with Scalar (int8) Rescoring
-This demo showcases
+This demo showcases exact retrieval using [quantized embeddings](https://huggingface.co/blog/embedding-quantization). The corpus consists of 41 million texts from Wikipedia articles.
 
 <details><summary>Click to learn about the retrieval process</summary>
 
 Details:
 1. The query is embedded using the [`mixedbread-ai/mxbai-embed-large-v1`](https://huggingface.co/mixedbread-ai/mxbai-embed-large-v1) SentenceTransformer model.
 2. The query is quantized to binary using the `quantize_embeddings` function from the SentenceTransformers library.
-3. A binary index (
-4. The top 40 documents are loaded on the fly from an int8 index on disk (
+3. A binary index (41M binary embeddings; 5.2GB of memory/disk space) is searched using the quantized query for the top 40 documents.
+4. The top 40 documents are loaded on the fly from an int8 index on disk (41M int8 embeddings; 0 bytes of memory, 47.5GB of disk space).
 5. The top 40 documents are rescored using the float32 query and the int8 embeddings to get the top 10 documents.
 6. The top 10 documents are sorted by score and displayed.
```
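The two-phase retrieval that the Markdown in this diff describes (binary search for candidates, then int8 rescoring) can be sketched with plain NumPy. This is a toy stand-in, not the Space's actual code: the 64-dim random "corpus", the sign-based binary quantization, and the per-dimension int8 scaling are illustrative assumptions; the real demo uses the `quantize_embeddings` function from SentenceTransformers plus faiss/usearch indices over 41M 1024-dim embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: a float32 query and a small float32 corpus, L2-normalized
# (the real demo embeds with mxbai-embed-large-v1 instead).
dim, n_docs = 64, 1000
corpus = rng.normal(size=(n_docs, dim)).astype(np.float32)
corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)
query = rng.normal(size=dim).astype(np.float32)
query /= np.linalg.norm(query)

def to_binary(emb: np.ndarray) -> np.ndarray:
    """Binary quantization: keep only the sign of each dimension, packed 8 dims per byte."""
    return np.packbits(emb > 0, axis=-1)

bin_corpus = to_binary(corpus)  # shape (n_docs, dim // 8), dtype uint8
bin_query = to_binary(query)    # shape (dim // 8,)

# Phase 1: Hamming distance between the binary query and every binary document,
# then take the 40 closest candidates.
hamming = np.unpackbits(bin_corpus ^ bin_query, axis=-1).sum(axis=1)
top40 = np.argsort(hamming)[:40]

# int8 quantization: naive per-dimension linear scaling into [-127, 127]
# (real calibration would use ranges estimated from a corpus sample).
scale = np.abs(corpus).max(axis=0) / 127.0
int8_corpus = np.round(corpus / scale).astype(np.int8)

# Phase 2: rescore only the 40 candidates with the float32 query against
# their int8 embeddings, and keep the 10 best by score.
scores = int8_corpus[top40].astype(np.float32) @ query
top10 = top40[np.argsort(-scores)[:10]]
```

The point of the split is that phase 1 touches a compact in-memory binary index, while the int8 vectors can stay on disk (hence `Index.restore(..., view=True)` in the diff) and only 40 of them are ever read per query.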