---
license: apache-2.0
tags:
- unsloth
- query-expansion
datasets:
- s-emanuilov/query-expansion
base_model:
- Qwen/Qwen2.5-7B-Instruct
---

# Query Expansion Model - based on Qwen2.5-7B

Fine-tuned Qwen2.5-7B model for generating search query expansions.
Part of a collection of query expansion models available in different architectures and sizes.

## Overview

**Task:** Search query expansion  
**Base model:** [Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct)  
**Training data:** [Query Expansion Dataset](https://huggingface.co/datasets/s-emanuilov/query-expansion)

<img src="static/query-expansion-model.jpg" alt="Query Expansion Model" width="600px" />

## Variants

### Fine-tuned models

- [Qwen2.5-3B](https://huggingface.co/s-emanuilov/query-expansion-Qwen2.5-3B)
- [Llama-3.2-3B](https://huggingface.co/s-emanuilov/query-expansion-Llama-3.2-3B)

### GGUF variants

- [Qwen2.5-3B-GGUF](https://huggingface.co/s-emanuilov/query-expansion-Qwen2.5-3B-GGUF)
- [Qwen2.5-7B-GGUF](https://huggingface.co/s-emanuilov/query-expansion-Qwen2.5-7B-GGUF)
- [Llama-3.2-3B-GGUF](https://huggingface.co/s-emanuilov/query-expansion-Llama-3.2-3B-GGUF)

Each GGUF model is available in several quantization formats: F16, Q8_0, Q5_K_M, Q4_K_M, and Q3_K_M. A minimal loading example is sketched below.

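For running a GGUF variant locally, here is a minimal sketch using `llama-cpp-python`. The `filename` glob pattern and the `n_ctx` value are assumptions; check the repo's file list for the exact quant filenames.

```python
# Sketch: load a GGUF quant with llama-cpp-python (pip install llama-cpp-python).
# The .gguf filename pattern below is an assumption - verify it against the repo.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="s-emanuilov/query-expansion-Qwen2.5-7B-GGUF",
    filename="*Q4_K_M.gguf",  # glob pattern selecting the Q4_K_M quant
    n_ctx=2048,
)

# Reuse the prompt format from the Usage section below
prompt = (
    "Below is a search query. Generate relevant expansions and related terms "
    "that would help broaden and enhance the search results.\n\n"
    "### Query:\napple stock\n\n### Expansions:\n"
)
result = llm(prompt, max_tokens=128)
print(result["choices"][0]["text"])
```
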
## Details

This model is designed to enhance search and retrieval systems by generating semantically relevant query expansions.

It could be useful for:

- Advanced RAG systems
- Search enhancement
- Query preprocessing (see the sketch after this list)
- Low-latency query expansion

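For the query-preprocessing case, one simple pattern is to OR the generated expansions into a boolean keyword query. A minimal sketch; the `build_search_query` helper is illustrative, and the hard-coded expansions stand in for real model output:

```python
# Sketch: fold model-generated expansions into a single boolean search query.
def build_search_query(query: str, expansions: list[str]) -> str:
    """Combine the original query with its expansions into one OR query."""
    terms = [query] + expansions
    return " OR ".join(f'"{t}"' for t in terms)

# In practice these would come from the model (see Usage below)
expansions = ["current apple share value", "latest updates on apple's market position"]
print(build_search_query("apple stock", expansions))
# "apple stock" OR "current apple share value" OR "latest updates on apple's market position"
```
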
## Usage

```python
import torch
from unsloth import FastLanguageModel
from transformers import TextStreamer

# Model configuration
MODEL_NAME = "s-emanuilov/query-expansion-Qwen2.5-7B"
MAX_SEQ_LENGTH = 2048
DTYPE = torch.float16  # pass a torch dtype, not a string
LOAD_IN_4BIT = True

# Load model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=MODEL_NAME,
    max_seq_length=MAX_SEQ_LENGTH,
    dtype=DTYPE,
    load_in_4bit=LOAD_IN_4BIT,
)

# Enable faster inference
FastLanguageModel.for_inference(model)

# Prompt template used for fine-tuning
PROMPT_TEMPLATE = """Below is a search query. Generate relevant expansions and related terms that would help broaden and enhance the search results.

### Query:
{query}

### Expansions:
{output}"""

# Prepare input
query = "apple stock"
inputs = tokenizer(
    [PROMPT_TEMPLATE.format(query=query, output="")],
    return_tensors="pt",
).to("cuda")

# Generate with streaming output
streamer = TextStreamer(tokenizer)
output = model.generate(
    **inputs,
    streamer=streamer,
    max_new_tokens=128,
)
```

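For non-interactive use you will usually want the expansions as a list rather than streamed text. A heuristic sketch, assuming the model echoes the prompt and separates expansions with commas; adjust the parsing to the format your outputs actually show:

```python
# Sketch: decode the generation and pull out the expansion text.
# Splitting on the "### Expansions:" marker and on commas is a heuristic.
decoded = tokenizer.batch_decode(output, skip_special_tokens=True)[0]
expansions_text = decoded.split("### Expansions:")[-1].strip()
expansions = [e.strip() for e in expansions_text.split(",") if e.strip()]
print(expansions)
```
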
## Example

**Input:** "apple stock"

**Expansions:**

- "current apple share value"
- "latest updates on apple's market position"
- "how is apple performing in the current market?"
- "what is the latest information on apple's financial standing?"

## Citation

If you find my work helpful, feel free to cite it.