---
language:
  - en
tags:
  - text-to-sql
  - mistral
  - gguf
  - sql-generation
  - cpu-inference
pipeline_tag: text-generation
license: apache-2.0
---

# Mistral-7B SQL GGUF

A GGUF-quantized version of Mistral-7B fine-tuned for SQL query generation. Optimized for CPU inference with clean SQL outputs.

## Model Details

- **Base Model:** Mistral-7B-Instruct-v0.3
- **Quantization:** Q8_0
- **Context Length:** 32768 tokens (inherited from the base model)
- **Format:** GGUF (V3, latest)
- **Size:** 7.17 GB
- **Parameters:** 7.25B
- **Architecture:** Llama
- **Use Case:** Text-to-SQL conversion
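
The quantization, context length, and architecture listed above are recorded in the GGUF header itself. A minimal sketch for checking them, assuming the `gguf` package from the llama.cpp project (`pip install gguf`) and a locally downloaded copy of the file:

```python
from gguf import GGUFReader  # from the llama.cpp project: pip install gguf

# Path assumed: wherever hf_hub_download placed the file (see Usage below)
reader = GGUFReader("mistral_sql_q4.gguf")

# List the metadata keys stored in the header, e.g.
# general.architecture, llama.context_length, general.file_type
for name in reader.fields:
    print(name)

print(f"{len(reader.tensors)} weight tensors")
```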

## Usage

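The example below assumes `llama-cpp-python` and `huggingface_hub` are installed (for example, `pip install llama-cpp-python huggingface_hub`).
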
```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Download the quantized weights from the Hub
model_path = hf_hub_download(
    repo_id="tharun66/mistral-sql-gguf",
    filename="mistral_sql_q4.gguf"
)

# Initialize the model; n_ctx is kept small to limit RAM use --
# raise it (up to 32768) for longer questions or schema-heavy prompts
llm = Llama(
    model_path=model_path,
    n_ctx=512,
    n_threads=4,
    verbose=False
)

def generate_sql(question):
    # Prompt template used for SQL generation
    prompt = f"""### Task: Convert to SQL
### Question: {question}
### SQL:"""

    response = llm(
        prompt,
        max_tokens=128,
        temperature=0.7,  # lower this for more deterministic SQL
        stop=["###"],     # halt at the next prompt section
        echo=False
    )

    return response['choices'][0]['text'].strip()

# Example
question = "Show all active users"
sql = generate_sql(question)
print(sql)
# Output: SELECT * FROM users WHERE status = 'active'
```
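
Since `generate_sql` just wraps a single prompt template, it can be called in a loop. The questions below are illustrative only; the table and column names in the generated SQL depend on what the model infers unless the schema is included in the question text:

```python
# Illustrative only: reuse generate_sql for a batch of questions
questions = [
    "Count the orders placed in 2023",
    "List the ten most recent signups",
]
for q in questions:
    print(f"-- {q}")
    print(generate_sql(q))
```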