Ready-to-use Mistral-7B-Instruct-v0.1-GGUF model as an OpenAI API-compatible endpoint

#2
by limcheekin - opened

Hi there,

I deployed the model as an OpenAI API-compatible endpoint at https://huggingface.co/spaces/limcheekin/Mistral-7B-Instruct-v0.1-GGUF.

Also, I created a Jupyter notebook to help you get started with the API endpoint in no time.

Lastly, if you find this resource valuable, your support in the form of starring the space would be greatly appreciated.

Thank you.

Hi, thank you for your work. I tried the embeddings endpoint and got an error.
Query:

curl -X 'POST' \
  'https://limcheekin-mistral-7b-instruct-v0-1-gguf.hf.space/v1/embeddings' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "input": "The food was delicious and the waiter..."
}'

Error response:

{
  "error": {
    "message": "Llama model must be created with embedding=True to call this method",
    "type": "internal_server_error",
    "param": null,
    "code": null
  }
}
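
For anyone calling the endpoint from Python rather than curl, here is a minimal sketch of the same request using only the standard library. The URL, headers, and payload are copied from the curl command above; the helper names (`build_embeddings_request`, `error_message`, `fetch_embeddings`) are made up for illustration, and the sketch assumes error responses always carry a JSON body shaped like the one shown.

```python
import json
import urllib.error
import urllib.request
from typing import Optional

# Base URL taken from the curl command above.
BASE_URL = "https://limcheekin-mistral-7b-instruct-v0-1-gguf.hf.space/v1"


def build_embeddings_request(text: str) -> urllib.request.Request:
    """Build the same POST request as the curl command (nothing is sent yet)."""
    body = json.dumps({"input": text}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/embeddings",
        data=body,
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def error_message(payload: dict) -> Optional[str]:
    """Return the error message if the payload has the error shape shown above."""
    error = payload.get("error")
    return error.get("message") if isinstance(error, dict) else None


def fetch_embeddings(text: str) -> dict:
    """Send the request and return the decoded JSON body, even on HTTP errors."""
    try:
        with urllib.request.urlopen(build_embeddings_request(text), timeout=30) as resp:
            return json.loads(resp.read())
    except urllib.error.HTTPError as exc:
        # Assumption: the error body is JSON, as in the response above.
        return json.loads(exc.read())
```

Calling `error_message(fetch_embeddings("The food was delicious and the waiter..."))` should return the message text when the server rejects the request, and `None` otherwise.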

Yeah, I should have stated clearly in the docs that the embeddings endpoint was disabled on purpose. When I tested it, the embeddings created by Llama models were no better than those from other open-source text embedding models such as BAAI/bge-large-en, intfloat/e5-large-v2, sentence-transformers/all-MiniLM-L6-v2, sentence-transformers/all-mpnet-base-v2, etc. Hence, I created the Python package at https://github.com/limcheekin/open-text-embeddings.

Anyway, that's just my experience from a few months ago and my current understanding. I have just enabled the embeddings endpoint, so go ahead and test it yourself. I would appreciate it if you shared the results here.

Thank you.
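
For anyone who wants to run that comparison, one simple starting point is to check whether related sentences score higher than unrelated ones under cosine similarity. The sketch below is only an illustration: it assumes the endpoint returns the OpenAI-style response shape (`data[0].embedding`), which this thread does not confirm, and the helper names are hypothetical.

```python
import math
from typing import List, Sequence


def extract_embedding(response: dict) -> List[float]:
    """Pull the first vector out of a response.

    Assumption: the response follows the OpenAI embeddings shape,
    {"data": [{"embedding": [...]}]}.
    """
    return response["data"][0]["embedding"]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Running the same sentence pairs through this endpoint and through one of the models listed above, then comparing the scores, would give a rough first impression. It is not a substitute for a proper retrieval benchmark.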
