Help needed to load model

#13
by sanjay-dev-ds-28 - opened

!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir --verbose

n_gpu_layers = 40 # Change this value based on your model and your GPU VRAM pool.
n_batch = 256 # Should be between 1 and n_ctx, consider the amount of VRAM in your GPU.

Loading model,

llm = LlamaCpp(
    model_path=model_path,
    max_tokens=256,
    n_gpu_layers=n_gpu_layers,
    n_batch=n_batch,
    callback_manager=callback_manager,
    n_ctx=1024,
    verbose=False,
)

ValidationError: 1 validation error for LlamaCpp
root
Could not load Llama model from path: /root/.cache/huggingface/hub/models--TheBloke--Llama-2-13B-chat-GGML/snapshots/47d28ef5de4f3de523c421f325a2e4e039035bab/llama-2-13b-chat.ggmlv3.q5_1.bin. Received error fileno (type=value_error)

same problem

Same problem :(

Newer versions of llama.cpp and llama-cpp-python only load GGUF files, not GGML. To keep loading a GGML file, try pinning an older llama-cpp-python release that still reads GGML:
!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip -qq install --upgrade --force-reinstall llama-cpp-python==0.1.78 --no-cache-dir
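If you are not sure which format a downloaded file really is, a quick check of its first four bytes can help. This is only a rough sketch: the function name `detect_model_format` is made up for illustration, and the magic values are my understanding of the formats (GGUF files start with the ASCII bytes "GGUF", while the older GGML-family files store their magic as a little-endian integer, so the letters show up reversed on disk).

```python
# Magic bytes as they appear at the start of the file on disk.
# Assumption: legacy GGML-family magics are little-endian uint32 values,
# so "ggml", "ggmf" and "ggjt" appear byte-reversed.
_MAGICS = {
    b"GGUF": "gguf",
    b"lmgg": "ggml-legacy",
    b"fmgg": "ggml-legacy",
    b"tjgg": "ggml-legacy",
}


def detect_model_format(path):
    """Return 'gguf', 'ggml-legacy' or 'unknown' based on the file header."""
    with open(path, "rb") as f:
        magic = f.read(4)
    return _MAGICS.get(magic, "unknown")
```

Calling `detect_model_format(model_path)` on the path returned by `hf_hub_download` should tell you whether the file matches what your installed llama-cpp-python expects.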

I will be making GGUFs for these models tonight, so they're coming very soon

@actionpace tried !CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip -qq install --upgrade --force-reinstall llama-cpp-python==0.1.78 --no-cache-dir with the same result :(

So we will have to wait for the GGUF versions :)

Have you tried my version in my repo?

yup @akarshanbiswas same result

/usr/local/lib/python3.10/dist-packages/pydantic/v1/main.py in __init__(__pydantic_self__, **data)
    339         values, fields_set, validation_error = validate_model(__pydantic_self__.__class__, data)
    340         if validation_error:
--> 341             raise validation_error
    342         try:
    343             object_setattr(__pydantic_self__, '__dict__', values)

ValidationError: 1 validation error for LlamaCpp
__root__
  Could not load Llama model from path: /root/.cache/huggingface/hub/models--akarshanbiswas--llama-2-chat-13b-gguf/snapshots/141acdcfecba05f5c0e046ee0339863fc9621004/ggml-llama-2-13b-chat-q4_k_m.gguf. Received error fileno (type=value_error)

It works correctly here.

Edit: replaced the console log with a screenshot:

image.png

Which version of llama-cpp-python are you using?
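One way to check from inside the notebook, as a minimal sketch (the helper name `installed_version` is just for illustration; it uses the standard library's `importlib.metadata`, so it works without importing the package itself):

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="llama-cpp-python"):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print("llama-cpp-python:", installed_version())
```

Note that after a `--force-reinstall` in Colab you may need to restart the runtime before the new version is the one actually imported.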

I just do

!pip install llama-cpp-python

and then

!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir --verbose

also tried with

!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip -qq install --upgrade --force-reinstall llama-cpp-python==0.1.78 --no-cache-dir

model_name_or_path = "akarshanbiswas/llama-2-chat-13b-gguf"
model_basename = "ggml-llama-2-13b-chat-q4_k_m.gguf"
model_path = hf_hub_download(repo_id=model_name_or_path, filename=model_basename)

n_gpu_layers = 40
n_batch = 256 

# Loading model,
llm = LlamaCpp(
    model_path=model_path,
    max_tokens=256,
    n_gpu_layers=n_gpu_layers,
    n_batch=n_batch,
    callback_manager=callback_manager,
    n_ctx=1024,
    verbose=False,
)

Try downloading it with a browser. Save it to a known location and pass that file path to the class.
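Before passing the path to `LlamaCpp`, it may be worth confirming that the file is really there and fully downloaded. A small sketch, assuming nothing beyond the standard library; `inspect_model_path` and the example path are hypothetical:

```python
import os


def inspect_model_path(path):
    """Return basic facts about a model file path without loading it."""
    resolved = os.path.realpath(path)  # follow symlinks, e.g. in the HF cache
    exists = os.path.isfile(resolved)
    return {
        "resolved": resolved,
        "exists": exists,
        "size_bytes": os.path.getsize(resolved) if exists else 0,
    }


# Hypothetical location; replace it with wherever you saved the file.
print(inspect_model_path("/content/llama-2-13b-chat.gguf"))
```

If `exists` is False or `size_bytes` is much smaller than the size listed on the model page, the download is the likely culprit rather than the loader.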

Same result on Colab, sorry :(

Try with:

curl -OL https://huggingface.co/TheBloke/Llama-2-13B-chat-GGML/resolve/main/llama-2-13b-chat.ggmlv3.q5_1.bin

Oh, I see, you need the GGUF version

https://huggingface.co/TheBloke/CodeLlama-13B-GGUF

I have the same problem and couldn't find any solution yet

Fix for "Could not load Llama model from path":

Download a GGUF model from this link:
https://huggingface.co/TheBloke/CodeLlama-13B-Python-GGUF

Code Example:

from huggingface_hub import hf_hub_download

model_name_or_path = "TheBloke/CodeLlama-13B-Python-GGUF"
model_basename = "codellama-13b-python.Q5_K_M.gguf"
model_path = hf_hub_download(repo_id=model_name_or_path, filename=model_basename)

Then change "verbose=False" to "verbose=True", as in the following code:

llm = LlamaCpp(
    model_path=model_path,
    max_tokens=256,
    n_gpu_layers=n_gpu_layers,
    n_batch=n_batch,
    callback_manager=callback_manager,
    n_ctx=1024,
    verbose=True,
)

Please @TheBloke , is there GGUF for 7B-Chat yet? I can't seem to find one.

Thank you, @TheBloke

Thank you. This worked for me. Any ideas why this might be the case?
