Model not running on CPU due to flash_attn package requirement
I am trying to import the prosparse-llama-2-7b model on an ARM CPU machine (gr3 instance).
The model requires flash_attn, and attempting to install flash_attn on this machine raises an nvcc error.
Other SparseLLM models, such as SparseLLM/ReluLLaMA-7B and https://huggingface.co/SparseLLM/ProSparse-MiniCPM-1B-sft, load and execute on CPUs without problems; the issue is specific to this model and its larger variant, i.e. prosparse-llama-2-13b.
Could you please look into this? I think we need to get rid of the hard flash_attn dependency, because otherwise the model won't be able to execute on CPUs.
The problem seems strange. I've tried to load the model on a CPU machine with model = AutoModelForCausalLM.from_pretrained("SparseLLM/prosparse-llama-2-7b", torch_dtype=torch.bfloat16, trust_remote_code=True) and it succeeded. Generally, if the machine has no GPU or flash-attn is not installed, transformers.utils.is_flash_attn_2_available() should return False, so flash_attn will not be required. You may check line 47 in modeling_sparsellama.py.
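For reference, here is a minimal sketch of the CPU-only loading path described above; the model ID, dtype, and trust_remote_code flag come from the snippet in this reply, while the tokenizer usage, prompt, and generation arguments are illustrative assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SparseLLM/prosparse-llama-2-7b"

# trust_remote_code is required because the repository ships its own
# modeling code (modeling_sparsellama.py).
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)

# Quick CPU smoke test (prompt and generation length are arbitrary).
inputs = tokenizer("Hello, my name is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```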
Therefore, you may check the return value of is_flash_attn_2_available(). The flash-attn package and GPUs are not necessary to load these models on CPU machines.
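A quick way to verify this is to query the helper directly (a one-line check, assuming a transformers version that exposes it under transformers.utils):

```python
from transformers.utils import is_flash_attn_2_available

# On a CPU-only machine, or one without flash-attn installed, this should
# print False, in which case the custom modeling code never imports flash_attn.
print(is_flash_attn_2_available())
```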
From your screenshots, the problem seems to lie in the import phase of the transformers package. My transformers version is 4.43.3. If changing the version does not solve the problem, I suggest diving into the source code that raises the exception to fix it.
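A simple sketch for checking the locally installed version (the 4.43.3 reference is taken from this reply; the upgrade command in the comment is only a suggestion):

```python
import transformers

# This reply reports success with transformers 4.43.3. If your installed
# version differs, upgrading (e.g. pip install -U "transformers==4.43.3")
# may resolve the import-time exception.
print(transformers.__version__)
```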
Thanks, I updated the transformers version and it seems to work!