Apply for community grant: Academic project (GPU and storage)
We are building a demo of the model proposed in our paper "Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference". This is the first attempt to extend Mamba language models to multi-modal large language models. Our model achieves performance comparable to LLaVA-7B with only 43% of its parameters and generates answers four times faster than models of the same size. Here are some useful links to learn more about our work:
Demo: https://huggingface.co/spaces/han1997/cobra
Project page: https://sites.google.com/view/cobravlm
Weights: https://huggingface.co/han1997/cobra
Github: https://github.com/h-zhao1997/cobra
ArXiv: https://arxiv.org/abs/2403.14520
I am currently covering the costs of running this online demo personally. I would like to ask whether you would be interested in providing some free or discounted quota, to help us promote our work more effectively and extend Hugging Face's reach among its users.
Hi @han1997 , we have assigned a GPU to this Space. Note that GPU grants are provided temporarily and might be removed after some time if usage is very low.
To learn more about GPUs in Spaces, please check out https://huggingface.co/docs/hub/spaces-gpus
BTW, I'm wondering whether persistent storage is actually used in this Space. Since 50GB of non-persistent storage is available to Spaces, I don't think this Space needs it. (docs)
I removed it, but the Space seems to be working fine, so I hope you don't mind. (FYI, to grant persistent storage, I needed to delete the existing storage first anyway.)
Let us know if your Space actually needs persistent storage.
@han1997
I just sent you an invitation to join the ZeroGPU explorers org. We recently started using ZeroGPU as the default hardware for grants, so it would be great if you could check out the compatibility and usage sections of the org card and see whether your Space can run on ZeroGPU.
You can duplicate your Space privately and assign ZeroGPU to test it; once you've confirmed that your Space runs on ZeroGPU, you can update this Space to use it and delete the private duplicate.
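By the way, if it's more convenient, the private duplicate can also be created programmatically with `huggingface_hub`. A minimal sketch, assuming the `zero-a10g` hardware flavor for ZeroGPU and a made-up name for the copy:

```python
from huggingface_hub import duplicate_space

# Create a private copy of the Space and assign ZeroGPU hardware to it.
# "han1997/cobra-zerogpu-test" is just an example name for the duplicate.
repo_url = duplicate_space(
    "han1997/cobra",
    to_id="han1997/cobra-zerogpu-test",
    private=True,
    hardware="zero-a10g",
)
print(repo_url)
```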
Looks like your Space is installing `mamba-ssm` and `causal-conv1d` at startup: https://huggingface.co/spaces/han1997/cobra/blob/d5b514bb04b26e47b93f716ba6dafdd5a7a11d59/app.py#L21-L22
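I haven't checked those exact lines, but a startup install like that usually looks roughly like this sketch (not the Space's actual code):

```python
import importlib.util
import subprocess
import sys

# Install the CUDA-extension packages only if they are missing.
# Building them from source at startup is what makes this slow.
for pkg, module in [("mamba-ssm", "mamba_ssm"), ("causal-conv1d", "causal_conv1d")]:
    if importlib.util.find_spec(module) is None:
        subprocess.run([sys.executable, "-m", "pip", "install", pkg], check=True)
```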
I'm not sure if they are exactly the same packages, but in the case of the VideoMamba Space, I built a wheel in my local environment with CUDA using Docker: https://huggingface.co/spaces/OpenGVLab/VideoMamba/discussions/2
If they are exactly the same packages, I guess you can just use those pre-built wheels for your Space; if not, I think you can build wheels for your packages in a similar way.
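If it helps, swapping the package names for wheel URLs in the startup install would look roughly like this; the URLs below are made-up placeholders for wheels you would build and host yourself:

```python
import subprocess
import sys

# Hypothetical URLs; a real setup would point at wheels you built and uploaded.
MAMBA_SSM_WHEEL = "https://example.com/wheels/mamba_ssm-1.2.0-cp310-cp310-linux_x86_64.whl"
CAUSAL_CONV1D_WHEEL = "https://example.com/wheels/causal_conv1d-1.2.0-cp310-cp310-linux_x86_64.whl"

# Installing pre-built wheels skips compilation, so the Space starts much faster.
subprocess.run(
    [sys.executable, "-m", "pip", "install", MAMBA_SSM_WHEEL, CAUSAL_CONV1D_WHEEL],
    check=True,
)
```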
@han1997
Thanks for testing ZeroGPU! Regarding the `timm` issue, is it a `jit`-related error? If so, maybe doing the following fixes it:
```python
import torch

# Make torch.jit.script a no-op so that importing timm does not trigger JIT compilation
torch.jit.script = lambda f: f
import timm
```
Recently, there was a similar ZeroGPU-related issue in another Space; our infra team suggested this fix, and it worked for them.
CUDA is not available outside of functions decorated with `@spaces.GPU`, so we cannot use JIT compilation. But `timm` has some functions decorated with `@torch.jit.script`, so simply importing `timm` raises an error. The above code replaces `torch.jit.script` with a function that does nothing before `timm` is imported, so we can avoid the error.
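For context, here is a minimal sketch of how the pieces fit together in a ZeroGPU app: the patch runs before any `timm` import, and all CUDA work stays inside the decorated function. The model and function below are illustrative placeholders, not Cobra's actual code:

```python
import torch

# Neutralize torch.jit.script before timm is imported, so its
# @torch.jit.script-decorated functions stay plain Python.
torch.jit.script = lambda f: f

import spaces
import timm


@spaces.GPU  # CUDA is only available inside functions decorated like this
def classify(image_tensor):
    # Moving the model and input to CUDA is safe here, inside the decorated function.
    model = timm.create_model("resnet18", pretrained=True).cuda().eval()
    with torch.no_grad():
        return model(image_tensor.cuda())
```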