Apply for community grant: Academic project (gpu and storage)

#2
by han1997 - opened

We are building a demo of the model proposed in our paper "Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference". This is the first attempt to extend Mamba language models to Multi-Modal Large Language Models. Our model achieves comparable performance with only 43% of the parameters of LLaVA-7B and generates answers four times faster than models with the same number of parameters. Here are some useful URLs where you can learn more about our work:

Demo: https://huggingface.co/spaces/han1997/cobra
Project page: https://sites.google.com/view/cobravlm
Weight: https://huggingface.co/han1997/cobra
Github: https://github.com/h-zhao1997/cobra
ArXiv: https://arxiv.org/abs/2403.14520

I am currently shouldering the expenses associated with running this online demo personally. I want to initiate this application to inquire whether you might be interested in providing some free or discounted quota to help us promote our work more effectively and enhance Huggingface's influence among its users.

Here is one of the VQA results of our model:
cobra_demo.png

Hi @han1997, we have assigned a GPU to this Space. Note that GPU grants are provided temporarily and might be removed after some time if the usage is very low.

To learn more about GPUs in Spaces, please check out https://huggingface.co/docs/hub/spaces-gpus

BTW, I'm wondering if the persistent storage is used in this Space. As 50GB of non-persistent storage is available for Spaces, I don't think this Space needs it. (docs)
I removed it but it seems to be working fine, so hope you don't mind. (FYI, to give a grant for the persistent storage, I needed to delete it first anyway.)
Let us know if your Space actually needs the persistent storage.

@han1997 I just sent you an invitation to join the ZeroGPU explorers org. We recently started to use ZeroGPU as the default hardware for grants, so it would be nice if you could check out the compatibility and usage section of the org card and see if your Space can run on ZeroGPU.
You can duplicate your Space privately and assign ZeroGPU to test it. Once you have confirmed that your Space can run on it, you can update this Space to use ZeroGPU and delete the private duplicate.

Looks like your Space is installing mamba-ssm and causal-conv1d at startup. https://huggingface.co/spaces/han1997/cobra/blob/d5b514bb04b26e47b93f716ba6dafdd5a7a11d59/app.py#L21-L22
I'm not sure if they are exactly the same packages, but in the case of the VideoMamba Space, I built a wheel in my local environment with CUDA using Docker. https://huggingface.co/spaces/OpenGVLab/VideoMamba/discussions/2
If they are exactly the same packages, I guess you can just use those pre-built wheels for your Space, and if not, I think you can build wheels for your packages in a similar way.

Thank you very much @hysts ! It seems we don't really need persistent storage, so removing it shouldn't be a problem. As for the issues with zero-GPU and building wheels, we need to do some further investigation. Thanks again for the grant you've provided us!

han1997 changed discussion status to closed
han1997 changed discussion status to open
han1997 changed discussion status to closed

@hysts Hi! Unfortunately, we are currently unable to run the demo using ZeroGPU. The Mamba wheels install correctly, but timm seems to be throwing an error, which is strange because there are no issues on a T4.

han1997 changed discussion status to open

@han1997 Thanks for testing ZeroGPU! Regarding the timm issue, is it a JIT-related error? If so, maybe doing the following will fix it.

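import torch

# Replace torch.jit.script with a no-op decorator before timm is imported,
# so timm's @torch.jit.script functions are left as plain Python functions.
torch.jit.script = lambda f: f
import timm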

Recently, there was a similar issue related to ZeroGPU in another Space, and our infra team suggested this and it worked for them.
CUDA is not available outside of functions decorated with @spaces.GPU, so JIT compilation cannot be used there. However, timm has some functions decorated with @torch.jit.script, so simply importing timm raises an error. Because the above code replaces torch.jit.script with a function that does nothing before timm is imported, the error is avoided.
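
If you would rather not leave torch.jit.script patched for the whole process, a scoped variant of the same workaround might look like the sketch below. This is not what the other Space used. The helper names noop_decorator and import_without_jit are hypothetical, and the sketch assumes torch and timm are installed in the Space.

```python
import contextlib
import importlib


def _identity(fn):
    """No-op decorator: return the function unchanged."""
    return fn


@contextlib.contextmanager
def noop_decorator(owner, name):
    """Temporarily replace owner.<name> with a no-op decorator (hypothetical helper)."""
    original = getattr(owner, name)
    setattr(owner, name, _identity)
    try:
        yield
    finally:
        # Restore the real decorator even if the import inside the block fails.
        setattr(owner, name, original)


def import_without_jit(module_name):
    """Import a module while torch.jit.script does nothing (hypothetical helper)."""
    import torch  # imported lazily so the helpers above do not require torch

    with noop_decorator(torch.jit, "script"):
        return importlib.import_module(module_name)


# Usage in app.py (assumes torch and timm are installed):
# timm = import_without_jit("timm")
```

One caveat with this variant: any timm submodule that is first imported after the with block exits will see the real torch.jit.script again, so the permanent one-line patch above is the simpler choice if that turns out to matter.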

@hysts Cool! That fixes the problem, thanks a lot!

han1997 changed discussion status to closed
