---
tags:
- mamba2
license: mit
---

# mamba2-1.3b-av

## Introduction
This is a mirror of [mamba2-1.3b](https://huggingface.co/state-spaces/mamba2-1.3b) that is compatible with [mamba2-torch](https://github.com/vasqu/mamba2-torch), a Hugging Face compatible Mamba2 library that does not depend on the CUDA wheels of the [original mamba repo](https://github.com/state-spaces/mamba).

Credit goes to the original authors of [Mamba2](https://arxiv.org/abs/2405.21060) and the [transformers](https://github.com/huggingface/transformers) library by Hugging Face. Without their work, this would not be possible.

NOTE: `mamba2-torch` offers several optimisation paths:
- Triton kernels and [causal-conv1d](https://github.com/Dao-AILab/causal-conv1d) ("fastest")
- Triton kernels only (default)
- Pure PyTorch

## How to Get Started with the Model
You can follow the instructions in the [mamba2-torch repo](https://github.com/vasqu/mamba2-torch) for a more detailed explanation.

First, install the mamba2-torch library:
```bash
git clone https://github.com/vasqu/mamba2-torch.git
cd mamba2-torch
pip install .
```

Then download this repository via git lfs and load the files locally (after installing mamba2-torch):
```python
from transformers import AutoTokenizer
from mamba2_torch import Mamba2Model, Mamba2ForCausalLM, Mamba2Config

device = "cuda"
# set this to the local directory that holds the downloaded files of this repository
mamba2_hf_path = ""

# load weights and tokenizer from disk only, without contacting the Hub
model = Mamba2ForCausalLM.from_pretrained(mamba2_hf_path, local_files_only=True).to(device)
tokenizer = AutoTokenizer.from_pretrained(mamba2_hf_path, local_files_only=True)

input_ids = tokenizer("Hey how are you doing?", return_tensors="pt")["input_ids"].to(device)

# expected output (1.3b): `["Hey how are you doing? I'm doing good. \nI'm doing good."]`
out = model.generate(input_ids, max_new_tokens=10)
print(tokenizer.batch_decode(out))
```

## Citation

**BibTeX:**

```bibtex
@inproceedings{mamba2,
  title={Transformers are {SSM}s: Generalized Models and Efficient Algorithms Through Structured State Space Duality},
  author={Dao, Tri and Gu, Albert},
  booktitle={International Conference on Machine Learning (ICML)},
  year={2024}
}
```
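
## Checking the Local Download
Loading from a local git lfs checkout can fail if the large files were never fetched. In that case they remain small text stubs whose first line is `version https://git-lfs.github.com/spec/v1`. The following is a minimal sketch, not part of mamba2-torch, that lists such stubs using only the standard library. The helper name `find_lfs_pointers` is hypothetical and introduced here for illustration.

```python
from pathlib import Path

# First line of a Git LFS pointer file (the stub left behind when the
# real file content has not been fetched yet).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(repo_dir):
    """Return the files under repo_dir that are still Git LFS pointer stubs.

    Hypothetical helper for illustration; it is not provided by mamba2-torch.
    """
    root = Path(repo_dir)
    pointers = []
    for path in sorted(root.rglob("*")):
        # Skip directories and git's own metadata folder.
        if not path.is_file() or ".git" in path.relative_to(root).parts:
            continue
        # Only the first few bytes are needed to recognise a pointer stub.
        with path.open("rb") as fh:
            head = fh.read(len(LFS_POINTER_PREFIX))
        if head == LFS_POINTER_PREFIX:
            pointers.append(path)
    return pointers
```

If `find_lfs_pointers(mamba2_hf_path)` returns a non-empty list, running `git lfs pull` inside the checkout should replace the stubs with the actual files before calling `from_pretrained`.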