---
language:
- en
- sw
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- gemma
- trl
base_model: unsloth/gemma-7b-bnb-4bit
---

# Uploaded model

- **Developed by:** Mollel
- **License:** apache-2.0
- **Finetuned from model:** unsloth/gemma-7b-bnb-4bit

This Gemma model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

# Inference with Hugging Face Transformers

```python
!pip install transformers peft accelerate bitsandbytes

from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Load the LoRA adapter together with its base model
model = AutoPeftModelForCausalLM.from_pretrained(
    "Mollel/Gemma_Swahili_Mollel_1_epoch",
    load_in_4bit = False,
    device_map = "auto",  # place the model on the GPU so it matches the inputs below
)
tokenizer = AutoTokenizer.from_pretrained("Mollel/Gemma_Swahili_Mollel_1_epoch")

# Alpaca-style prompt template used during fine-tuning
input_prompt = """### Instruction:
{}

### Input:
{}

### Response:
{}"""

input_text = input_prompt.format(
    "Andika aya fupi kuhusu mada iliyotolewa.",  # instruction: "Write a short paragraph about the given topic."
    "Umuhimu wa kutumia nishati inayoweza kurejeshwa",  # input: "The importance of using renewable energy"
    "",  # output - leave this blank for generation!
)

inputs = tokenizer([input_text], return_tensors = "pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens = 300, use_cache = True)
response = tokenizer.batch_decode(outputs)[0]
print(response)
```
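
If GPU memory is tight, the same checkpoint can instead be loaded with its base model quantized to 4-bit via bitsandbytes. This is a minimal sketch, assuming a CUDA GPU and the packages installed above; it simply flips the `load_in_4bit` flag already shown in the example:

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Load the adapter with its base model quantized to 4-bit (requires bitsandbytes)
model = AutoPeftModelForCausalLM.from_pretrained(
    "Mollel/Gemma_Swahili_Mollel_1_epoch",
    load_in_4bit = True,
    device_map = "auto",  # place weights on the available GPU(s)
)
tokenizer = AutoTokenizer.from_pretrained("Mollel/Gemma_Swahili_Mollel_1_epoch")
```

Prompt formatting and generation then proceed exactly as in the example above.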