---
library_name: transformers
tags:
- math
- lora
- science
- chemistry
- biology
- code
- text-generation-inference
- unsloth
- llama
license: apache-2.0
datasets:
- HuggingFaceTB/smoltalk
language:
- en
- de
- es
- fr
- it
- pt
- hi
- th
base_model:
- meta-llama/Llama-3.2-1B-Instruct
---

![FastLlama-Logo](FastLlama.png)

This repository contains only the LoRA adapters for [FastLlama-3.2-1B-Instruct](https://huggingface.co/suayptalha/FastLlama-3.2-1B-Instruct). To use them, you must also load the base model.

You can use the ChatML and Alpaca prompt formats.

You can chat with the model via this [space](https://huggingface.co/spaces/suayptalha/Chat-with-FastLlama).

**Overview:**

FastLlama is a highly optimized version of the Llama-3.2-1B-Instruct model. Designed for constrained environments, it combines speed, compactness, and high accuracy. This version has been fine-tuned on the MetaMathQA-50k section of the HuggingFaceTB/smoltalk dataset to enhance its mathematical reasoning and problem-solving abilities.

**Features:**

- **Lightweight and Fast:** Optimized to deliver Llama-class capabilities with reduced computational overhead.
- **Fine-Tuned for Math Reasoning:** Uses MetaMathQA-50k for better handling of complex mathematical problems and logical reasoning tasks.
- **Instruction-Tuned:** Inherits instruction-following training from the Llama-3.2-1B-Instruct base, making it robust in understanding and executing detailed queries.
- **Versatile Use Cases:** Suitable for educational tools, tutoring systems, or any application requiring mathematical reasoning.

**Performance Highlights:**

- **Smaller Footprint:** Delivers results comparable to larger counterparts while running efficiently on smaller hardware.
- **Enhanced Accuracy:** Demonstrates improved performance on mathematical QA benchmarks.
- **Instruction Adherence:** Retains high fidelity in understanding and following user instructions, even for complex queries.

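
The card names ChatML and Alpaca as usable prompt formats but does not spell out the exact template strings. The sketch below shows the widely used conventions for both, assuming the model was trained on those standard forms. The helper names `format_chatml` and `format_alpaca` are hypothetical and not part of any library.

```python
# Hypothetical helpers: the card names the formats but not the exact
# templates, so the strings below follow the commonly used conventions.

ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def format_alpaca(instruction: str) -> str:
    """Build an Alpaca-style prompt (no-input variant) ending at the response slot."""
    return ALPACA_TEMPLATE.format(instruction=instruction)


def format_chatml(messages: list[dict[str, str]]) -> str:
    """Build a ChatML-style prompt and open an assistant turn for generation."""
    turns = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    ]
    return "".join(turns) + "<|im_start|>assistant\n"


if __name__ == "__main__":
    messages = [
        {"role": "system", "content": "You are a friendly assistant named FastLlama."},
        {"role": "user", "content": "What is 12 * 7?"},
    ]
    print(format_chatml(messages))
    print(format_alpaca("What is 12 * 7?"))
```

Either string can be passed to the tokenizer or a text-generation pipeline as a raw prompt instead of a message list. Whether markers such as `<|im_start|>` are treated as single special tokens depends on the tokenizer in use, which the card does not state.
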
**Loading the Model:**

```py
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from peft import PeftModel

base_model_id = "meta-llama/Llama-3.2-1B-Instruct"  # Base model ID
adapter_id = "suayptalha/FastLlama-3.2-LoRA"  # Adapter ID

# Load the tokenizer and the base model first
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Attach the LoRA adapters to the base model
model = PeftModel.from_pretrained(base_model, adapter_id)

# Text generation pipeline; the model was already placed on devices above
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

messages = [
    {"role": "system", "content": "You are a friendly assistant named FastLlama."},
    {"role": "user", "content": "Who are you?"},
]

outputs = pipe(
    messages,
    max_new_tokens=256,
)

# The last message in the returned conversation is the assistant's reply
print(outputs[0]["generated_text"][-1])
```

**Dataset:**

Dataset: MetaMathQA-50k

The MetaMathQA-50k subset of HuggingFaceTB/smoltalk was selected for fine-tuning because of its focus on mathematical reasoning, multi-step problem-solving, and logical inference. The dataset includes:

- Algebraic problems
- Geometric reasoning tasks
- Statistical and probabilistic questions
- Logical deduction problems

**Model Fine-Tuning:**

Fine-tuning was conducted using the following configuration:

- Learning Rate: 2e-4
- Epochs: 1
- Optimizer: AdamW
- Framework: Unsloth

**License:**

This model is licensed under the Apache 2.0 License. See the LICENSE file for details.