Micro Llama v0 (Development)
Micro Llama v0 is a lightweight and experimental version of the LlamaForCausalLM model designed for development and testing purposes. This repository contains the necessary model configuration, tokenizer, and generation settings to run a minimal Llama architecture.
Model Overview
Micro Llama v0 is based on the LlamaForCausalLM architecture. It is tailored to resource-constrained environments and is intended for testing the foundational components of a transformer-based language model. This version features:
- 1 hidden layer
- Hidden size of 2048
- 32 attention heads
- Intermediate size of 5632
- Max position embeddings of 2048
- Vocabulary size of 32,000
These parameters make the model compact and suitable for development, while still maintaining key characteristics of the Llama architecture.
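As a rough sketch, the parameters above map onto a standard Transformers `LlamaConfig` roughly as follows; the exact values in this repository's `config.json` are authoritative:

```python
from transformers import LlamaConfig, LlamaForCausalLM

# Sketch of the configuration described above; config.json in this repository
# is the authoritative source for the exact values.
config = LlamaConfig(
    num_hidden_layers=1,           # 1 hidden layer
    hidden_size=2048,              # hidden size
    num_attention_heads=32,        # attention heads
    intermediate_size=5632,        # MLP intermediate size
    max_position_embeddings=2048,  # maximum sequence length
    vocab_size=32000,              # vocabulary size
)

# Building a model from this config yields randomly initialized weights,
# which is usually enough for development-time plumbing tests.
model = LlamaForCausalLM(config)
print(sum(p.numel() for p in model.parameters()))  # rough parameter count
```

With only one hidden layer, most of the parameters sit in the embedding and LM-head matrices rather than in the transformer block itself.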
Files and Configuration
- `config.json`: Contains the model architecture configuration, such as hidden size, number of attention heads, hidden layers, and activation functions.
- `generation_config.json`: Specifies generation parameters, including max length and token behavior.
- `model.safetensors`: Stores the model weights in a safe and efficient format.
- `special_tokens_map.json`: Maps the special tokens used by the model, including `<s>`, `</s>`, `<unk>`, and `</s>` (for padding).
- `tokenizer.json`: Defines the tokenizer configuration, including vocabulary size and token mapping.
- `tokenizer_config.json`: Further configures the tokenizer, specifying token types, maximum sequence length, and other tokenizer options.
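To inspect these files without loading the full weights, a minimal sketch (assuming the `UnieAI-Wilson/micro-llama-0-dev` repository id used in the Usage section below):

```python
from transformers import AutoConfig, AutoTokenizer

# Reads config.json only; no model weights are loaded.
config = AutoConfig.from_pretrained("UnieAI-Wilson/micro-llama-0-dev")
print(config.num_hidden_layers, config.hidden_size, config.num_attention_heads)

# Reads tokenizer.json, tokenizer_config.json, and special_tokens_map.json.
tokenizer = AutoTokenizer.from_pretrained("UnieAI-Wilson/micro-llama-0-dev")
print(tokenizer.special_tokens_map)  # bos/eos/unk/pad mapping
print(len(tokenizer))                # vocabulary size
```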
Requirements
- Transformers version 4.44.0 or above
- A PyTorch version compatible with the model's `float32` tensor type
- The `safetensors` package for loading model weights
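A quick way to check that the installed versions meet these requirements (a simple sketch; exact minimum versions other than Transformers 4.44.0 are not pinned here):

```python
import transformers
import torch
import safetensors

# Transformers should report 4.44.0 or newer.
print("transformers:", transformers.__version__)
print("torch:", torch.__version__)
print("safetensors:", safetensors.__version__)
```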
Usage
- Clone the repository:
```bash
git clone https://github.com/your-repo/micro-llama.git
cd micro-llama
```
- Install the required dependencies:
```bash
pip install transformers safetensors torch
```
- Load the model in your code:
```python
from transformers import LlamaForCausalLM, LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("UnieAI-Wilson/micro-llama-0-dev")
model = LlamaForCausalLM.from_pretrained("UnieAI-Wilson/micro-llama-0-dev", torch_dtype="float16")

inputs = tokenizer("Your text here", return_tensors="pt")
outputs = model.generate(**inputs)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
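For quick experiments, the same checkpoint can also be driven through the Transformers `text-generation` pipeline. A minimal sketch; the generation parameters below are illustrative and not taken from `generation_config.json`:

```python
from transformers import pipeline

# Build a text-generation pipeline on top of the same checkpoint.
generator = pipeline("text-generation", model="UnieAI-Wilson/micro-llama-0-dev")

# max_new_tokens and do_sample are illustrative; since this is a minimal
# development model, the output is not expected to be meaningful text.
print(generator("Your text here", max_new_tokens=20, do_sample=False))
```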
License
Micro Llama v0 is licensed under the Apache 2.0 License. See the LICENSE file for details.
Contribution
This is an experimental and evolving project. Contributions are welcome; feel free to submit issues or pull requests.
Disclaimer
This is an early-stage development version, and the model may undergo significant changes. It is not intended for production use.