Fine-Tuned Qwen 2.5-Coder-1.5B is a causal language model fine-tuned to generate contextually relevant responses. The base model, Qwen/Qwen2.5-Coder-1.5B, is a Transformer-based architecture with 1.5 billion parameters. The model was fine-tuned on a custom dataset named subset5, consisting of prompt-response pairs tokenized with a maximum sequence length of 128 tokens. During training, inputs were padded and truncated as needed, and labels were aligned for causal language modeling. Key hyperparameters included a learning rate of 2e-5, a batch size of 1, gradient accumulation over 32 steps, and 3 epochs. Optimization used AdamW with a weight decay of 0.01, and training was performed on CPU without CUDA.
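The sketch below illustrates how a training run with these hyperparameters could be reproduced using the Hugging Face Trainer. It is a minimal, assumption-laden example: the dataset file name (`subset5.json`), its column names (`prompt`, `response`), the output directory, and the `use_cpu` flag (whose name varies across Transformers versions) are all placeholders, not details taken from the original training script.

```python
# Minimal training sketch; file, column, and directory names are assumptions.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

BASE_MODEL = "Qwen/Qwen2.5-Coder-1.5B"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

# "subset5" is assumed here to be a local JSON file of prompt-response pairs.
dataset = load_dataset("json", data_files="subset5.json", split="train")

def tokenize(example):
    # Concatenate prompt and response, then pad/truncate to 128 tokens as stated above.
    text = example["prompt"] + example["response"]
    return tokenizer(text, max_length=128, padding="max_length", truncation=True)

tokenized = dataset.map(tokenize, remove_columns=dataset.column_names)

# mlm=False makes the collator copy input_ids into labels for causal LM training.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="qwen2.5-coder-1.5b-subset5",  # placeholder output path
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=32,
    num_train_epochs=3,
    weight_decay=0.01,   # Trainer uses AdamW by default
    use_cpu=True,        # training was done on CPU without CUDA
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```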
The model can be used for tasks such as answering questions, completing sentences, or generating responses. To use it, load the model and tokenizer with the Hugging Face Transformers library, tokenize your input prompt, and generate a response with the model's generate method, as in the sketch below. Example input-output pairs demonstrate the model's ability to generate concise, informative answers. The model should not be used for harmful, malicious, or unethical content, and users are responsible for complying with applicable laws and ethical standards.
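The following is a minimal inference sketch. The repository id, the example prompt, and the generation settings are illustrative assumptions, not values specified by this model card.

```python
# Minimal inference sketch; MODEL_ID is a placeholder for wherever the
# fine-tuned weights are stored (local path or Hugging Face repo id).
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "path/to/fine-tuned-qwen2.5-coder-1.5b"  # assumption

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

prompt = "Explain what a Python decorator is."  # example prompt
inputs = tokenizer(prompt, return_tensors="pt")

# Generation settings below are illustrative, not tuned values.
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```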