# Model Card for AuroraGPT-7B Instruct
Instruct-tuned AuroraGPT-7B model, created from 2250 iterations (970 iterations per epoch) over the IT-v4 dataset (described below).
## Usage
This model uses a standard chat interface. Using the supplied tokenizer, you can convert input messages of the form

`messages = [{"role": "system", "content": <system_prompt>}, {"role": "user", "content": <user_prompt>}]`

to a chat string with `tokenizer.apply_chat_template(messages)`.
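
A minimal sketch of that flow with the `transformers` library, assuming standard `AutoTokenizer`/`AutoModelForCausalLM` loading; `MODEL_ID` is a hypothetical placeholder for this repository's id, and the prompts are illustrative only:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "path/or/repo-id-of-this-model"  # hypothetical placeholder; use the actual repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},    # example system prompt
    {"role": "user", "content": "Explain what a Fermi problem is."},  # example user prompt
]

# Render the chat template to a prompt string, then tokenize and generate.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```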
## Training Data
Trained on an aggregation of several datasets:
- open-phi/textbooks
- open-phi/programming_books_llama
- openchat/openchat_sharegpt4_dataset
- nvidia/ChatQA-Training-Data
- In-house 4o-mini reflect tuned fermi problems
- In-house 4o-mini reflect tuned theorem QA
- jeffmeloy/sonnet3.5_science_conversations
- HuggingFaceH4/ultrachat_200k
- microsoft/orca-math-word-problems-200k
- m-a-p/CodeFeedback-Filtered-Instruction
- teknium/OpenHermes-2.5
- openbmb/UltraInteract_sft
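
As an illustration only (not the actual preprocessing pipeline), the public datasets above can be pulled with the Hugging Face `datasets` library; split names and record schemas differ between sources, so each one needs its own normalization into a common chat format before mixing. Column and split names below are assumptions taken from the respective dataset cards:

```python
from datasets import load_dataset, concatenate_datasets

# Split name is an assumption; check the dataset card for the real one.
orca_math = load_dataset("microsoft/orca-math-word-problems-200k", split="train")

def orca_to_chat(example):
    # orca-math rows carry a question/answer pair; map them to chat messages.
    return {"messages": [
        {"role": "user", "content": example["question"]},
        {"role": "assistant", "content": example["answer"]},
    ]}

# After mapping every source to the same {"messages": [...]} schema, they can be concatenated.
mixed = concatenate_datasets([
    orca_math.map(orca_to_chat, remove_columns=orca_math.column_names),
    # ... remaining sources, each with its own conversion function
])
```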
## Training Procedure
Trained on 32 nodes of the Polaris supercomputer using PyTorch FSDP with hybrid sharding:
- Learning rate = 5e-5
- Per-GPU batch size = 1
- Gradient accumulation steps = 6
- Global batch size = 768
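
The training scripts themselves are not included in this card; the following is a minimal, hypothetical sketch of the configuration listed above (hybrid-sharded FSDP, LR 5e-5, per-GPU batch size 1, 6 gradient-accumulation steps). The optimizer choice and the model/dataloader construction are assumptions:

```python
import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, ShardingStrategy

def train(model, dataloader, num_iterations=2250, accum_steps=6):
    # HYBRID_SHARD: full sharding within each node, replication across nodes.
    fsdp_model = FSDP(model.cuda(), sharding_strategy=ShardingStrategy.HYBRID_SHARD)
    # AdamW is an assumption; the card only states the learning rate.
    optimizer = torch.optim.AdamW(fsdp_model.parameters(), lr=5e-5)

    step = 0
    for batch in dataloader:  # per-GPU batch size of 1
        loss = fsdp_model(**batch).loss / accum_steps
        loss.backward()
        if (step + 1) % accum_steps == 0:
            # Assuming 4 GPUs per Polaris node: 32 nodes x 4 GPUs x 1 x 6 = 768 samples per step.
            optimizer.step()
            optimizer.zero_grad()
        step += 1
        if step >= num_iterations * accum_steps:
            break

# torch.distributed.init_process_group("nccl") and model/dataloader setup are assumed to
# happen in the real launch script (e.g. via torchrun across the 32 nodes).
```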