DUS Forty Layer Merged Model

Overview

The DUS Forty Layer Merged Model uses a layer-interlocking (depth up-scaling) strategy that combines layers from the Llama-2-13B and Mistral-7B architectures. The approach aims to balance computational efficiency with competitive performance across common natural language processing tasks.

Model Details

  • Architecture: Based on Llama-2-13B and Mistral-7B
  • Layer Arrangement: The forty-layer configuration merges layers from both models, interlocking layers 0–20 with layers 12–32.
  • Tokenizer: Mistral-7B tokenizer is used for encoding and decoding.
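
The layer arrangement above can be sketched as a simple index computation. This is a minimal illustration, assuming the stated ranges are half-open (so 0–20 covers layers 0–19 and 12–32 covers layers 12–31, giving forty layers in total); the function name and defaults are illustrative, not part of the released model.

```python
def merged_layer_indices(front_end=20, back_start=12, back_end=32):
    """Return the stacked layer indices for the forty-layer merge.

    Assumes half-open ranges: the first block contributes layers
    0..front_end-1 and the second block layers back_start..back_end-1.
    """
    front = list(range(0, front_end))         # layers 0-19 from the first model
    back = list(range(back_start, back_end))  # layers 12-31 from the second model
    return front + back

layers = merged_layer_indices()
print(len(layers))  # 40
```

Note the overlap: layers 12 through 19 appear twice in the merged stack, which is characteristic of depth-up-scaled merges.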

Training Details

  • Format: Safetensors
  • Model size: 8.99B params
  • Tensor type: FP16