---
license: llama2
---
# CodeBooga-34B-v0.1
This is a merge between the following two models:
1) [Phind-CodeLlama-34B-v2](https://huggingface.co/Phind/Phind-CodeLlama-34B-v2)
2) [WizardCoder-Python-34B-V1.0](https://huggingface.co/WizardLM/WizardCoder-Python-34B-V1.0)
It was created with the [BlockMerge Gradient script](https://github.com/Gryphe/BlockMerge_Gradient), the same one that was used to create [MythoMax-L2-13b](https://huggingface.co/Gryphe/MythoMax-L2-13b), and with the same settings. The following YAML was used:
```yaml
model_path1: "Phind_Phind-CodeLlama-34B-v2_safetensors"
model_path2: "WizardLM_WizardCoder-Python-34B-V1.0_safetensors"
output_model_path: "CodeBooga-34B-v0.1"
operations:
  - operation: lm_head # Single tensor
    filter: "lm_head"
    gradient_values: [0.75]
  - operation: embed_tokens # Single tensor
    filter: "embed_tokens"
    gradient_values: [0.75]
  - operation: self_attn
    filter: "self_attn"
    gradient_values: [0.75, 0.25]
  - operation: mlp
    filter: "mlp"
    gradient_values: [0.25, 0.75]
  - operation: layernorm
    filter: "layernorm"
    gradient_values: [0.5, 0.5]
  - operation: modelnorm # Single tensor
    filter: "model.norm"
    gradient_values: [0.75]
```
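To give a rough intuition for what a two-value entry such as `gradient_values: [0.75, 0.25]` could mean, here is a minimal sketch. It is not the BlockMerge Gradient code. It assumes that the anchor values are linearly interpolated across the tensors matched by a filter, and that each resulting ratio is the weight given to the second model; both points are assumptions, so check the script itself for the exact semantics. The function names `interpolate_gradient` and `blend` are hypothetical.

```python
def interpolate_gradient(gradient_values, n_tensors):
    """Stretch a short list of anchor values into one ratio per tensor.

    Assumption: piecewise-linear interpolation from the first matched
    tensor to the last one. The real script may differ.
    """
    if len(gradient_values) == 1 or n_tensors == 1:
        return [float(gradient_values[0])] * n_tensors
    segments = len(gradient_values) - 1
    ratios = []
    for i in range(n_tensors):
        pos = i / (n_tensors - 1) * segments   # position along the anchors
        lo = min(int(pos), segments - 1)       # index of the left anchor
        frac = pos - lo                        # distance past the left anchor
        ratios.append(gradient_values[lo] * (1 - frac)
                      + gradient_values[lo + 1] * frac)
    return ratios


def blend(tensor_a, tensor_b, ratio):
    """Weighted average of two flat tensors (plain lists here).

    Assumption: `ratio` is the share taken from the second model.
    """
    return [a * (1 - ratio) + b * ratio for a, b in zip(tensor_a, tensor_b)]


# Three hypothetical layers under the self_attn setting from the YAML above.
ratios = interpolate_gradient([0.75, 0.25], 3)
merged = blend([1.0, 2.0], [3.0, 4.0], ratios[1])
```

Whatever the exact semantics, the ratios are asymmetric, so swapping `model_path1` and `model_path2` produces a different merge.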
## Prompt format
Both base models use the Alpaca format, so it should be used for this one as well.
```
Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
Your instruction
### Response:
Bot reply
### Instruction:
Another instruction
### Response:
Bot reply
```
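If you build prompts programmatically, a small helper along these lines may be convenient. It is only a sketch: `build_prompt` and its `sep` parameter are hypothetical names, and the blank line between sections is an assumption based on the conventional Alpaca layout, so adjust `sep` if your frontend expects something else.

```python
SYSTEM = ("Below is an instruction that describes a task. "
          "Write a response that appropriately completes the request.")


def build_prompt(turns, next_instruction, sep="\n\n"):
    """Assemble an Alpaca-style prompt.

    turns: list of (instruction, reply) pairs from earlier in the chat.
    next_instruction: the instruction the model should answer now.
    sep: separator between sections (assumed to be one blank line).
    """
    parts = [SYSTEM]
    for instruction, reply in turns:
        parts.append(f"### Instruction:\n{instruction}")
        parts.append(f"### Response:\n{reply}")
    parts.append(f"### Instruction:\n{next_instruction}")
    # Leave the final response open so the model continues from here.
    parts.append("### Response:\n")
    return sep.join(parts)


prompt = build_prompt(
    [("Your instruction", "Bot reply")],
    "Another instruction",
)
```

The returned string ends right after the last `### Response:` line, which is where generation should start.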
## Evaluation
(This is not very scientific, so bear with me.)
I ran a quick experiment in which I asked a set of 3 Python and 3 JavaScript questions (real-world, difficult questions with nuance) to the following models:
1) This one
2) A second variant generated with `model_path1` and `model_path2` swapped in the YAML above, which I called CodeBooga-Reversed-34B-v0.1
3) WizardCoder-Python-34B-V1.0
4) Phind-CodeLlama-34B-v2
Specifically, I used 4.250b (4.25 bits per weight) EXL2 quantizations of each. I then sorted the responses to each question by quality and assigned the following scores:
* 4th place: 0
* 3rd place: 1
* 2nd place: 2
* 1st place: 4
The resulting cumulative scores were:
* CodeBooga-34B-v0.1: 22
* WizardCoder-Python-34B-V1.0: 12
* Phind-CodeLlama-34B-v2: 7
* CodeBooga-Reversed-34B-v0.1: 1
CodeBooga-34B-v0.1 performed very well, while its variant performed poorly, so I uploaded the former but not the latter.
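For anyone who wants to repeat this kind of comparison, the scoring scheme can be sketched as below. The per-question rankings were not published, so the example ranking is hypothetical and only illustrates the mechanics; the one thing checked against the real numbers is that the reported totals add up to 6 questions times 7 points each. The names `POINTS`, `tally`, and `example_rankings` are made up for this sketch.

```python
# Points per place, as listed above.
POINTS = {1: 4, 2: 2, 3: 1, 4: 0}


def tally(rankings):
    """Sum points over questions.

    rankings: one list per question, ordered from best to worst model.
    """
    totals = {}
    for ranking in rankings:
        for place, model in enumerate(ranking, start=1):
            totals[model] = totals.get(model, 0) + POINTS[place]
    return totals


# Totals reported in this card.
reported = {
    "CodeBooga-34B-v0.1": 22,
    "WizardCoder-Python-34B-V1.0": 12,
    "Phind-CodeLlama-34B-v2": 7,
    "CodeBooga-Reversed-34B-v0.1": 1,
}

# Consistency check: every question hands out 4 + 2 + 1 + 0 = 7 points.
points_per_question = sum(POINTS.values())
total_reported = sum(reported.values())

# Hypothetical rankings for two questions, NOT the actual results.
example_rankings = [
    ["A", "B", "C", "D"],
    ["B", "A", "C", "D"],
]
example_totals = tally(example_rankings)
```

Because first place is worth twice as much as second, the scheme favors models that win outright over models that are consistently second.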
## Quantized versions
### GGUF
TheBloke has kindly provided GGUF quantizations for llama.cpp:
https://huggingface.co/TheBloke/CodeBooga-34B-v0.1-GGUF
<a href="https://ko-fi.com/oobabooga"><img src="https://i.imgur.com/UJlEAYw.png"></a>