---
license: apache-2.0
tags:
- control vectors
- exllamav2
- creative writing
- text generation
- inference
- model integration
---
# Creative Writing Control Vectors Integration for ExLlamaV2

This project provides a wrapper that integrates [jukofyork's creative writing control vectors](https://huggingface.co/jukofyork/creative-writing-control-vectors-v3.0) with [ExLlamaV2](https://github.com/turboderp/exllamav2/). ExLlamaV2 does not natively support control vectors; this wrapper loads control vectors in GGUF format and injects them into the model at inference time to steer text generation.

## Overview

- Wrapper for using control vectors with ExLlamaV2
- Supports loading control vectors from GGUF format
- Injects vectors directly into ExLlamaV2 inference
- Enables dynamic text generation control
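
As a rough illustration of what "injecting" a control vector means: in the general technique, a per-layer direction vector is scaled by a weight and added to that layer's hidden state. The sketch below shows only this arithmetic on plain Python lists. The function names `apply_control_vector` and `steer_layers` are hypothetical and are not taken from this wrapper or from ExLlamaV2; how the wrapper actually hooks into ExLlamaV2 may differ.

```python
from typing import Dict, List

Vector = List[float]


def apply_control_vector(hidden: Vector, direction: Vector, weight: float) -> Vector:
    """Return hidden + weight * direction, element-wise."""
    if len(hidden) != len(direction):
        raise ValueError("hidden state and control vector must have the same size")
    return [h + weight * d for h, d in zip(hidden, direction)]


def steer_layers(
    hidden_by_layer: Dict[int, Vector],
    directions_by_layer: Dict[int, Vector],
    weight: float,
) -> Dict[int, Vector]:
    """Apply per-layer directions; layers without a direction are copied unchanged."""
    steered = {}
    for layer, hidden in hidden_by_layer.items():
        if layer in directions_by_layer:
            steered[layer] = apply_control_vector(hidden, directions_by_layer[layer], weight)
        else:
            steered[layer] = list(hidden)
    return steered


# Toy example: two layers, only layer 1 has a direction (values are made up).
hidden_states = {0: [1.0, 2.0], 1: [1.0, 2.0]}
directions = {1: [2.0, -4.0]}
steered = steer_layers(hidden_states, directions, weight=0.5)
```

A weight of `0.0` leaves the hidden states unchanged, and a negative weight pushes in the opposite direction.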

## Usage

1. Download a model in ExLlamaV2 format.
2. Create a directory next to the model directory, named after the model directory with a `-vectors` suffix.
3. Download the control vectors from [jukofyork's repository](https://huggingface.co/jukofyork/creative-writing-control-vectors-v3.0) and place them in the `-vectors` directory.
4. Run inference with the `--control_vectors` (`-vc`) parameter.

Example command:
```bash
python test_inference.py -m Meta-Llama-3-70B-Instruct-8bpw \
  -p "<prompt>" \
  --control_vectors language:simple:0.5,optimism:optimism:0.5
```
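
The `--control_vectors` value in the command above looks like a comma-separated list of three-part entries. A minimal sketch of parsing such a string follows, assuming each entry has the form `category:direction:weight`. That interpretation, the `VectorSpec` type, and the `parse_control_vectors` function are assumptions made for illustration and are not the wrapper's actual code.

```python
from typing import List, NamedTuple


class VectorSpec(NamedTuple):
    category: str   # assumed meaning of the first field, e.g. "language"
    direction: str  # assumed meaning of the second field, e.g. "simple"
    weight: float   # assumed scaling factor, e.g. 0.5


def parse_control_vectors(arg: str) -> List[VectorSpec]:
    """Parse 'category:direction:weight,...' into a list of VectorSpec."""
    specs = []
    for item in arg.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty entries such as a trailing comma
        parts = item.split(":")
        if len(parts) != 3:
            raise ValueError(f"expected category:direction:weight, got {item!r}")
        category, direction, weight = parts
        specs.append(VectorSpec(category, direction, float(weight)))
    return specs


specs = parse_control_vectors("language:simple:0.5,optimism:optimism:0.5")
for spec in specs:
    print(spec.category, spec.direction, spec.weight)
```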

## Directory Structure

Ensure your directory structure follows this format to correctly load the control vectors:

```
models/
  ├── Meta-Llama-3-70B-Instruct-8bpw/
  │   └── model files...
  └── Meta-Llama-3-70B-Instruct-8bpw-vectors/
      ├── llama-3:70b-language__debias.gguf
      ├── llama-3:70b-language__simple.gguf
      ├── llama-3:70b-language__ornate.gguf
      └── ...
```
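
The layout above can be checked with a short script before running inference. This is a sketch under assumptions: the helper names `vectors_dir_for` and `find_vector_file` are hypothetical, and the `*-<category>__<direction>.gguf` lookup pattern is only inferred from the `language` file names in the tree. The wrapper may resolve files differently.

```python
from pathlib import Path
from typing import Optional


def vectors_dir_for(model_dir: Path) -> Path:
    """Return the sibling '<model>-vectors' directory for a model directory."""
    return model_dir.with_name(model_dir.name + "-vectors")


def find_vector_file(model_dir: Path, category: str, direction: str) -> Optional[Path]:
    """Find a '*-<category>__<direction>.gguf' file in the vectors directory.

    The file-name pattern is an assumption based on the example tree.
    """
    pattern = f"*-{category}__{direction}.gguf"
    matches = sorted(vectors_dir_for(model_dir).glob(pattern))
    return matches[0] if matches else None


model_dir = Path("models") / "Meta-Llama-3-70B-Instruct-8bpw"
print("vectors directory:", vectors_dir_for(model_dir))
print("language/simple file:", find_vector_file(model_dir, "language", "simple"))
```

If the second line prints `None`, the vectors directory is missing, misnamed, or does not contain a file matching the assumed pattern.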

## Limitations

- Proof of concept implementation
- May impact model performance
- Limited testing with different vector combinations
- No guarantee of exact equivalence to llama.cpp behavior

## Acknowledgments

- Control vectors from [jukofyork's creative-writing-control-vectors-v3.0](https://huggingface.co/jukofyork/creative-writing-control-vectors-v3.0)
- [ExLlamaV2 by turboderp](https://github.com/turboderp/exllamav2/)