---
license: apache-2.0
tags:
- control vectors
- exllamav2
- creative writing
- text generation
- inference
- model integration
---
# Creative Writing Control Vectors Integration for ExLlamaV2

This project provides a wrapper that integrates [jukofyork's creative writing control vectors](https://huggingface.co/jukofyork/creative-writing-control-vectors-v3.0) with [ExLlamaV2](https://github.com/turboderp/exllamav2/). ExLlamaV2 does not natively support control vectors; this wrapper loads control vectors in GGUF format and injects them into the model during inference, so that generation can be steered dynamically.

## Overview

- Wrapper for using control vectors with ExLlamaV2
- Supports loading control vectors from GGUF format
- Injects vectors directly into ExLlamaV2 inference
- Enables dynamic text generation control

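Conceptually, a control vector is a direction in hidden-state space that is scaled and added to a layer's activations. The sketch below illustrates only that idea, using plain Python lists; `apply_control_vectors` is a hypothetical helper and is not part of this wrapper or of ExLlamaV2, whose actual injection point and tensor handling may differ.

```python
from typing import List

Vector = List[float]


def apply_control_vectors(
    hidden: Vector, directions: List[Vector], scales: List[float]
) -> Vector:
    """Return hidden + sum(scale_i * direction_i), computed element-wise.

    Hypothetical illustration: a real implementation would operate on
    per-layer tensors rather than Python lists.
    """
    if len(directions) != len(scales):
        raise ValueError("need exactly one scale per direction")
    out = list(hidden)  # copy so the caller's hidden state is not mutated
    for direction, scale in zip(directions, scales):
        if len(direction) != len(out):
            raise ValueError("direction and hidden state differ in length")
        for i, value in enumerate(direction):
            out[i] += scale * value
    return out
```

Applying several vectors at once is a sum of scaled directions, which is why the scale of each vector can be tuned independently.
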
## Usage

1. Download a model in ExLlamaV2 format.
2. Create a directory next to the model directory, named after it with a `-vectors` suffix (for example, `Meta-Llama-3-70B-Instruct-8bpw-vectors`).
3. Download the control vectors from [jukofyork's repository](https://huggingface.co/jukofyork/creative-writing-control-vectors-v3.0) and place them in the `-vectors` directory.
4. Run inference with the `--control_vectors` (`-vc`) parameter.

Example command:
```bash
python test_inference.py -m Meta-Llama-3-70B-Instruct-8bpw \
  -p "<prompt>" \
  --control_vectors language:simple:0.5,optimism:optimism:0.5
```

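Judging from the example command, the `--control_vectors` value looks like a comma-separated list of `category:name:scale` entries. The following is a minimal parsing sketch under that assumption; `VectorSpec` and `parse_control_vectors` are hypothetical names, and the wrapper's own parser may accept a different or richer syntax.

```python
from typing import List, NamedTuple


class VectorSpec(NamedTuple):
    category: str  # e.g. "language" (assumed meaning of the first field)
    name: str      # e.g. "simple" (assumed meaning of the second field)
    scale: float   # strength applied to the vector


def parse_control_vectors(arg: str) -> List[VectorSpec]:
    """Parse 'category:name:scale,...' into a list of VectorSpec entries."""
    specs = []
    for item in arg.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing comma or empty entry
        parts = item.split(":")
        if len(parts) != 3:
            raise ValueError(f"expected category:name:scale, got {item!r}")
        category, name, scale = parts
        specs.append(VectorSpec(category, name, float(scale)))
    return specs
```

For instance, `parse_control_vectors("language:simple:0.5")` yields a single entry with category `language`, name `simple`, and scale `0.5`.
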
## Directory Structure

Ensure your directory structure follows this format so that the control vectors load correctly:

```
models/
├── Meta-Llama-3-70B-Instruct-8bpw/
│   └── model files...
└── Meta-Llama-3-70B-Instruct-8bpw-vectors/
    ├── llama-3:70b-language__debias.gguf
    ├── llama-3:70b-language__simple.gguf
    ├── llama-3:70b-language__ornate.gguf
    └── ...
```

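The layout above suggests a simple lookup rule: the vectors directory is the model directory's name plus `-vectors`, and each file name ends in `-<category>__<name>.gguf`. Here is a sketch of that lookup, assuming this naming pattern holds for every vector; `vectors_dir_for` and `find_vector_files` are hypothetical helpers, not functions exposed by the wrapper.

```python
from pathlib import Path
from typing import List


def vectors_dir_for(model_dir: Path) -> Path:
    """Return the sibling '<model name>-vectors' directory for a model."""
    return model_dir.with_name(model_dir.name + "-vectors")


def find_vector_files(model_dir: Path, category: str, name: str) -> List[Path]:
    """List GGUF files matching '*-<category>__<name>.gguf' (assumed pattern)."""
    pattern = f"*-{category}__{name}.gguf"
    return sorted(vectors_dir_for(model_dir).glob(pattern))
```

An empty result from `find_vector_files` is a quick way to detect a misplaced or misnamed vectors directory before starting inference.
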
## Limitations

- Proof of concept implementation
- May impact model performance
- Limited testing with different vector combinations
- No guarantee of exact equivalence to llama.cpp behavior

## Acknowledgments

- Control vectors from [jukofyork's creative-writing-control-vectors-v3.0](https://huggingface.co/jukofyork/creative-writing-control-vectors-v3.0)
- [ExLlamaV2 by turboderp](https://github.com/turboderp/exllamav2/)