---
license: llama2
language:
- en
pipeline_tag: text-generation
---
|
# InvestLM |
|
This is the repository for InvestLM, a financial-domain large language model tuned from [Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1) on a carefully curated instruction dataset related to financial investment. This page provides guidance on how to use InvestLM for inference.
|
|
|
GitHub Link: [InvestLM](https://github.com/AbaciNLP/InvestLM)
|
|
|
<font color="#0000FF">Test only, not for sharing.</font> |
|
|
|
# About AWQ |
|
[AWQ](https://github.com/casper-hansen/AutoAWQ) is an efficient, accurate, and blazing-fast low-bit weight quantization method, currently supporting 4-bit quantization. Compared to GPTQ, it offers faster Transformers-based inference. |
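For reference, the snippet below sketches how a full-precision checkpoint can be quantized to 4-bit AWQ with AutoAWQ. It is not needed to use this repository (the weights here are already quantized); the source path, output path, and quantization settings are illustrative assumptions.

```
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

# Illustrative placeholders: source checkpoint and output directory
model_path = "mistralai/Mixtral-8x7B-v0.1"
quant_path = "mixtral-8x7b-awq"

# Typical 4-bit AWQ settings (group size 128, GEMM kernels)
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

# Load the full-precision model and tokenizer
model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Run AWQ calibration/quantization and save the 4-bit model
model.quantize(tokenizer, quant_config=quant_config)
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```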
|
|
|
# Inference |
|
Please use the following command to log in to Hugging Face first.
|
```
huggingface-cli login
```
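Alternatively, you can authenticate from Python with the `login()` helper in `huggingface_hub`; a minimal sketch, where the token string is a placeholder:

```
from huggingface_hub import login

# Equivalent to `huggingface-cli login`; replace the placeholder with your own token
login(token="hf_xxx")
```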
|
## Prompt template |
|
|
|
```
[INST] {prompt} [/INST]
```
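For example, wrapping a question in this template is a single `format` call (the full example below uses the same pattern):

```
prompt_template = "[INST] {prompt} [/INST]"
print(prompt_template.format(prompt="What is finance?"))
# [INST] What is finance? [/INST]
```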
|
|
|
## How to use this AWQ model from Python code |
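First, install the required packages: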
|
```
pip3 install --upgrade "autoawq>=0.1.6" "transformers>=4.35.0"
```
|
|
|
```
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

quant_path = "yixuantt/InvestLM-Mistral-AWQ"

# Load the AWQ-quantized model and its tokenizer
model = AutoModelForCausalLM.from_pretrained(
    quant_path,
    low_cpu_mem_usage=True,
    device_map="cuda:0"
)
tokenizer = AutoTokenizer.from_pretrained(quant_path)

# Convert the prompt to tokens
prompt_template = "[INST] {prompt} [/INST]"
prompt = "What is finance?"

tokens = tokenizer(
    prompt_template.format(prompt=prompt),
    return_tensors='pt'
).input_ids.cuda()

# Generate the output
generation_output = model.generate(
    tokens,
    max_new_tokens=512
)

print("Output: ", tokenizer.decode(generation_output[0]))
```
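For interactive use, the output can also be streamed token by token with the `TextStreamer` imported above; a minimal sketch that reuses `model`, `tokenizer`, and `tokens` from the example:

```
# Stream generated text to stdout as it is produced
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
_ = model.generate(tokens, streamer=streamer, max_new_tokens=512)
```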