wambugu71
/

SwahiliInstruct-v0.1-GGUF

Kennedy wambugu

Update README.md

1ae771d verified 9 months ago

---
license: mit
language:
- en
- sw
tags:
- text-generation-inference
---
# SwahiliInstruct-v0.1-GGUF

This repo contains the `SwahiliInstruct-v0.1` LLM in GGUF format, quantized with the following methods:
- Q3_K_M
- Q4_K_M
- Q5_K_M

## Provided files

| Name | Quant method | Bits | Size | Max RAM required | Use case |
| ---- | ---- | ---- | ---- | ---- | ----- |
| [swahiliinstruct-v0.1.Q3_K_M.gguf](https://huggingface.co/wambugu1738/SwahiliInstruct-v0.1-GGUF/blob/main/swahiliinstruct-v0.1.Q3_K_M.gguf) | Q3_K_M | 3 | 3.52 GB | 6.02 GB | very small, high quality loss |
| [swahiliinstruct-v0.1.Q4_K_M.gguf](https://huggingface.co/wambugu1738/SwahiliInstruct-v0.1-GGUF/blob/main/swahiliinstruct-v0.1.Q4_K_M.gguf) | Q4_K_M | 4 | 4.37 GB | 6.87 GB | medium, balanced quality - recommended |
| [swahiliinstruct-v0.1.Q5_K_M.gguf](https://huggingface.co/wambugu1738/SwahiliInstruct-v0.1-GGUF/blob/main/swahiliinstruct-v0.1.Q5_K_M.gguf) | Q5_K_M | 5 | 5.13 GB | 7.63 GB | large, very low quality loss - recommended |
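
The table above can be turned into a small helper that picks a file for a given amount of free RAM and then fetches it. This is only a sketch: `pick_quant` and `download_quant` are hypothetical helper names, the RAM figures are copied from the table, and the repo id is an assumption taken from the links in the table. `hf_hub_download` comes from the separate `huggingface_hub` package.

```python
# Figures copied from the table above: (file name, max RAM required in GB).
QUANT_FILES = [
    ("swahiliinstruct-v0.1.Q3_K_M.gguf", 6.02),
    ("swahiliinstruct-v0.1.Q4_K_M.gguf", 6.87),
    ("swahiliinstruct-v0.1.Q5_K_M.gguf", 7.63),
]

# Assumption: repo id as it appears in the table links.
REPO_ID = "wambugu1738/SwahiliInstruct-v0.1-GGUF"


def pick_quant(available_ram_gb):
    """Return the largest listed file whose max RAM fits, or None if none fits."""
    fitting = [(name, ram) for name, ram in QUANT_FILES if ram <= available_ram_gb]
    if not fitting:
        return None
    return max(fitting, key=lambda item: item[1])[0]


def download_quant(filename, repo_id=REPO_ID):
    """Download one GGUF file from the Hub and return its local path.

    Needs `pip install huggingface_hub`. The import is done lazily so that
    pick_quant() can be used without the package installed.
    """
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=filename)
```

For example, `download_quant(pick_quant(7.0))` would fetch the Q4_K_M file, since it is the largest one whose listed RAM requirement is within 7 GB.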


## Loading the models on CPU
- Install the library
```bash
pip install llama-cpp-python
```
- Python code
```python
import llama_cpp

# Load the quantized model; n_gpu_layers=0 keeps every layer on the CPU.
model = llama_cpp.Llama(
    model_path="swahiliinstruct-v0.1.Q4_K_M.gguf",
    n_ctx=4096,
    n_threads=None,  # None lets llama-cpp-python choose the thread count
    n_gpu_layers=0,
    verbose=True,
    chat_format="chatml-function-calling",
)


def model_out(prompt):
    """Return a streaming chat completion for one user prompt."""
    return model.create_chat_completion(
        messages=[
            {"role": "system", "content": "You are a human like assistant."},
            {"role": "user", "content": prompt},
        ],
        stream=True,
        temperature=0.4,
        max_tokens=4096,
    )


# Simple chat loop: print the role once, then the tokens as they arrive.
while True:
    prompt = input("\nUser:\n")
    for chunk in model_out(prompt):
        delta = chunk["choices"][0]["delta"]
        if "role" in delta:
            print(delta["role"])
        elif delta.get("content"):
            print(delta["content"], end="", flush=True)
```
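
When the reply is needed as a single string rather than printed token by token, the streamed chunks can be joined. The sketch below assumes each chunk has the `choices[0]["delta"]` shape used in the loop above; `collect_stream` is a hypothetical helper name, and the sample chunks are hand-written stand-ins, not real model output.

```python
def collect_stream(chunks):
    """Concatenate the text pieces of a streamed chat completion."""
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0].get("delta", {})
        content = delta.get("content")
        if content:  # skip role-only and empty deltas
            parts.append(content)
    return "".join(parts)


# Hand-written chunks standing in for real model output (illustrative only).
fake_chunks = [
    {"choices": [{"delta": {"role": "assistant"}}]},
    {"choices": [{"delta": {"content": "Habari"}}]},
    {"choices": [{"delta": {"content": " yako"}}]},
    {"choices": [{"delta": {}}]},
]
print(collect_stream(fake_chunks))  # prints "Habari yako"
```

With a loaded model, the same helper could be called as `collect_stream(model_out("Habari yako?"))`.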