fix stop tokens to match new prompt formatting, stream instruct response, add comments about concurrency to config e0bf185 winglian committed on May 15, 2023
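A minimal sketch of what matching stop tokens to the prompt format and streaming the response looks like, assuming the llama-cpp-python backend; the model path, prompt template, and stop strings below are illustrative assumptions, not the Space's actual values.

```python
from llama_cpp import Llama

llm = Llama(model_path="model.bin")  # hypothetical path

# Hypothetical instruct-style prompt template.
prompt = "### Instruction:\nSay hello.\n\n### Response:\n"

# The stop strings must match the prompt formatting, or generation runs
# past the end of the assistant turn. stream=True yields partial
# completions so the UI can render the instruct response incrementally.
for chunk in llm(prompt, stop=["### Instruction:"], stream=True):
    print(chunk["choices"][0]["text"], end="", flush=True)
```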
fix layout, max size back to 1, llama.cpp doesn't like parallel calls 80c7d2e winglian committed on May 15, 2023
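A sketch of serializing inference through the Gradio queue (gradio 3.x API), since the llama.cpp bindings are not safe to call in parallel; the handler below is a stand-in, and the exact queue setting the commit's "max size" refers to is an assumption.

```python
import gradio as gr

def generate(message):
    return f"echo: {message}"  # stand-in for the llama.cpp call

with gr.Blocks() as demo:
    box = gr.Textbox()
    out = gr.Textbox()
    box.submit(generate, box, out)

# concurrency_count=1 ensures only one llama.cpp call runs at a time;
# waiting requests queue up instead of executing in parallel.
demo.queue(concurrency_count=1)
demo.launch()
```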
try to fix combining gr.Interface with Blocks, try to increase concurrency on larger GPUs dce6894 winglian committed on May 15, 2023
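A sketch of the pattern for nesting a gr.Interface inside a gr.Blocks layout via .render() (gradio 3.x); the echo function and the concurrency value are hypothetical.

```python
import gradio as gr

def chat(message):
    return f"echo: {message}"  # stand-in for model inference

iface = gr.Interface(fn=chat, inputs="text", outputs="text")

with gr.Blocks() as demo:
    gr.Markdown("## Chat")
    iface.render()  # re-renders the Interface's components inside Blocks

demo.queue(concurrency_count=2)  # hypothetically higher on larger GPUs
demo.launch()
```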
link model attributions, use config.yml for some of the chat settings, increase context size 1dc6c65 winglian committed on May 15, 2023
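A sketch of loading chat settings from config.yml and passing the context size through to llama-cpp-python; every key shown here is a hypothetical example, not the Space's actual schema.

```python
import yaml
from llama_cpp import Llama

with open("config.yml") as f:
    config = yaml.safe_load(f)

# n_ctx is the llama.cpp context window size; both keys are hypothetical.
n_ctx = config.get("n_ctx", 2048)
attribution = config.get("model_attribution", "")

llm = Llama(model_path="model.bin", n_ctx=n_ctx)  # hypothetical path
```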
rm Docker implementation, add llama-cpp-python builder GitHub Actions, update copy to identify model in UI e3ba05b winglian committed on May 15, 2023