Modular backends & support for openAI & AWS endpoints (#541)
* Fix the response
Signed-off-by: Hung-Han (Henry) Chen <[email protected]>
* Should use /completions
Signed-off-by: Hung-Han (Henry) Chen <[email protected]>
* Use async generator
Signed-off-by: Hung-Han (Henry) Chen <[email protected]>
* Use openai npm
Signed-off-by: Hung-Han (Henry) Chen <[email protected]>
* Fix generateFromDefaultEndpoint
Signed-off-by: Hung-Han (Henry) Chen <[email protected]>
* Fix last char become undefined
Signed-off-by: Hung-Han (Henry) Chen <[email protected]>
* Better support for system prompt
Signed-off-by: Hung-Han (Henry) Chen <[email protected]>
* Updates
Signed-off-by: Hung-Han (Henry) Chen <[email protected]>
* Revert
Signed-off-by: Hung-Han (Henry) Chen <[email protected]>
* Update README
Signed-off-by: Hung-Han (Henry) Chen <[email protected]>
* Default system prompt
Signed-off-by: Hung-Han (Henry) Chen <[email protected]>
* remove sk-
Signed-off-by: Hung-Han (Henry) Chen <[email protected]>
* Fixing types
Signed-off-by: Hung-Han (Henry) Chen <[email protected]>
* Fix lockfile
Signed-off-by: Hung-Han (Henry) Chen <[email protected]>
* Move .optional
Signed-off-by: Hung-Han (Henry) Chen <[email protected]>
* Add try...catch and controller.error(error)
Signed-off-by: Hung-Han (Henry) Chen <[email protected]>
* baseURL
Signed-off-by: Hung-Han (Henry) Chen <[email protected]>
* Format
Signed-off-by: Hung-Han (Henry) Chen <[email protected]>
* Fix types
Signed-off-by: Hung-Han (Henry) Chen <[email protected]>
* Fix again
Signed-off-by: Hung-Han (Henry) Chen <[email protected]>
* Better error message
Signed-off-by: Hung-Han (Henry) Chen <[email protected]>
* Update README
Signed-off-by: Hung-Han (Henry) Chen <[email protected]>
* Refactor backend to add support for modular backends
* readme fix
* readme update
* add support for lambda on aws endpoint
* update doc for lambda support
* fix typecheck
* make imports really optional
* readme fixes
* make endpoint creator async
* Update README.md
Co-authored-by: Henry Chen <[email protected]>
* Update README.md
Co-authored-by: Henry Chen <[email protected]>
* Update src/lib/server/endpoints/openai/endpointOai.ts
Co-authored-by: Henry Chen <[email protected]>
* trailing comma
* Update README.md
Co-authored-by: Mishig <[email protected]>
* change readme example name
* Update src/lib/server/models.ts
Co-authored-by: Eliott C. <[email protected]>
* fixed preprompt to use conversation.preprompt
* Make openAI endpoint compatible with Azure OpenAI
* surface errors in generation
* Added support for llamacpp endpoint
* fix llamacpp endpoint so it properly stops
* Add llamacpp example to readme
* Add support for legacy configs
---------
Signed-off-by: Hung-Han (Henry) Chen <[email protected]>
Co-authored-by: Hung-Han (Henry) Chen <[email protected]>
Co-authored-by: Henry Chen <[email protected]>
Co-authored-by: Mishig <[email protected]>
Co-authored-by: Eliott C. <[email protected]>
- .env +1 -0
- README.md +111 -4
- package-lock.json +227 -2
- package.json +4 -1
- src/lib/server/endpoints/aws/endpointAws.ts +64 -0
- src/lib/server/endpoints/endpoints.ts +42 -0
- src/lib/server/endpoints/llamacpp/endpointLlamacpp.ts +100 -0
- src/lib/server/endpoints/openai/endpointOai.ts +82 -0
- src/lib/server/endpoints/openai/openAIChatToTextGenerationStream.ts +32 -0
- src/lib/server/endpoints/openai/openAICompletionToTextGenerationStream.ts +32 -0
- src/lib/server/endpoints/tgi/endpointTgi.ts +37 -0
- src/lib/server/generateFromDefaultEndpoint.ts +24 -106
- src/lib/server/modelEndpoint.ts +0 -50
- src/lib/server/models.ts +57 -41
- src/lib/server/summarize.ts +2 -7
- src/lib/server/websearch/generateQuery.ts +2 -5
- src/lib/utils/trimPrefix.ts +0 -6
- src/lib/utils/trimSuffix.ts +0 -6
- src/routes/conversation/[id]/+page.svelte +2 -0
- src/routes/conversation/[id]/+server.ts +75 -130
@@ -8,6 +8,7 @@ MONGODB_DIRECT_CONNECTION=false
|
|
8 |
COOKIE_NAME=hf-chat
|
9 |
HF_ACCESS_TOKEN=#hf_<token> from from https://huggingface.co/settings/token
|
10 |
HF_API_ROOT=https://api-inference.huggingface.co/models
|
|
|
11 |
|
12 |
# used to activate search with web functionality. disabled if none are defined. choose one of the following:
|
13 |
YDC_API_KEY=#your docs.you.com api key here
|
|
|
8 |
COOKIE_NAME=hf-chat
|
9 |
HF_ACCESS_TOKEN=#hf_<token> from from https://huggingface.co/settings/token
|
10 |
HF_API_ROOT=https://api-inference.huggingface.co/models
|
11 |
+
OPENAI_API_KEY=#your openai api key here
|
12 |
|
13 |
# used to activate search with web functionality. disabled if none are defined. choose one of the following:
|
14 |
YDC_API_KEY=#your docs.you.com api key here
|
@@ -168,6 +168,91 @@ MODELS=`[
|
|
168 |
|
169 |
You can change things like the parameters, or customize the preprompt to better suit your needs. You can also add more models by adding more objects to the array, with different preprompts for example.
|
170 |
|
171 |
#### Custom prompt templates
|
172 |
|
173 |
By default, the prompt is constructed using `userMessageToken`, `assistantMessageToken`, `userMessageEndToken`, `assistantMessageEndToken`, `preprompt` parameters and a series of default templates.
|
@@ -258,23 +343,45 @@ You can then add the generated information and the `authorization` parameter to
|
|
258 |
]
|
259 |
```
|
260 |
|
261 |
-
### Amazon
|
|
|
|
|
262 |
|
263 |
You can also specify your Amazon SageMaker instance as an endpoint for chat-ui. The config goes like this:
|
264 |
|
265 |
```env
|
266 |
"endpoints": [
|
267 |
{
|
268 |
-
"
|
269 |
-
"
|
|
|
270 |
"accessKey": "",
|
271 |
"secretKey" : "",
|
272 |
-
"sessionToken": "",
|
273 |
"weight": 1
|
274 |
}
|
275 |
]
|
276 |
```
|
277 |
|
278 |
You can get the `accessKey` and `secretKey` from your AWS user, under programmatic access.
|
279 |
|
280 |
#### Client Certificate Authentication (mTLS)
|
|
|
168 |
|
169 |
You can change things like the parameters, or customize the preprompt to better suit your needs. You can also add more models by adding more objects to the array, with different preprompts for example.
|
170 |
|
171 |
+
#### OpenAI API compatible models
|
172 |
+
|
173 |
+
Chat UI can be used with any API server that supports OpenAI API compatibility, for example [text-generation-webui](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/openai), [LocalAI](https://github.com/go-skynet/LocalAI), [FastChat](https://github.com/lm-sys/FastChat/blob/main/docs/openai_api.md), [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), and [ialacol](https://github.com/chenhunghan/ialacol).
|
174 |
+
|
175 |
+
The following example config makes Chat UI work with [text-generation-webui](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/openai). The `endpoint.baseURL` is the URL of the OpenAI API compatible server; it overrides the base URL used by the OpenAI client. The `endpoint.completion` option determines which endpoint is used: the default is `chat_completions`, which uses `v1/chat/completions`; set `endpoint.completion` to `completions` to use the `v1/completions` endpoint.
|
176 |
+
|
177 |
+
```
|
178 |
+
MODELS=`[
|
179 |
+
{
|
180 |
+
"name": "text-generation-webui",
|
181 |
+
"id": "text-generation-webui",
|
182 |
+
"parameters": {
|
183 |
+
"temperature": 0.9,
|
184 |
+
"top_p": 0.95,
|
185 |
+
"repetition_penalty": 1.2,
|
186 |
+
"top_k": 50,
|
187 |
+
"truncate": 1000,
|
188 |
+
"max_new_tokens": 1024,
|
189 |
+
"stop": []
|
190 |
+
},
|
191 |
+
"endpoints": [{
|
192 |
+
"type" : "openai",
|
193 |
+
"baseURL": "http://localhost:8000/v1"
|
194 |
+
}]
|
195 |
+
}
|
196 |
+
]`
|
197 |
+
|
198 |
+
```
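As a rough illustration of the two `completion` options described above, the sketch below shows how a `baseURL` and a `completion` setting could map to a request URL. `resolveOpenAIUrl` and `CompletionKind` are hypothetical names used only for this example; they are not part of chat-ui or of the `openai` package, which builds its own request paths internally.

```typescript
// Hypothetical helper (not chat-ui code): maps the `completion` option to a URL.
type CompletionKind = "chat_completions" | "completions";

function resolveOpenAIUrl(
	baseURL: string,
	completion: CompletionKind = "chat_completions"
): string {
	// Drop trailing slashes so the join never produces "//".
	const root = baseURL.replace(/\/+$/, "");
	const path = completion === "chat_completions" ? "/chat/completions" : "/completions";
	return root + path;
}

// With the example config above (default `completion`):
console.log(resolveOpenAIUrl("http://localhost:8000/v1"));
// → http://localhost:8000/v1/chat/completions

// With `"completion": "completions"`:
console.log(resolveOpenAIUrl("http://localhost:8000/v1/", "completions"));
// → http://localhost:8000/v1/completions
```

Because the example `baseURL` already ends in `/v1`, only the final path segment changes between the two modes.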
|
199 |
+
|
200 |
+
The `openai` type includes official OpenAI models. You can add, for example, GPT4/GPT3.5 as an "openai" model:
|
201 |
+
|
202 |
+
```
|
203 |
+
OPENAI_API_KEY=#your openai api key here
|
204 |
+
MODELS=`[{
|
205 |
+
"name": "gpt-4",
|
206 |
+
"displayName": "GPT 4",
|
207 |
+
"endpoints" : [{
|
208 |
+
"type": "openai"
|
209 |
+
}]
|
210 |
+
},
|
211 |
+
{
|
212 |
+
"name": "gpt-3.5-turbo",
|
213 |
+
"displayName": "GPT 3.5 Turbo",
|
214 |
+
"endpoints" : [{
|
215 |
+
"type": "openai"
|
216 |
+
}]
|
217 |
+
}]`
|
218 |
+
```
|
219 |
+
|
220 |
+
#### Llama.cpp API server
|
221 |
+
|
222 |
+
chat-ui also supports the llama.cpp API server directly without the need for an adapter. You can do this using the `llamacpp` endpoint type.
|
223 |
+
|
224 |
+
If you want to run chat-ui with llama.cpp, you can do the following, using Zephyr as an example model:
|
225 |
+
|
226 |
+
1. Get [the weights](https://huggingface.co/TheBloke/zephyr-7B-beta-GGUF/tree/main) from the hub
|
227 |
+
2. Run the server with the following command: `./server -m models/zephyr-7b-beta.Q4_K_M.gguf -c 2048 -np 3`
|
228 |
+
3. Add the following to your `.env.local`:
|
229 |
+
|
230 |
+
```env
|
231 |
+
MODELS=[
|
232 |
+
{
|
233 |
+
"name": "Local Zephyr",
|
234 |
+
"chatPromptTemplate": "<|system|>\n{{preprompt}}</s>\n{{#each messages}}{{#ifUser}}<|user|>\n{{content}}</s>\n<|assistant|>\n{{/ifUser}}{{#ifAssistant}}{{content}}</s>\n{{/ifAssistant}}{{/each}}",
|
235 |
+
"parameters": {
|
236 |
+
"temperature": 0.1,
|
237 |
+
"top_p": 0.95,
|
238 |
+
"repetition_penalty": 1.2,
|
239 |
+
"top_k": 50,
|
240 |
+
"truncate": 1000,
|
241 |
+
"max_new_tokens": 2048,
|
242 |
+
"stop": ["</s>"]
|
243 |
+
},
|
244 |
+
"endpoints": [
|
245 |
+
{
|
246 |
+
"url": "http://127.0.0.1:8080",
|
247 |
+
"type": "llamacpp"
|
248 |
+
}
|
249 |
+
]
|
250 |
+
}
|
251 |
+
]
|
252 |
+
```
|
253 |
+
|
254 |
+
Start chat-ui with `npm run dev` and you should be able to chat with Zephyr locally.
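To make the `chatPromptTemplate` in the config above more concrete, here is a minimal sketch of the prompt string it would produce for a short conversation. It is a hand-written stand-in rather than the real Handlebars rendering: `renderZephyrPrompt` and the `Message` shape (a `from` field of `"user"` or `"assistant"`) are assumptions made for this example, and the sketch assumes no HTML escaping of message content.

```typescript
// Assumed message shape for this sketch; not taken from the chat-ui source.
interface Message {
	from: "user" | "assistant";
	content: string;
}

// Hand-written equivalent of the Zephyr template in the config above.
function renderZephyrPrompt(preprompt: string, messages: Message[]): string {
	// System block: "<|system|>\n{{preprompt}}</s>\n"
	let prompt = `<|system|>\n${preprompt}</s>\n`;
	for (const message of messages) {
		if (message.from === "user") {
			// A user turn is followed by the opening assistant tag.
			prompt += `<|user|>\n${message.content}</s>\n<|assistant|>\n`;
		} else {
			// An assistant turn is closed with the end-of-sequence token.
			prompt += `${message.content}</s>\n`;
		}
	}
	return prompt;
}

const prompt = renderZephyrPrompt("Be brief.", [
	{ from: "user", content: "Hi" },
	{ from: "assistant", content: "Hello!" },
	{ from: "user", content: "What is GGUF?" },
]);
console.log(prompt);
```

The rendered prompt always ends with `<|assistant|>\n` after the last user turn, which is why `"stop": ["</s>"]` is enough to end generation at the close of the model's reply.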
|
255 |
+
|
256 |
#### Custom prompt templates
|
257 |
|
258 |
By default, the prompt is constructed using `userMessageToken`, `assistantMessageToken`, `userMessageEndToken`, `assistantMessageEndToken`, `preprompt` parameters and a series of default templates.
|
|
|
343 |
]
|
344 |
```
|
345 |
|
346 |
+
### Amazon
|
347 |
+
|
348 |
+
#### SageMaker
|
349 |
|
350 |
You can also specify your Amazon SageMaker instance as an endpoint for chat-ui. The config goes like this:
|
351 |
|
352 |
```env
|
353 |
"endpoints": [
|
354 |
{
|
355 |
+
"type" : "aws",
|
356 |
+
"service" : "sagemaker",
|
357 |
+
"url": "",
|
358 |
"accessKey": "",
|
359 |
"secretKey" : "",
|
360 |
+
"sessionToken": "",
|
361 |
"weight": 1
|
362 |
}
|
363 |
]
|
364 |
```
|
365 |
|
366 |
+
#### Lambda
|
367 |
+
|
368 |
+
You can also specify your AWS Lambda function as an endpoint for chat-ui. The config goes like this:
|
369 |
+
|
370 |
+
```env
|
371 |
+
"endpoints" : [
|
372 |
+
{
|
373 |
+
"type": "aws",
|
374 |
+
"service": "lambda",
|
375 |
+
"url": "",
|
376 |
+
"accessKey": "",
|
377 |
+
"secretKey": "",
|
378 |
+
"sessionToken": "",
|
379 |
+
"region": "",
|
380 |
+
"weight": 1
|
381 |
+
}
|
382 |
+
]
|
383 |
+
```
|
384 |
+
|
385 |
You can get the `accessKey` and `secretKey` from your AWS user, under programmatic access.
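The AWS endpoint fields above can be sanity-checked before use. The following is a plain-TypeScript sketch of such a check, loosely following the `endpointAwsParametersSchema` added in this commit (non-empty keys, `service` defaulting to `sagemaker`, `weight` defaulting to 1). `AwsEndpointConfig` and `normalizeAwsEndpoint` are hypothetical names, and the URL test is a simplified regex rather than the schema's own URL validation.

```typescript
type AwsService = "sagemaker" | "lambda";

// Field names follow the README config; the interface itself is an assumption.
interface AwsEndpointConfig {
	type: "aws";
	url: string;
	accessKey: string;
	secretKey: string;
	sessionToken?: string;
	service?: AwsService;
	region?: string;
	weight?: number;
}

function normalizeAwsEndpoint(config: AwsEndpointConfig) {
	if (config.accessKey.length === 0 || config.secretKey.length === 0) {
		throw new Error("accessKey and secretKey must be non-empty");
	}
	// Simplified check; the real schema validates the URL more strictly.
	if (!/^https?:\/\/\S+$/.test(config.url)) {
		throw new Error("url must be an http(s) URL");
	}
	const weight = config.weight ?? 1;
	if (!Number.isInteger(weight) || weight <= 0) {
		throw new Error("weight must be a positive integer");
	}
	// Apply the same defaults as the schema: sagemaker service, weight 1.
	return { ...config, service: config.service ?? "sagemaker", weight };
}

const endpoint = normalizeAwsEndpoint({
	type: "aws",
	url: "https://example.com/invocations", // placeholder URL
	accessKey: "AKIA-EXAMPLE", // placeholder credentials
	secretKey: "example-secret",
});
console.log(endpoint.service, endpoint.weight);
// → sagemaker 1
```

Leaving `service` out therefore behaves like the SageMaker config, while the Lambda config needs `"service": "lambda"` set explicitly.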
|
386 |
|
387 |
#### Client Certificate Authentication (mTLS)
|
@@ -12,7 +12,6 @@
|
|
12 |
"@huggingface/inference": "^2.6.3",
|
13 |
"@xenova/transformers": "^2.6.0",
|
14 |
"autoprefixer": "^10.4.14",
|
15 |
-
"aws4fetch": "^1.0.17",
|
16 |
"date-fns": "^2.29.3",
|
17 |
"dotenv": "^16.0.3",
|
18 |
"handlebars": "^4.7.8",
|
@@ -55,6 +54,10 @@
|
|
55 |
"unplugin-icons": "^0.16.1",
|
56 |
"vite": "^4.3.9",
|
57 |
"vitest": "^0.31.0"
|
58 |
}
|
59 |
},
|
60 |
"node_modules/@ampproject/remapping": {
|
@@ -1120,6 +1123,16 @@
|
|
1120 |
"resolved": "https://registry.npmjs.org/@types/node/-/node-18.13.0.tgz",
|
1121 |
"integrity": "sha512-gC3TazRzGoOnoKAhUx+Q0t8S9Tzs74z7m0ipwGpSqQrleP14hKxP4/JUeEQcD3W1/aIpnWl8pHowI7WokuZpXg=="
|
1122 |
},
|
1123 |
"node_modules/@types/node-int64": {
|
1124 |
"version": "0.4.29",
|
1125 |
"resolved": "https://registry.npmjs.org/@types/node-int64/-/node-int64-0.4.29.tgz",
|
@@ -1478,6 +1491,18 @@
|
|
1478 |
"resolved": "https://registry.npmjs.org/abab/-/abab-2.0.6.tgz",
|
1479 |
"integrity": "sha512-j2afSsaIENvHZN2B8GOpF566vZ5WVk5opAiMTvWgaQT8DkbOqsTfvNAvHoRGU2zzP8cPoqys+xHTRDWW8L+/BA=="
|
1480 |
},
|
1481 |
"node_modules/acorn": {
|
1482 |
"version": "8.10.0",
|
1483 |
"resolved": "https://registry.npmjs.org/acorn/-/acorn-8.10.0.tgz",
|
@@ -1519,6 +1544,18 @@
|
|
1519 |
"node": ">= 6.0.0"
|
1520 |
}
|
1521 |
},
|
1522 |
"node_modules/ajv": {
|
1523 |
"version": "6.12.6",
|
1524 |
"resolved": "https://registry.npmjs.org/ajv/-/ajv-6.12.6.tgz",
|
@@ -1654,7 +1691,8 @@
|
|
1654 |
"node_modules/aws4fetch": {
|
1655 |
"version": "1.0.17",
|
1656 |
"resolved": "https://registry.npmjs.org/aws4fetch/-/aws4fetch-1.0.17.tgz",
|
1657 |
-
"integrity": "sha512-4IbOvsxqxeOSxI4oA+8xEO8SzBMVlzbSTgGy/EF83rHnQ/aKtP6Sc6YV/k0oiW0mqrcxuThlbDosnvetGOuO+g=="
|
|
|
1658 |
},
|
1659 |
"node_modules/axobject-query": {
|
1660 |
"version": "3.2.1",
|
@@ -1675,6 +1713,12 @@
|
|
1675 |
"resolved": "https://registry.npmjs.org/balanced-match/-/balanced-match-1.0.2.tgz",
|
1676 |
"integrity": "sha512-3oSeUO0TMV67hN1AmbXsK4yaqU7tjiHlbxRDZOpH0KW9+CeX4bRAaX0Anxt0tx2MrpRpWwQaPwIlISEJhYU5Pw=="
|
1677 |
},
|
1678 |
"node_modules/base64-js": {
|
1679 |
"version": "1.5.1",
|
1680 |
"resolved": "https://registry.npmjs.org/base64-js/-/base64-js-1.5.1.tgz",
|
@@ -1924,6 +1968,15 @@
|
|
1924 |
"url": "https://github.com/chalk/chalk?sponsor=1"
|
1925 |
}
|
1926 |
},
|
1927 |
"node_modules/check-error": {
|
1928 |
"version": "1.0.2",
|
1929 |
"resolved": "https://registry.npmjs.org/check-error/-/check-error-1.0.2.tgz",
|
@@ -2112,6 +2165,15 @@
|
|
2112 |
"node": ">= 8"
|
2113 |
}
|
2114 |
},
|
2115 |
"node_modules/css-tree": {
|
2116 |
"version": "2.3.1",
|
2117 |
"resolved": "https://registry.npmjs.org/css-tree/-/css-tree-2.3.1.tgz",
|
@@ -2331,6 +2393,16 @@
|
|
2331 |
"node": ">=0.3.1"
|
2332 |
}
|
2333 |
},
|
2334 |
"node_modules/dir-glob": {
|
2335 |
"version": "3.0.1",
|
2336 |
"resolved": "https://registry.npmjs.org/dir-glob/-/dir-glob-3.0.1.tgz",
|
@@ -2683,6 +2755,15 @@
|
|
2683 |
"node": ">=0.10.0"
|
2684 |
}
|
2685 |
},
|
2686 |
"node_modules/execa": {
|
2687 |
"version": "5.1.1",
|
2688 |
"resolved": "https://registry.npmjs.org/execa/-/execa-5.1.1.tgz",
|
@@ -2853,6 +2934,25 @@
|
|
2853 |
"node": ">= 6"
|
2854 |
}
|
2855 |
},
|
2856 |
"node_modules/fraction.js": {
|
2857 |
"version": "4.2.0",
|
2858 |
"resolved": "https://registry.npmjs.org/fraction.js/-/fraction.js-4.2.0.tgz",
|
@@ -3118,6 +3218,15 @@
|
|
3118 |
"node": ">=10.17.0"
|
3119 |
}
|
3120 |
},
|
3121 |
"node_modules/iconv-lite": {
|
3122 |
"version": "0.6.3",
|
3123 |
"resolved": "https://registry.npmjs.org/iconv-lite/-/iconv-lite-0.6.3.tgz",
|
@@ -3227,6 +3336,12 @@
|
|
3227 |
"node": ">=8"
|
3228 |
}
|
3229 |
},
|
3230 |
"node_modules/is-builtin-module": {
|
3231 |
"version": "3.2.1",
|
3232 |
"resolved": "https://registry.npmjs.org/is-builtin-module/-/is-builtin-module-3.2.1.tgz",
|
@@ -3662,6 +3777,17 @@
|
|
3662 |
"marked": ">=4 <10"
|
3663 |
}
|
3664 |
},
|
3665 |
"node_modules/md5-hex": {
|
3666 |
"version": "3.0.1",
|
3667 |
"resolved": "https://registry.npmjs.org/md5-hex/-/md5-hex-3.0.1.tgz",
|
@@ -3939,6 +4065,67 @@
|
|
3939 |
"resolved": "https://registry.npmjs.org/node-addon-api/-/node-addon-api-6.1.0.tgz",
|
3940 |
"integrity": "sha512-+eawOlIgy680F0kBzPUNFhMZGtJ1YmqM6l4+Crf4IkImjYrO/mqPwRMh352g23uIaQKFItcQ64I7KMaJxHgAVA=="
|
3941 |
},
|
3942 |
"node_modules/node-gyp-build": {
|
3943 |
"version": "4.6.1",
|
3944 |
"resolved": "https://registry.npmjs.org/node-gyp-build/-/node-gyp-build-4.6.1.tgz",
|
@@ -4089,6 +4276,35 @@
|
|
4089 |
"platform": "^1.3.6"
|
4090 |
}
|
4091 |
},
|
4092 |
"node_modules/openid-client": {
|
4093 |
"version": "5.4.2",
|
4094 |
"resolved": "https://registry.npmjs.org/openid-client/-/openid-client-5.4.2.tgz",
|
@@ -6260,6 +6476,15 @@
|
|
6260 |
"node": ">=14"
|
6261 |
}
|
6262 |
},
|
6263 |
"node_modules/webidl-conversions": {
|
6264 |
"version": "7.0.0",
|
6265 |
"resolved": "https://registry.npmjs.org/webidl-conversions/-/webidl-conversions-7.0.0.tgz",
|
|
|
12 |
"@huggingface/inference": "^2.6.3",
|
13 |
"@xenova/transformers": "^2.6.0",
|
14 |
"autoprefixer": "^10.4.14",
|
|
|
15 |
"date-fns": "^2.29.3",
|
16 |
"dotenv": "^16.0.3",
|
17 |
"handlebars": "^4.7.8",
|
|
|
54 |
"unplugin-icons": "^0.16.1",
|
55 |
"vite": "^4.3.9",
|
56 |
"vitest": "^0.31.0"
|
57 |
+
},
|
58 |
+
"optionalDependencies": {
|
59 |
+
"aws4fetch": "^1.0.17",
|
60 |
+
"openai": "^4.14.2"
|
61 |
}
|
62 |
},
|
63 |
"node_modules/@ampproject/remapping": {
|
|
|
1123 |
"resolved": "https://registry.npmjs.org/@types/node/-/node-18.13.0.tgz",
|
1124 |
"integrity": "sha512-gC3TazRzGoOnoKAhUx+Q0t8S9Tzs74z7m0ipwGpSqQrleP14hKxP4/JUeEQcD3W1/aIpnWl8pHowI7WokuZpXg=="
|
1125 |
},
|
1126 |
+
"node_modules/@types/node-fetch": {
|
1127 |
+
"version": "2.6.5",
|
1128 |
+
"resolved": "https://registry.npmjs.org/@types/node-fetch/-/node-fetch-2.6.5.tgz",
|
1129 |
+
"integrity": "sha512-OZsUlr2nxvkqUFLSaY2ZbA+P1q22q+KrlxWOn/38RX+u5kTkYL2mTujEpzUhGkS+K/QCYp9oagfXG39XOzyySg==",
|
1130 |
+
"optional": true,
|
1131 |
+
"dependencies": {
|
1132 |
+
"@types/node": "*",
|
1133 |
+
"form-data": "^4.0.0"
|
1134 |
+
}
|
1135 |
+
},
|
1136 |
"node_modules/@types/node-int64": {
|
1137 |
"version": "0.4.29",
|
1138 |
"resolved": "https://registry.npmjs.org/@types/node-int64/-/node-int64-0.4.29.tgz",
|
|
|
1491 |
"resolved": "https://registry.npmjs.org/abab/-/abab-2.0.6.tgz",
|
1492 |
"integrity": "sha512-j2afSsaIENvHZN2B8GOpF566vZ5WVk5opAiMTvWgaQT8DkbOqsTfvNAvHoRGU2zzP8cPoqys+xHTRDWW8L+/BA=="
|
1493 |
},
|
1494 |
+
"node_modules/abort-controller": {
|
1495 |
+
"version": "3.0.0",
|
1496 |
+
"resolved": "https://registry.npmjs.org/abort-controller/-/abort-controller-3.0.0.tgz",
|
1497 |
+
"integrity": "sha512-h8lQ8tacZYnR3vNQTgibj+tODHI5/+l06Au2Pcriv/Gmet0eaj4TwWH41sO9wnHDiQsEj19q0drzdWdeAHtweg==",
|
1498 |
+
"optional": true,
|
1499 |
+
"dependencies": {
|
1500 |
+
"event-target-shim": "^5.0.0"
|
1501 |
+
},
|
1502 |
+
"engines": {
|
1503 |
+
"node": ">=6.5"
|
1504 |
+
}
|
1505 |
+
},
|
1506 |
"node_modules/acorn": {
|
1507 |
"version": "8.10.0",
|
1508 |
"resolved": "https://registry.npmjs.org/acorn/-/acorn-8.10.0.tgz",
|
|
|
1544 |
"node": ">= 6.0.0"
|
1545 |
}
|
1546 |
},
|
1547 |
+
"node_modules/agentkeepalive": {
|
1548 |
+
"version": "4.5.0",
|
1549 |
+
"resolved": "https://registry.npmjs.org/agentkeepalive/-/agentkeepalive-4.5.0.tgz",
|
1550 |
+
"integrity": "sha512-5GG/5IbQQpC9FpkRGsSvZI5QYeSCzlJHdpBQntCsuTOxhKD8lqKhrleg2Yi7yvMIf82Ycmmqln9U8V9qwEiJew==",
|
1551 |
+
"optional": true,
|
1552 |
+
"dependencies": {
|
1553 |
+
"humanize-ms": "^1.2.1"
|
1554 |
+
},
|
1555 |
+
"engines": {
|
1556 |
+
"node": ">= 8.0.0"
|
1557 |
+
}
|
1558 |
+
},
|
1559 |
"node_modules/ajv": {
|
1560 |
"version": "6.12.6",
|
1561 |
"resolved": "https://registry.npmjs.org/ajv/-/ajv-6.12.6.tgz",
|
|
|
1691 |
"node_modules/aws4fetch": {
|
1692 |
"version": "1.0.17",
|
1693 |
"resolved": "https://registry.npmjs.org/aws4fetch/-/aws4fetch-1.0.17.tgz",
|
1694 |
+
"integrity": "sha512-4IbOvsxqxeOSxI4oA+8xEO8SzBMVlzbSTgGy/EF83rHnQ/aKtP6Sc6YV/k0oiW0mqrcxuThlbDosnvetGOuO+g==",
|
1695 |
+
"optional": true
|
1696 |
},
|
1697 |
"node_modules/axobject-query": {
|
1698 |
"version": "3.2.1",
|
|
|
1713 |
"resolved": "https://registry.npmjs.org/balanced-match/-/balanced-match-1.0.2.tgz",
|
1714 |
"integrity": "sha512-3oSeUO0TMV67hN1AmbXsK4yaqU7tjiHlbxRDZOpH0KW9+CeX4bRAaX0Anxt0tx2MrpRpWwQaPwIlISEJhYU5Pw=="
|
1715 |
},
|
1716 |
+
"node_modules/base-64": {
|
1717 |
+
"version": "0.1.0",
|
1718 |
+
"resolved": "https://registry.npmjs.org/base-64/-/base-64-0.1.0.tgz",
|
1719 |
+
"integrity": "sha512-Y5gU45svrR5tI2Vt/X9GPd3L0HNIKzGu202EjxrXMpuc2V2CiKgemAbUUsqYmZJvPtCXoUKjNZwBJzsNScUbXA==",
|
1720 |
+
"optional": true
|
1721 |
+
},
|
1722 |
"node_modules/base64-js": {
|
1723 |
"version": "1.5.1",
|
1724 |
"resolved": "https://registry.npmjs.org/base64-js/-/base64-js-1.5.1.tgz",
|
|
|
1968 |
"url": "https://github.com/chalk/chalk?sponsor=1"
|
1969 |
}
|
1970 |
},
|
1971 |
+
"node_modules/charenc": {
|
1972 |
+
"version": "0.0.2",
|
1973 |
+
"resolved": "https://registry.npmjs.org/charenc/-/charenc-0.0.2.tgz",
|
1974 |
+
"integrity": "sha512-yrLQ/yVUFXkzg7EDQsPieE/53+0RlaWTs+wBrvW36cyilJ2SaDWfl4Yj7MtLTXleV9uEKefbAGUPv2/iWSooRA==",
|
1975 |
+
"optional": true,
|
1976 |
+
"engines": {
|
1977 |
+
"node": "*"
|
1978 |
+
}
|
1979 |
+
},
|
1980 |
"node_modules/check-error": {
|
1981 |
"version": "1.0.2",
|
1982 |
"resolved": "https://registry.npmjs.org/check-error/-/check-error-1.0.2.tgz",
|
|
|
2165 |
"node": ">= 8"
|
2166 |
}
|
2167 |
},
|
2168 |
+
"node_modules/crypt": {
|
2169 |
+
"version": "0.0.2",
|
2170 |
+
"resolved": "https://registry.npmjs.org/crypt/-/crypt-0.0.2.tgz",
|
2171 |
+
"integrity": "sha512-mCxBlsHFYh9C+HVpiEacem8FEBnMXgU9gy4zmNC+SXAZNB/1idgp/aulFJ4FgCi7GPEVbfyng092GqL2k2rmow==",
|
2172 |
+
"optional": true,
|
2173 |
+
"engines": {
|
2174 |
+
"node": "*"
|
2175 |
+
}
|
2176 |
+
},
|
2177 |
"node_modules/css-tree": {
|
2178 |
"version": "2.3.1",
|
2179 |
"resolved": "https://registry.npmjs.org/css-tree/-/css-tree-2.3.1.tgz",
|
|
|
2393 |
"node": ">=0.3.1"
|
2394 |
}
|
2395 |
},
|
2396 |
+
"node_modules/digest-fetch": {
|
2397 |
+
"version": "1.3.0",
|
2398 |
+
"resolved": "https://registry.npmjs.org/digest-fetch/-/digest-fetch-1.3.0.tgz",
|
2399 |
+
"integrity": "sha512-CGJuv6iKNM7QyZlM2T3sPAdZWd/p9zQiRNS9G+9COUCwzWFTs0Xp8NF5iePx7wtvhDykReiRRrSeNb4oMmB8lA==",
|
2400 |
+
"optional": true,
|
2401 |
+
"dependencies": {
|
2402 |
+
"base-64": "^0.1.0",
|
2403 |
+
"md5": "^2.3.0"
|
2404 |
+
}
|
2405 |
+
},
|
2406 |
"node_modules/dir-glob": {
|
2407 |
"version": "3.0.1",
|
2408 |
"resolved": "https://registry.npmjs.org/dir-glob/-/dir-glob-3.0.1.tgz",
|
|
|
2755 |
"node": ">=0.10.0"
|
2756 |
}
|
2757 |
},
|
2758 |
+
"node_modules/event-target-shim": {
|
2759 |
+
"version": "5.0.1",
|
2760 |
+
"resolved": "https://registry.npmjs.org/event-target-shim/-/event-target-shim-5.0.1.tgz",
|
2761 |
+
"integrity": "sha512-i/2XbnSz/uxRCU6+NdVJgKWDTM427+MqYbkQzD321DuCQJUqOuJKIA0IM2+W2xtYHdKOmZ4dR6fExsd4SXL+WQ==",
|
2762 |
+
"optional": true,
|
2763 |
+
"engines": {
|
2764 |
+
"node": ">=6"
|
2765 |
+
}
|
2766 |
+
},
|
2767 |
"node_modules/execa": {
|
2768 |
"version": "5.1.1",
|
2769 |
"resolved": "https://registry.npmjs.org/execa/-/execa-5.1.1.tgz",
|
|
|
2934 |
"node": ">= 6"
|
2935 |
}
|
2936 |
},
|
2937 |
+
"node_modules/form-data-encoder": {
|
2938 |
+
"version": "1.7.2",
|
2939 |
+
"resolved": "https://registry.npmjs.org/form-data-encoder/-/form-data-encoder-1.7.2.tgz",
|
2940 |
+
"integrity": "sha512-qfqtYan3rxrnCk1VYaA4H+Ms9xdpPqvLZa6xmMgFvhO32x7/3J/ExcTd6qpxM0vH2GdMI+poehyBZvqfMTto8A==",
|
2941 |
+
"optional": true
|
2942 |
+
},
|
2943 |
+
"node_modules/formdata-node": {
|
2944 |
+
"version": "4.4.1",
|
2945 |
+
"resolved": "https://registry.npmjs.org/formdata-node/-/formdata-node-4.4.1.tgz",
|
2946 |
+
"integrity": "sha512-0iirZp3uVDjVGt9p49aTaqjk84TrglENEDuqfdlZQ1roC9CWlPk6Avf8EEnZNcAqPonwkG35x4n3ww/1THYAeQ==",
|
2947 |
+
"optional": true,
|
2948 |
+
"dependencies": {
|
2949 |
+
"node-domexception": "1.0.0",
|
2950 |
+
"web-streams-polyfill": "4.0.0-beta.3"
|
2951 |
+
},
|
2952 |
+
"engines": {
|
2953 |
+
"node": ">= 12.20"
|
2954 |
+
}
|
2955 |
+
},
|
2956 |
"node_modules/fraction.js": {
|
2957 |
"version": "4.2.0",
|
2958 |
"resolved": "https://registry.npmjs.org/fraction.js/-/fraction.js-4.2.0.tgz",
|
|
|
3218 |
"node": ">=10.17.0"
|
3219 |
}
|
3220 |
},
|
3221 |
+
"node_modules/humanize-ms": {
|
3222 |
+
"version": "1.2.1",
|
3223 |
+
"resolved": "https://registry.npmjs.org/humanize-ms/-/humanize-ms-1.2.1.tgz",
|
3224 |
+
"integrity": "sha512-Fl70vYtsAFb/C06PTS9dZBo7ihau+Tu/DNCk/OyHhea07S+aeMWpFFkUaXRa8fI+ScZbEI8dfSxwY7gxZ9SAVQ==",
|
3225 |
+
"optional": true,
|
3226 |
+
"dependencies": {
|
3227 |
+
"ms": "^2.0.0"
|
3228 |
+
}
|
3229 |
+
},
|
3230 |
"node_modules/iconv-lite": {
|
3231 |
"version": "0.6.3",
|
3232 |
"resolved": "https://registry.npmjs.org/iconv-lite/-/iconv-lite-0.6.3.tgz",
|
|
|
3336 |
"node": ">=8"
|
3337 |
}
|
3338 |
},
|
3339 |
+
"node_modules/is-buffer": {
|
3340 |
+
"version": "1.1.6",
|
3341 |
+
"resolved": "https://registry.npmjs.org/is-buffer/-/is-buffer-1.1.6.tgz",
|
3342 |
+
"integrity": "sha512-NcdALwpXkTm5Zvvbk7owOUSvVvBKDgKP5/ewfXEznmQFfs4ZRmanOeKBTjRVjka3QFoN6XJ+9F3USqfHqTaU5w==",
|
3343 |
+
"optional": true
|
3344 |
+
},
|
3345 |
"node_modules/is-builtin-module": {
|
3346 |
"version": "3.2.1",
|
3347 |
"resolved": "https://registry.npmjs.org/is-builtin-module/-/is-builtin-module-3.2.1.tgz",
|
|
|
3777 |
"marked": ">=4 <10"
|
3778 |
}
|
3779 |
},
|
3780 |
+
"node_modules/md5": {
|
3781 |
+
"version": "2.3.0",
|
3782 |
+
"resolved": "https://registry.npmjs.org/md5/-/md5-2.3.0.tgz",
|
3783 |
+
"integrity": "sha512-T1GITYmFaKuO91vxyoQMFETst+O71VUPEU3ze5GNzDm0OWdP8v1ziTaAEPUr/3kLsY3Sftgz242A1SetQiDL7g==",
|
3784 |
+
"optional": true,
|
3785 |
+
"dependencies": {
|
3786 |
+
"charenc": "0.0.2",
|
3787 |
+
"crypt": "0.0.2",
|
3788 |
+
"is-buffer": "~1.1.6"
|
3789 |
+
}
|
3790 |
+
},
|
3791 |
"node_modules/md5-hex": {
|
3792 |
"version": "3.0.1",
|
3793 |
"resolved": "https://registry.npmjs.org/md5-hex/-/md5-hex-3.0.1.tgz",
|
|
|
4065 |
"resolved": "https://registry.npmjs.org/node-addon-api/-/node-addon-api-6.1.0.tgz",
|
4066 |
"integrity": "sha512-+eawOlIgy680F0kBzPUNFhMZGtJ1YmqM6l4+Crf4IkImjYrO/mqPwRMh352g23uIaQKFItcQ64I7KMaJxHgAVA=="
|
4067 |
},
|
4068 |
+
"node_modules/node-domexception": {
|
4069 |
+
"version": "1.0.0",
|
4070 |
+
"resolved": "https://registry.npmjs.org/node-domexception/-/node-domexception-1.0.0.tgz",
|
4071 |
+
"integrity": "sha512-/jKZoMpw0F8GRwl4/eLROPA3cfcXtLApP0QzLmUT/HuPCZWyB7IY9ZrMeKw2O/nFIqPQB3PVM9aYm0F312AXDQ==",
|
4072 |
+
"funding": [
|
4073 |
+
{
|
4074 |
+
"type": "github",
|
4075 |
+
"url": "https://github.com/sponsors/jimmywarting"
|
4076 |
+
},
|
4077 |
+
{
|
4078 |
+
"type": "github",
|
4079 |
+
"url": "https://paypal.me/jimmywarting"
|
4080 |
+
}
|
4081 |
+
],
|
4082 |
+
"optional": true,
|
4083 |
+
"engines": {
|
4084 |
+
"node": ">=10.5.0"
|
4085 |
+
}
|
4086 |
+
},
|
4087 |
+
"node_modules/node-fetch": {
|
4088 |
+
"version": "2.7.0",
|
4089 |
+
"resolved": "https://registry.npmjs.org/node-fetch/-/node-fetch-2.7.0.tgz",
|
4090 |
+
"integrity": "sha512-c4FRfUm/dbcWZ7U+1Wq0AwCyFL+3nt2bEw05wfxSz+DWpWsitgmSgYmy2dQdWyKC1694ELPqMs/YzUSNozLt8A==",
|
4091 |
+
"optional": true,
|
4092 |
+
"dependencies": {
|
4093 |
+
"whatwg-url": "^5.0.0"
|
4094 |
+
},
|
4095 |
+
"engines": {
|
4096 |
+
"node": "4.x || >=6.0.0"
|
4097 |
+
},
|
4098 |
+
"peerDependencies": {
|
4099 |
+
"encoding": "^0.1.0"
|
4100 |
+
},
|
4101 |
+
"peerDependenciesMeta": {
|
4102 |
+
"encoding": {
|
4103 |
+
"optional": true
|
4104 |
+
}
|
4105 |
+
}
|
4106 |
+
},
|
4107 |
+
"node_modules/node-fetch/node_modules/tr46": {
|
4108 |
+
"version": "0.0.3",
|
4109 |
+
"resolved": "https://registry.npmjs.org/tr46/-/tr46-0.0.3.tgz",
|
4110 |
+
"integrity": "sha512-N3WMsuqV66lT30CrXNbEjx4GEwlow3v6rr4mCcv6prnfwhS01rkgyFdjPNBYd9br7LpXV1+Emh01fHnq2Gdgrw==",
|
4111 |
+
"optional": true
|
4112 |
+
},
|
4113 |
+
"node_modules/node-fetch/node_modules/webidl-conversions": {
|
4114 |
+
"version": "3.0.1",
|
4115 |
+
"resolved": "https://registry.npmjs.org/webidl-conversions/-/webidl-conversions-3.0.1.tgz",
|
4116 |
+
"integrity": "sha512-2JAn3z8AR6rjK8Sm8orRC0h/bcl/DqL7tRPdGZ4I1CjdF+EaMLmYxBHyXuKL849eucPFhvBoxMsflfOb8kxaeQ==",
|
4117 |
+
"optional": true
|
4118 |
+
},
|
4119 |
+
"node_modules/node-fetch/node_modules/whatwg-url": {
|
4120 |
+
"version": "5.0.0",
|
4121 |
+
"resolved": "https://registry.npmjs.org/whatwg-url/-/whatwg-url-5.0.0.tgz",
|
4122 |
+
"integrity": "sha512-saE57nupxk6v3HY35+jzBwYa0rKSy0XR8JSxZPwgLr7ys0IBzhGviA1/TUGJLmSVqs8pb9AnvICXEuOHLprYTw==",
|
4123 |
+
"optional": true,
|
4124 |
+
"dependencies": {
|
4125 |
+
"tr46": "~0.0.3",
|
4126 |
+
"webidl-conversions": "^3.0.0"
|
4127 |
+
}
|
4128 |
+
},
|
4129 |
"node_modules/node-gyp-build": {
|
4130 |
"version": "4.6.1",
|
4131 |
"resolved": "https://registry.npmjs.org/node-gyp-build/-/node-gyp-build-4.6.1.tgz",
|
|
|
4276 |
"platform": "^1.3.6"
|
4277 |
}
|
4278 |
},
|
4279 |
+
"node_modules/openai": {
|
4280 |
+
"version": "4.14.2",
|
4281 |
+
"resolved": "https://registry.npmjs.org/openai/-/openai-4.14.2.tgz",
|
4282 |
+
"integrity": "sha512-JGlm7mMC7J+cyQZnQMOH7daD9cBqqWqLtlBsejElEkgoehPrYfdyxSxIGICz5xk4YimbwI5FlLATSVojLtCKXQ==",
|
4283 |
+
"optional": true,
|
4284 |
+
"dependencies": {
|
4285 |
+
"@types/node": "^18.11.18",
|
4286 |
+
"@types/node-fetch": "^2.6.4",
|
4287 |
+
"abort-controller": "^3.0.0",
|
4288 |
+
"agentkeepalive": "^4.2.1",
|
4289 |
+
"digest-fetch": "^1.3.0",
|
4290 |
+
"form-data-encoder": "1.7.2",
|
4291 |
+
"formdata-node": "^4.3.2",
|
4292 |
+
"node-fetch": "^2.6.7",
|
4293 |
+
"web-streams-polyfill": "^3.2.1"
|
4294 |
+
},
|
4295 |
+
"bin": {
|
4296 |
+
"openai": "bin/cli"
|
4297 |
+
}
|
4298 |
+
},
|
4299 |
+
"node_modules/openai/node_modules/web-streams-polyfill": {
|
4300 |
+
"version": "3.2.1",
|
4301 |
+
"resolved": "https://registry.npmjs.org/web-streams-polyfill/-/web-streams-polyfill-3.2.1.tgz",
|
4302 |
+
"integrity": "sha512-e0MO3wdXWKrLbL0DgGnUV7WHVuw9OUvL4hjgnPkIeEvESk74gAITi5G606JtZPp39cd8HA9VQzCIvA49LpPN5Q==",
|
4303 |
+
"optional": true,
|
4304 |
+
"engines": {
|
4305 |
+
"node": ">= 8"
|
4306 |
+
}
|
4307 |
+
},
|
4308 |
"node_modules/openid-client": {
|
4309 |
"version": "5.4.2",
|
4310 |
"resolved": "https://registry.npmjs.org/openid-client/-/openid-client-5.4.2.tgz",
|
|
|
6476 |
"node": ">=14"
|
6477 |
}
|
6478 |
},
|
6479 |
+
"node_modules/web-streams-polyfill": {
|
6480 |
+
"version": "4.0.0-beta.3",
|
6481 |
+
"resolved": "https://registry.npmjs.org/web-streams-polyfill/-/web-streams-polyfill-4.0.0-beta.3.tgz",
|
6482 |
+
"integrity": "sha512-QW95TCTaHmsYfHDybGMwO5IJIM93I/6vTRk+daHTWFPhwh+C8Cg7j7XyKrwrj8Ib6vYXe0ocYNrmzY4xAAN6ug==",
|
6483 |
+
"optional": true,
|
6484 |
+
"engines": {
|
6485 |
+
"node": ">= 14"
|
6486 |
+
}
|
6487 |
+
},
|
6488 |
"node_modules/webidl-conversions": {
|
6489 |
"version": "7.0.0",
|
6490 |
"resolved": "https://registry.npmjs.org/webidl-conversions/-/webidl-conversions-7.0.0.tgz",
|
@@ -48,7 +48,6 @@
|
|
48 |
"@huggingface/inference": "^2.6.3",
|
49 |
"@xenova/transformers": "^2.6.0",
|
50 |
"autoprefixer": "^10.4.14",
|
51 |
-
"aws4fetch": "^1.0.17",
|
52 |
"date-fns": "^2.29.3",
|
53 |
"dotenv": "^16.0.3",
|
54 |
"handlebars": "^4.7.8",
|
@@ -64,5 +63,9 @@
|
|
64 |
"tailwind-scrollbar": "^3.0.0",
|
65 |
"tailwindcss": "^3.3.1",
|
66 |
"zod": "^3.22.3"
|
67 |
}
|
68 |
}
|
|
|
48 |
"@huggingface/inference": "^2.6.3",
|
49 |
"@xenova/transformers": "^2.6.0",
|
50 |
"autoprefixer": "^10.4.14",
|
|
|
51 |
"date-fns": "^2.29.3",
|
52 |
"dotenv": "^16.0.3",
|
53 |
"handlebars": "^4.7.8",
|
|
|
63 |
"tailwind-scrollbar": "^3.0.0",
|
64 |
"tailwindcss": "^3.3.1",
|
65 |
"zod": "^3.22.3"
|
66 |
+
},
|
67 |
+
"optionalDependencies": {
|
68 |
+
"aws4fetch": "^1.0.17",
|
69 |
+
"openai": "^4.14.2"
|
70 |
}
|
71 |
}
|
@@ -0,0 +1,64 @@
|
|
1 |
+
import { buildPrompt } from "$lib/buildPrompt";
|
2 |
+
import { textGenerationStream } from "@huggingface/inference";
|
3 |
+
import { z } from "zod";
|
4 |
+
import type { Endpoint } from "../endpoints";
|
5 |
+
|
6 |
+
export const endpointAwsParametersSchema = z.object({
|
7 |
+
weight: z.number().int().positive().default(1),
|
8 |
+
model: z.any(),
|
9 |
+
type: z.literal("aws"),
|
10 |
+
url: z.string().url(),
|
11 |
+
accessKey: z.string().min(1),
|
12 |
+
secretKey: z.string().min(1),
|
13 |
+
sessionToken: z.string().optional(),
|
14 |
+
service: z.union([z.literal("sagemaker"), z.literal("lambda")]).default("sagemaker"),
|
15 |
+
region: z.string().optional(),
|
16 |
+
});
|
17 |
+
|
18 |
+
export async function endpointAws({
|
19 |
+
url,
|
20 |
+
accessKey,
|
21 |
+
secretKey,
|
22 |
+
sessionToken,
|
23 |
+
model,
|
24 |
+
region,
|
25 |
+
service,
|
26 |
+
}: z.infer<typeof endpointAwsParametersSchema>): Promise<Endpoint> {
|
27 |
+
let AwsClient;
|
28 |
+
try {
|
29 |
+
AwsClient = (await import("aws4fetch")).AwsClient;
|
30 |
+
} catch (e) {
|
31 |
+
throw new Error("Failed to import aws4fetch");
|
32 |
+
}
|
33 |
+
|
34 |
+
const aws = new AwsClient({
|
35 |
+
accessKeyId: accessKey,
|
36 |
+
secretAccessKey: secretKey,
|
37 |
+
sessionToken,
|
38 |
+
service,
|
39 |
+
region,
|
40 |
+
});
|
41 |
+
|
42 |
+
return async ({ conversation }) => {
|
43 |
+
const prompt = await buildPrompt({
|
44 |
+
messages: conversation.messages,
|
45 |
+
webSearch: conversation.messages[conversation.messages.length - 1].webSearch,
|
46 |
+
preprompt: conversation.preprompt,
|
47 |
+
model,
|
48 |
+
});
|
49 |
+
|
50 |
+
return textGenerationStream(
|
51 |
+
{
|
52 |
+
parameters: { ...model.parameters, return_full_text: false },
|
53 |
+
model: url,
|
54 |
+
inputs: prompt,
|
55 |
+
},
|
56 |
+
{
|
57 |
+
use_cache: false,
|
58 |
+
fetch: aws.fetch.bind(aws) as typeof fetch,
|
59 |
+
}
|
60 |
+
);
|
61 |
+
};
|
62 |
+
}
|
63 |
+
|
64 |
+
export default endpointAws;
|
@@ -0,0 +1,42 @@
|
|
1 |
+
import type { Conversation } from "$lib/types/Conversation";
|
2 |
+
import type { TextGenerationStreamOutput } from "@huggingface/inference";
|
3 |
+
import { endpointTgi, endpointTgiParametersSchema } from "./tgi/endpointTgi";
|
4 |
+
import { z } from "zod";
|
5 |
+
import endpointAws, { endpointAwsParametersSchema } from "./aws/endpointAws";
|
6 |
+
import { endpointOAIParametersSchema, endpointOai } from "./openai/endpointOai";
|
7 |
+
import endpointLlamacpp, { endpointLlamacppParametersSchema } from "./llamacpp/endpointLlamacpp";
|
8 |
+
|
9 |
+
// parameters passed when generating text
|
10 |
+
interface EndpointParameters {
|
11 |
+
conversation: {
|
12 |
+
messages: Omit<Conversation["messages"][0], "id">[];
|
13 |
+
preprompt?: Conversation["preprompt"];
|
14 |
+
};
|
15 |
+
}
|
16 |
+
|
17 |
+
interface CommonEndpoint {
|
18 |
+
weight: number;
|
19 |
+
}
|
20 |
+
// type signature for the endpoint
|
21 |
+
export type Endpoint = (
|
22 |
+
params: EndpointParameters
|
23 |
+
) => Promise<AsyncGenerator<TextGenerationStreamOutput, void, void>>;
|
24 |
+
|
25 |
+
// generator function that takes the parameters defining an endpoint and returns the endpoint
|
26 |
+
export type EndpointGenerator<T extends CommonEndpoint> = (parameters: T) => Endpoint;
|
27 |
+
|
28 |
+
// list of all endpoint generators
|
29 |
+
export const endpoints = {
|
30 |
+
tgi: endpointTgi,
|
31 |
+
sagemaker: endpointAws,
|
32 |
+
openai: endpointOai,
|
33 |
+
llamacpp: endpointLlamacpp,
|
34 |
+
};
|
35 |
+
|
36 |
+
export const endpointSchema = z.discriminatedUnion("type", [
|
37 |
+
endpointAwsParametersSchema,
|
38 |
+
endpointOAIParametersSchema,
|
39 |
+
endpointTgiParametersSchema,
|
40 |
+
endpointLlamacppParametersSchema,
|
41 |
+
]);
|
42 |
+
export default endpoints;
|
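Note on the registry above: `endpointSchema` is a discriminated union on `type`, so a parsed config can be narrowed on that field before a generator is chosen. The following is a minimal sketch of that narrowing in plain TypeScript, without zod. The config shapes and `describeEndpoint` are hypothetical stand-ins for illustration and are not part of this commit.

```typescript
// Hypothetical, trimmed-down config shapes; the real ones are inferred from the zod schemas.
type TgiConfig = { type: "tgi"; url: string; weight: number };
type AwsConfig = { type: "aws"; url: string; weight: number; service: "sagemaker" | "lambda" };
type OpenAiConfig = { type: "openai"; baseURL: string; weight: number };
type LlamacppConfig = { type: "llamacpp"; url: string; weight: number };

type EndpointConfig = TgiConfig | AwsConfig | OpenAiConfig | LlamacppConfig;

// Narrow on the `type` discriminant. The `never` assignment makes the compiler
// flag any variant that is added to the union but not handled in the switch.
function describeEndpoint(config: EndpointConfig): string {
	switch (config.type) {
		case "tgi":
			return `tgi at ${config.url}`;
		case "aws":
			return `aws (${config.service}) at ${config.url}`;
		case "openai":
			return `openai at ${config.baseURL}`;
		case "llamacpp":
			return `llamacpp at ${config.url}`;
		default: {
			const unreachable: never = config;
			throw new Error(`Unknown endpoint type: ${JSON.stringify(unreachable)}`);
		}
	}
}
```

A `switch` with an exhaustiveness check is one possible alternative to an if/else chain with a fallback branch; whether the legacy fallback to TGI should be kept is a separate design decision.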
@@ -0,0 +1,100 @@
|
|
1 |
+
import { HF_ACCESS_TOKEN } from "$env/static/private";
|
2 |
+
import { buildPrompt } from "$lib/buildPrompt";
|
3 |
+
import type { TextGenerationStreamOutput } from "@huggingface/inference";
|
4 |
+
import type { Endpoint } from "../endpoints";
|
5 |
+
import { z } from "zod";
|
6 |
+
|
7 |
+
export const endpointLlamacppParametersSchema = z.object({
|
8 |
+
weight: z.number().int().positive().default(1),
|
9 |
+
model: z.any(),
|
10 |
+
type: z.literal("llamacpp"),
|
11 |
+
url: z.string().url(),
|
12 |
+
accessToken: z.string().min(1).default(HF_ACCESS_TOKEN),
|
13 |
+
});
|
14 |
+
|
15 |
+
export function endpointLlamacpp({
|
16 |
+
url,
|
17 |
+
model,
|
18 |
+
}: z.infer<typeof endpointLlamacppParametersSchema>): Endpoint {
|
19 |
+
return async ({ conversation }) => {
|
20 |
+
const prompt = await buildPrompt({
|
21 |
+
messages: conversation.messages,
|
22 |
+
webSearch: conversation.messages[conversation.messages.length - 1].webSearch,
|
23 |
+
preprompt: conversation.preprompt,
|
24 |
+
model,
|
25 |
+
});
|
26 |
+
|
27 |
+
const r = await fetch(`${url}/completion`, {
|
28 |
+
method: "POST",
|
29 |
+
headers: {
|
30 |
+
"Content-Type": "application/json",
|
31 |
+
},
|
32 |
+
body: JSON.stringify({
|
33 |
+
prompt,
|
34 |
+
stream: true,
|
35 |
+
temperature: model.parameters.temperature,
|
36 |
+
top_p: model.parameters.top_p,
|
37 |
+
top_k: model.parameters.top_k,
|
38 |
+
stop: model.parameters.stop,
|
39 |
+
repeat_penalty: model.parameters.repetition_penalty,
|
40 |
+
n_predict: model.parameters.max_new_tokens,
|
41 |
+
}),
|
42 |
+
});
|
43 |
+
|
44 |
+
if (!r.ok) {
|
45 |
+
throw new Error(`Failed to generate text: ${await r.text()}`);
|
46 |
+
}
|
47 |
+
|
48 |
+
const encoder = new TextDecoderStream();
|
49 |
+
const reader = r.body?.pipeThrough(encoder).getReader();
|
50 |
+
|
51 |
+
return (async function* () {
|
52 |
+
let stop = false;
|
53 |
+
let generatedText = "";
|
54 |
+
let tokenId = 0;
|
55 |
+
while (!stop) {
|
56 |
+
// read the next chunk from the stream
|
57 |
+
const out = (await reader?.read()) ?? { done: false, value: undefined };
|
58 |
+
// we read, if it's done we cancel
|
59 |
+
if (out.done) {
|
60 |
+
reader?.cancel();
|
61 |
+
return;
|
62 |
+
}
|
63 |
+
|
64 |
+
if (!out.value) {
|
65 |
+
return;
|
66 |
+
}
|
67 |
+
|
68 |
+
if (out.value.startsWith("data: ")) {
|
69 |
+
let data = null;
|
70 |
+
try {
|
71 |
+
data = JSON.parse(out.value.slice(6));
|
72 |
+
} catch (e) {
|
73 |
+
return;
|
74 |
+
}
|
75 |
+
if (data.content || data.stop) {
|
76 |
+
generatedText += data.content ?? "";
|
77 |
+
const output: TextGenerationStreamOutput = {
|
78 |
+
token: {
|
79 |
+
id: tokenId++,
|
80 |
+
text: data.content ?? "",
|
81 |
+
logprob: 0,
|
82 |
+
special: false,
|
83 |
+
},
|
84 |
+
generated_text: data.stop ? generatedText : null,
|
85 |
+
details: null,
|
86 |
+
};
|
87 |
+
if (data.stop) {
|
88 |
+
stop = true;
|
89 |
+
reader?.cancel();
|
90 |
+
}
|
91 |
+
yield output;
|
92 |
+
// take the data.content value and yield it
|
93 |
+
}
|
94 |
+
}
|
95 |
+
}
|
96 |
+
})();
|
97 |
+
};
|
98 |
+
}
|
99 |
+
|
100 |
+
export default endpointLlamacpp;
|
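Note on the stream loop above: it treats every decoded chunk as exactly one `data: ` event. If a chunk carries several events, or an event is split across two chunks, the loop as written would drop or fail to parse them. The following is a minimal sketch of a line-buffered parser, assuming the server emits newline-terminated `data: {json}` lines. `LlamacppEvent` and `parseSseChunk` are hypothetical names, not part of this commit.

```typescript
// Hypothetical shape of one stream event; only the fields the diff reads.
interface LlamacppEvent {
	content?: string;
	stop?: boolean;
}

// Feed one decoded text chunk into the parser. Returns the events that are
// complete so far, plus the unconsumed tail to prepend to the next chunk.
function parseSseChunk(buffer: string, chunk: string): { events: LlamacppEvent[]; rest: string } {
	const lines = (buffer + chunk).split("\n");
	// The last element is either "" (input ended on a newline) or a partial line.
	const rest = lines.pop() ?? "";
	const events: LlamacppEvent[] = [];
	for (const line of lines) {
		// Skip blank separator lines and any non-data fields.
		if (!line.startsWith("data: ")) continue;
		try {
			events.push(JSON.parse(line.slice("data: ".length)) as LlamacppEvent);
		} catch {
			// Skip a malformed payload instead of ending the whole stream.
		}
	}
	return { events, rest };
}
```

The caller would keep `rest` between reads and pass it back in as `buffer`, yielding one output per returned event.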
@@ -0,0 +1,82 @@
|
|
1 |
+
import { z } from "zod";
|
2 |
+
import { openAICompletionToTextGenerationStream } from "./openAICompletionToTextGenerationStream";
|
3 |
+
import { openAIChatToTextGenerationStream } from "./openAIChatToTextGenerationStream";
|
4 |
+
import { buildPrompt } from "$lib/buildPrompt";
|
5 |
+
import { OPENAI_API_KEY } from "$env/static/private";
|
6 |
+
import type { Endpoint } from "../endpoints";
|
7 |
+
|
8 |
+
export const endpointOAIParametersSchema = z.object({
|
9 |
+
weight: z.number().int().positive().default(1),
|
10 |
+
model: z.any(),
|
11 |
+
type: z.literal("openai"),
|
12 |
+
baseURL: z.string().url().default("https://api.openai.com/v1"),
|
13 |
+
apiKey: z.string().default(OPENAI_API_KEY ?? "sk-"),
|
14 |
+
completion: z
|
15 |
+
.union([z.literal("completions"), z.literal("chat_completions")])
|
16 |
+
.default("chat_completions"),
|
17 |
+
});
|
18 |
+
|
19 |
+
export async function endpointOai({
|
20 |
+
baseURL,
|
21 |
+
apiKey,
|
22 |
+
completion,
|
23 |
+
model,
|
24 |
+
}: z.infer<typeof endpointOAIParametersSchema>): Promise<Endpoint> {
|
25 |
+
let OpenAI;
|
26 |
+
try {
|
27 |
+
OpenAI = (await import("openai")).OpenAI;
|
28 |
+
} catch (e) {
|
29 |
+
throw new Error("Failed to import OpenAI", { cause: e });
|
30 |
+
}
|
31 |
+
|
32 |
+
const openai = new OpenAI({
|
33 |
+
apiKey: apiKey ?? "sk-",
|
34 |
+
baseURL: baseURL,
|
35 |
+
});
|
36 |
+
|
37 |
+
if (completion === "completions") {
|
38 |
+
return async ({ conversation }) => {
|
39 |
+
return openAICompletionToTextGenerationStream(
|
40 |
+
await openai.completions.create({
|
41 |
+
model: model.id ?? model.name,
|
42 |
+
prompt: await buildPrompt({
|
43 |
+
messages: conversation.messages,
|
44 |
+
webSearch: conversation.messages[conversation.messages.length - 1].webSearch,
|
45 |
+
preprompt: conversation.preprompt,
|
46 |
+
model,
|
47 |
+
}),
|
48 |
+
stream: true,
|
49 |
+
max_tokens: model.parameters?.max_new_tokens,
|
50 |
+
stop: model.parameters?.stop,
|
51 |
+
temperature: model.parameters?.temperature,
|
52 |
+
top_p: model.parameters?.top_p,
|
53 |
+
frequency_penalty: model.parameters?.repetition_penalty,
|
54 |
+
})
|
55 |
+
);
|
56 |
+
};
|
57 |
+
} else if (completion === "chat_completions") {
|
58 |
+
return async ({ conversation }) => {
|
59 |
+
const messages = conversation.messages.map((message) => ({
|
60 |
+
role: message.from,
|
61 |
+
content: message.content,
|
62 |
+
}));
|
63 |
+
|
64 |
+
return openAIChatToTextGenerationStream(
|
65 |
+
await openai.chat.completions.create({
|
66 |
+
model: model.id ?? model.name,
|
67 |
+
messages: conversation.preprompt
|
68 |
+
? [{ role: "system", content: conversation.preprompt }, ...messages]
|
69 |
+
: messages,
|
70 |
+
stream: true,
|
71 |
+
max_tokens: model.parameters?.max_new_tokens,
|
72 |
+
stop: model.parameters?.stop,
|
73 |
+
temperature: model.parameters?.temperature,
|
74 |
+
top_p: model.parameters?.top_p,
|
75 |
+
frequency_penalty: model.parameters?.repetition_penalty,
|
76 |
+
})
|
77 |
+
);
|
78 |
+
};
|
79 |
+
} else {
|
80 |
+
throw new Error("Invalid completion type");
|
81 |
+
}
|
82 |
+
}
|
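Note on the `chat_completions` branch above: stored messages are mapped from `{ from, content }` to `{ role, content }`, and the preprompt, when present, is prepended as a `system` message. The following is a minimal sketch of that mapping as a standalone function. `StoredMessage`, `ChatMessage` and `toChatMessages` are hypothetical names, and the shapes are reduced to the two fields the diff uses.

```typescript
// Hypothetical minimal shapes; the app's own message type carries more fields.
type Role = "system" | "user" | "assistant";

interface StoredMessage {
	from: Role;
	content: string;
}

interface ChatMessage {
	role: Role;
	content: string;
}

// Rename `from` to `role`, and put the preprompt first as a system message
// when one is configured (an empty or undefined preprompt adds nothing).
function toChatMessages(messages: StoredMessage[], preprompt?: string): ChatMessage[] {
	const mapped = messages.map(({ from, content }): ChatMessage => ({ role: from, content }));
	return preprompt ? [{ role: "system", content: preprompt }, ...mapped] : mapped;
}
```

Keeping the mapping in a pure function like this would make the system-prompt behaviour testable without calling the API.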
@@ -0,0 +1,32 @@
|
|
1 |
+
import type { TextGenerationStreamOutput } from "@huggingface/inference";
|
2 |
+
import type OpenAI from "openai";
|
3 |
+
import type { Stream } from "openai/streaming";
|
4 |
+
|
5 |
+
/**
|
6 |
+
* Transform a stream of OpenAI.Chat.Completions.ChatCompletionChunk into a stream of TextGenerationStreamOutput
|
7 |
+
*/
|
8 |
+
export async function* openAIChatToTextGenerationStream(
|
9 |
+
completionStream: Stream<OpenAI.Chat.Completions.ChatCompletionChunk>
|
10 |
+
) {
|
11 |
+
let generatedText = "";
|
12 |
+
let tokenId = 0;
|
13 |
+
for await (const completion of completionStream) {
|
14 |
+
const { choices } = completion;
|
15 |
+
const content = choices[0]?.delta?.content ?? "";
|
16 |
+
const last = choices[0]?.finish_reason === "stop";
|
17 |
+
if (content) {
|
18 |
+
generatedText = generatedText + content;
|
19 |
+
}
|
20 |
+
const output: TextGenerationStreamOutput = {
|
21 |
+
token: {
|
22 |
+
id: tokenId++,
|
23 |
+
text: content ?? "",
|
24 |
+
logprob: 0,
|
25 |
+
special: false,
|
26 |
+
},
|
27 |
+
generated_text: last ? generatedText : null,
|
28 |
+
details: null,
|
29 |
+
};
|
30 |
+
yield output;
|
31 |
+
}
|
32 |
+
}
|
@@ -0,0 +1,32 @@
|
|
1 |
+
import type { TextGenerationStreamOutput } from "@huggingface/inference";
|
2 |
+
import type OpenAI from "openai";
|
3 |
+
import type { Stream } from "openai/streaming";
|
4 |
+
|
5 |
+
/**
|
6 |
+
* Transform a stream of OpenAI.Completions.Completion into a stream of TextGenerationStreamOutput
|
7 |
+
*/
|
8 |
+
export async function* openAICompletionToTextGenerationStream(
|
9 |
+
completionStream: Stream<OpenAI.Completions.Completion>
|
10 |
+
) {
|
11 |
+
let generatedText = "";
|
12 |
+
let tokenId = 0;
|
13 |
+
for await (const completion of completionStream) {
|
14 |
+
const { choices } = completion;
|
15 |
+
const text = choices[0]?.text ?? "";
|
16 |
+
const last = choices[0]?.finish_reason === "stop";
|
17 |
+
if (text) {
|
18 |
+
generatedText = generatedText + text;
|
19 |
+
}
|
20 |
+
const output: TextGenerationStreamOutput = {
|
21 |
+
token: {
|
22 |
+
id: tokenId++,
|
23 |
+
text,
|
24 |
+
logprob: 0,
|
25 |
+
special: false,
|
26 |
+
},
|
27 |
+
generated_text: last ? generatedText : null,
|
28 |
+
details: null,
|
29 |
+
};
|
30 |
+
yield output;
|
31 |
+
}
|
32 |
+
}
|
@@ -0,0 +1,37 @@
|
|
1 |
+
import { HF_ACCESS_TOKEN } from "$env/static/private";
|
2 |
+
import { buildPrompt } from "$lib/buildPrompt";
|
3 |
+
import { textGenerationStream } from "@huggingface/inference";
|
4 |
+
import type { Endpoint } from "../endpoints";
|
5 |
+
import { z } from "zod";
|
6 |
+
|
7 |
+
export const endpointTgiParametersSchema = z.object({
|
8 |
+
weight: z.number().int().positive().default(1),
|
9 |
+
model: z.any(),
|
10 |
+
type: z.literal("tgi"),
|
11 |
+
url: z.string().url(),
|
12 |
+
accessToken: z.string().min(1).default(HF_ACCESS_TOKEN),
|
13 |
+
});
|
14 |
+
|
15 |
+
export function endpointTgi({
|
16 |
+
url,
|
17 |
+
accessToken,
|
18 |
+
model,
|
19 |
+
}: z.infer<typeof endpointTgiParametersSchema>): Endpoint {
|
20 |
+
return async ({ conversation }) => {
|
21 |
+
const prompt = await buildPrompt({
|
22 |
+
messages: conversation.messages,
|
23 |
+
webSearch: conversation.messages[conversation.messages.length - 1].webSearch,
|
24 |
+
preprompt: conversation.preprompt,
|
25 |
+
model,
|
26 |
+
});
|
27 |
+
|
28 |
+
return textGenerationStream({
|
29 |
+
parameters: { ...model.parameters, return_full_text: false },
|
30 |
+
model: url,
|
31 |
+
inputs: prompt,
|
32 |
+
accessToken,
|
33 |
+
});
|
34 |
+
};
|
35 |
+
}
|
36 |
+
|
37 |
+
export default endpointTgi;
|
@@ -1,110 +1,28 @@
|
|
1 |
import { smallModel } from "$lib/server/models";
|
2 |
-
import {
|
3 |
-
|
4 |
-
|
5 |
-
|
6 |
-
|
7 |
-
|
8 |
-
|
9 |
-
|
10 |
-
|
11 |
-
|
12 |
-
|
13 |
-
}
|
14 |
-
|
15 |
-
|
16 |
-
|
17 |
-
)
|
18 |
-
|
19 |
-
|
20 |
-
|
21 |
-
|
22 |
-
|
23 |
-
|
24 |
-
|
25 |
-
const randomEndpoint = modelEndpoint(smallModel);
|
26 |
-
|
27 |
-
const abortController = new AbortController();
|
28 |
-
|
29 |
-
let resp: Response;
|
30 |
-
|
31 |
-
if (randomEndpoint.host === "sagemaker") {
|
32 |
-
const requestParams = JSON.stringify({
|
33 |
-
parameters: newParameters,
|
34 |
-
inputs: prompt,
|
35 |
-
});
|
36 |
-
|
37 |
-
const aws = new AwsClient({
|
38 |
-
accessKeyId: randomEndpoint.accessKey,
|
39 |
-
secretAccessKey: randomEndpoint.secretKey,
|
40 |
-
sessionToken: randomEndpoint.sessionToken,
|
41 |
-
service: "sagemaker",
|
42 |
-
});
|
43 |
-
|
44 |
-
resp = await aws.fetch(randomEndpoint.url, {
|
45 |
-
method: "POST",
|
46 |
-
body: requestParams,
|
47 |
-
signal: abortController.signal,
|
48 |
-
headers: {
|
49 |
-
"Content-Type": "application/json",
|
50 |
-
},
|
51 |
-
});
|
52 |
-
} else {
|
53 |
-
resp = await fetch(randomEndpoint.url, {
|
54 |
-
headers: {
|
55 |
-
"Content-Type": "application/json",
|
56 |
-
Authorization: randomEndpoint.authorization,
|
57 |
-
},
|
58 |
-
method: "POST",
|
59 |
-
body: JSON.stringify({
|
60 |
-
parameters: newParameters,
|
61 |
-
inputs: prompt,
|
62 |
-
}),
|
63 |
-
signal: abortController.signal,
|
64 |
-
});
|
65 |
-
}
|
66 |
-
|
67 |
-
if (!resp.ok) {
|
68 |
-
throw new Error(await resp.text());
|
69 |
-
}
|
70 |
-
|
71 |
-
if (!resp.body) {
|
72 |
-
throw new Error("Body is empty");
|
73 |
-
}
|
74 |
-
|
75 |
-
const decoder = new TextDecoder();
|
76 |
-
const reader = resp.body.getReader();
|
77 |
-
|
78 |
-
let isDone = false;
|
79 |
-
let result = "";
|
80 |
-
|
81 |
-
while (!isDone) {
|
82 |
-
const { done, value } = await reader.read();
|
83 |
-
|
84 |
-
isDone = done;
|
85 |
-
result += decoder.decode(value, { stream: true }); // Convert current chunk to text
|
86 |
-
}
|
87 |
-
|
88 |
-
// Close the reader when done
|
89 |
-
reader.releaseLock();
|
90 |
-
|
91 |
-
let results;
|
92 |
-
if (result.startsWith("data:")) {
|
93 |
-
results = [JSON.parse(result.split("data:")?.pop() ?? "")];
|
94 |
-
} else {
|
95 |
-
results = JSON.parse(result);
|
96 |
-
}
|
97 |
-
|
98 |
-
let generated_text = trimSuffix(
|
99 |
-
trimPrefix(trimPrefix(results[0].generated_text, "<|startoftext|>"), prompt),
|
100 |
-
PUBLIC_SEP_TOKEN
|
101 |
-
).trimEnd();
|
102 |
-
|
103 |
-
for (const stop of [...(newParameters?.stop ?? []), "<|endoftext|>"]) {
|
104 |
-
if (generated_text.endsWith(stop)) {
|
105 |
-
generated_text = generated_text.slice(0, -stop.length).trimEnd();
|
106 |
}
|
107 |
}
|
108 |
-
|
109 |
-
return generated_text;
|
110 |
}
|
|
|
1 |
import { smallModel } from "$lib/server/models";
|
2 |
+
import type { Conversation } from "$lib/types/Conversation";
|
3 |
+
|
4 |
+
export async function generateFromDefaultEndpoint({
|
5 |
+
messages,
|
6 |
+
preprompt,
|
7 |
+
}: {
|
8 |
+
messages: Omit<Conversation["messages"][0], "id">[];
|
9 |
+
preprompt?: string;
|
10 |
+
}): Promise<string> {
|
11 |
+
const endpoint = await smallModel.getEndpoint();
|
12 |
+
|
13 |
+
const tokenStream = await endpoint({ conversation: { messages, preprompt } });
|
14 |
+
|
15 |
+
for await (const output of tokenStream) {
|
16 |
+
// if generated_text is not set yet, the generation is not done
|
17 |
+
if (output.generated_text) {
|
18 |
+
let generated_text = output.generated_text;
|
19 |
+
for (const stop of [...(smallModel.parameters?.stop ?? []), "<|endoftext|>"]) {
|
20 |
+
if (generated_text.endsWith(stop)) {
|
21 |
+
generated_text = generated_text.slice(0, -stop.length).trimEnd();
|
22 |
+
}
|
23 |
+
}
|
24 |
+
return generated_text;
|
25 |
}
|
26 |
}
|
27 |
+
throw new Error("Generation failed");
|
|
|
28 |
}
|
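Note on the stop-sequence loop in the new `generateFromDefaultEndpoint`: each configured stop string, followed by `<|endoftext|>`, is stripped from the end of the final text. The following is a minimal standalone sketch of that step. `stripStopSequences` is a hypothetical helper name. The guard against an empty stop string is an addition not present in the diff: `"".length` is 0, so `slice(0, -0)` would return an empty string and discard the whole text.

```typescript
// Strip each trailing stop sequence in order, then the end-of-text marker.
function stripStopSequences(text: string, stops: string[]): string {
	let result = text;
	for (const stop of [...stops, "<|endoftext|>"]) {
		// Every string ends with "", and slice(0, -0) is slice(0, 0),
		// so an empty stop string has to be skipped explicitly.
		if (stop.length > 0 && result.endsWith(stop)) {
			result = result.slice(0, -stop.length).trimEnd();
		}
	}
	return result;
}
```

For example, `stripStopSequences("Hello world </s>", ["</s>"])` removes the marker and the trailing space before it.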
@@ -1,50 +0,0 @@
|
|
1 |
-
import {
|
2 |
-
HF_ACCESS_TOKEN,
|
3 |
-
HF_API_ROOT,
|
4 |
-
USE_CLIENT_CERTIFICATE,
|
5 |
-
CERT_PATH,
|
6 |
-
KEY_PATH,
|
7 |
-
CA_PATH,
|
8 |
-
CLIENT_KEY_PASSWORD,
|
9 |
-
REJECT_UNAUTHORIZED,
|
10 |
-
} from "$env/static/private";
|
11 |
-
import { sum } from "$lib/utils/sum";
|
12 |
-
import type { BackendModel, Endpoint } from "./models";
|
13 |
-
|
14 |
-
import { loadClientCertificates } from "$lib/utils/loadClientCerts";
|
15 |
-
|
16 |
-
if (USE_CLIENT_CERTIFICATE === "true") {
|
17 |
-
loadClientCertificates(
|
18 |
-
CERT_PATH,
|
19 |
-
KEY_PATH,
|
20 |
-
CA_PATH,
|
21 |
-
CLIENT_KEY_PASSWORD,
|
22 |
-
REJECT_UNAUTHORIZED === "true"
|
23 |
-
);
|
24 |
-
}
|
25 |
-
|
26 |
-
/**
|
27 |
-
* Find a random load-balanced endpoint
|
28 |
-
*/
|
29 |
-
export function modelEndpoint(model: BackendModel): Endpoint {
|
30 |
-
if (!model.endpoints) {
|
31 |
-
return {
|
32 |
-
host: "tgi",
|
33 |
-
url: `${HF_API_ROOT}/${model.name}`,
|
34 |
-
authorization: `Bearer ${HF_ACCESS_TOKEN}`,
|
35 |
-
weight: 1,
|
36 |
-
};
|
37 |
-
}
|
38 |
-
const endpoints = model.endpoints;
|
39 |
-
const totalWeight = sum(endpoints.map((e) => e.weight));
|
40 |
-
|
41 |
-
let random = Math.random() * totalWeight;
|
42 |
-
for (const endpoint of endpoints) {
|
43 |
-
if (random < endpoint.weight) {
|
44 |
-
return endpoint;
|
45 |
-
}
|
46 |
-
random -= endpoint.weight;
|
47 |
-
}
|
48 |
-
|
49 |
-
throw new Error("Invalid config, no endpoint found");
|
50 |
-
}
|
@@ -1,42 +1,13 @@
|
|
1 |
-
import { HF_ACCESS_TOKEN, MODELS, OLD_MODELS, TASK_MODEL } from "$env/static/private";
|
2 |
import type { ChatTemplateInput } from "$lib/types/Template";
|
3 |
import { compileTemplate } from "$lib/utils/template";
|
4 |
import { z } from "zod";
|
|
|
|
|
|
|
5 |
|
6 |
type Optional<T, K extends keyof T> = Pick<Partial<T>, K> & Omit<T, K>;
|
7 |
|
8 |
-
const sagemakerEndpoint = z.object({
|
9 |
-
host: z.literal("sagemaker"),
|
10 |
-
url: z.string().url(),
|
11 |
-
accessKey: z.string().min(1),
|
12 |
-
secretKey: z.string().min(1),
|
13 |
-
sessionToken: z.string().optional(),
|
14 |
-
});
|
15 |
-
|
16 |
-
const tgiEndpoint = z.object({
|
17 |
-
host: z.union([z.literal("tgi"), z.undefined()]),
|
18 |
-
url: z.string().url(),
|
19 |
-
authorization: z.string().min(1).default(`Bearer ${HF_ACCESS_TOKEN}`),
|
20 |
-
});
|
21 |
-
|
22 |
-
const commonEndpoint = z.object({
|
23 |
-
weight: z.number().int().positive().default(1),
|
24 |
-
});
|
25 |
-
|
26 |
-
const endpoint = z.lazy(() =>
|
27 |
-
z.union([sagemakerEndpoint.merge(commonEndpoint), tgiEndpoint.merge(commonEndpoint)])
|
28 |
-
);
|
29 |
-
|
30 |
-
const combinedEndpoint = endpoint.transform((data) => {
|
31 |
-
if (data.host === "tgi" || data.host === undefined) {
|
32 |
-
return tgiEndpoint.merge(commonEndpoint).parse(data);
|
33 |
-
} else if (data.host === "sagemaker") {
|
34 |
-
return sagemakerEndpoint.merge(commonEndpoint).parse(data);
|
35 |
-
} else {
|
36 |
-
throw new Error(`Invalid host: ${data.host}`);
|
37 |
-
}
|
38 |
-
});
|
39 |
-
|
40 |
const modelConfig = z.object({
|
41 |
/** Used as an identifier in DB */
|
42 |
id: z.string().optional(),
|
@@ -73,13 +44,16 @@ const modelConfig = z.object({
|
|
73 |
})
|
74 |
)
|
75 |
.optional(),
|
76 |
-
endpoints: z.array(
|
77 |
parameters: z
|
78 |
.object({
|
79 |
temperature: z.number().min(0).max(1),
|
80 |
truncate: z.number().int().positive(),
|
81 |
max_new_tokens: z.number().int().positive(),
|
82 |
stop: z.array(z.string()).optional(),
|
|
|
|
|
|
|
83 |
})
|
84 |
.passthrough()
|
85 |
.optional(),
|
@@ -98,7 +72,48 @@ const processModel = async (m: z.infer<typeof modelConfig>) => ({
|
|
98 |
parameters: { ...m.parameters, stop_sequences: m.parameters?.stop },
|
99 |
});
|
100 |
|
101 |
-
|
102 |
|
103 |
// Models that have been deprecated
|
104 |
export const oldModels = OLD_MODELS
|
@@ -114,18 +129,19 @@ export const oldModels = OLD_MODELS
|
|
114 |
.map((m) => ({ ...m, id: m.id || m.name, displayName: m.displayName || m.name }))
|
115 |
: [];
|
116 |
|
117 |
-
export const defaultModel = models[0];
|
118 |
-
|
119 |
export const validateModel = (_models: BackendModel[]) => {
|
120 |
// Zod enum function requires 2 parameters
|
121 |
return z.enum([_models[0].id, ..._models.slice(1).map((m) => m.id)]);
|
122 |
};
|
123 |
|
124 |
// if `TASK_MODEL` is the name of a model we use it, else we try to parse `TASK_MODEL` as a model config itself
|
|
|
125 |
export const smallModel = TASK_MODEL
|
126 |
-
? models.find((m) => m.name === TASK_MODEL) ||
|
127 |
-
|
|
|
|
|
|
|
128 |
: defaultModel;
|
129 |
|
130 |
-
export type BackendModel = Optional<
|
131 |
-
export type Endpoint = z.infer<typeof endpoint>;
|
|
|
1 |
+
import { HF_ACCESS_TOKEN, HF_API_ROOT, MODELS, OLD_MODELS, TASK_MODEL } from "$env/static/private";
|
2 |
import type { ChatTemplateInput } from "$lib/types/Template";
|
3 |
import { compileTemplate } from "$lib/utils/template";
|
4 |
import { z } from "zod";
|
5 |
+
import endpoints, { endpointSchema, type Endpoint } from "./endpoints/endpoints";
|
6 |
+
import endpointTgi from "./endpoints/tgi/endpointTgi";
|
7 |
+
import { sum } from "$lib/utils/sum";
|
8 |
|
9 |
type Optional<T, K extends keyof T> = Pick<Partial<T>, K> & Omit<T, K>;
|
10 |
|
11 |
const modelConfig = z.object({
|
12 |
/** Used as an identifier in DB */
|
13 |
id: z.string().optional(),
|
|
|
44 |
})
|
45 |
)
|
46 |
.optional(),
|
47 |
+
endpoints: z.array(endpointSchema).optional(),
|
48 |
parameters: z
|
49 |
.object({
|
50 |
temperature: z.number().min(0).max(1),
|
51 |
truncate: z.number().int().positive(),
|
52 |
max_new_tokens: z.number().int().positive(),
|
53 |
stop: z.array(z.string()).optional(),
|
54 |
+
top_p: z.number().positive().optional(),
|
55 |
+
top_k: z.number().positive().optional(),
|
56 |
+
repetition_penalty: z.number().min(-2).max(2).optional(),
|
57 |
})
|
58 |
.passthrough()
|
59 |
.optional(),
|
|
|
72 |
parameters: { ...m.parameters, stop_sequences: m.parameters?.stop },
|
73 |
});
|
74 |
|
75 |
+
const addEndpoint = (m: Awaited<ReturnType<typeof processModel>>) => ({
|
76 |
+
...m,
|
77 |
+
getEndpoint: async (): Promise<Endpoint> => {
|
78 |
+
if (!m.endpoints) {
|
79 |
+
return endpointTgi({
|
80 |
+
type: "tgi",
|
81 |
+
url: `${HF_API_ROOT}/${m.name}`,
|
82 |
+
accessToken: HF_ACCESS_TOKEN,
|
83 |
+
weight: 1,
|
84 |
+
model: m,
|
85 |
+
});
|
86 |
+
}
|
87 |
+
const totalWeight = sum(m.endpoints.map((e) => e.weight));
|
88 |
+
|
89 |
+
let random = Math.random() * totalWeight;
|
90 |
+
|
91 |
+
for (const endpoint of m.endpoints) {
|
92 |
+
if (random < endpoint.weight) {
|
93 |
+
const args = { ...endpoint, model: m };
|
94 |
+
if (args.type === "tgi") {
|
95 |
+
return endpoints.tgi(args);
|
96 |
+
} else if (args.type === "aws") {
|
97 |
+
return await endpoints.sagemaker(args);
|
98 |
+
} else if (args.type === "openai") {
|
99 |
+
return await endpoints.openai(args);
|
100 |
+
} else if (args.type === "llamacpp") {
|
101 |
+
return await endpoints.llamacpp(args);
|
102 |
+
} else {
|
103 |
+
// for legacy reasons
|
104 |
+
return await endpoints.tgi(args);
|
105 |
+
}
|
106 |
+
}
|
107 |
+
random -= endpoint.weight;
|
108 |
+
}
|
109 |
+
|
110 |
+
throw new Error(`Failed to select endpoint`);
|
111 |
+
},
|
112 |
+
});
|
113 |
+
|
114 |
+
export const models = await Promise.all(modelsRaw.map((e) => processModel(e).then(addEndpoint)));
|
115 |
+
|
116 |
+
export const defaultModel = models[0];
|
117 |
|
118 |
// Models that have been deprecated
|
119 |
export const oldModels = OLD_MODELS
|
|
|
129 |
.map((m) => ({ ...m, id: m.id || m.name, displayName: m.displayName || m.name }))
|
130 |
: [];
|
131 |
|
|
|
|
|
132 |
export const validateModel = (_models: BackendModel[]) => {
|
133 |
// Zod enum function requires 2 parameters
|
134 |
return z.enum([_models[0].id, ..._models.slice(1).map((m) => m.id)]);
|
135 |
};
|
136 |
|
137 |
// if `TASK_MODEL` is the name of a model we use it, else we try to parse `TASK_MODEL` as a model config itself
|
138 |
+
|
139 |
export const smallModel = TASK_MODEL
|
140 |
+
? (models.find((m) => m.name === TASK_MODEL) ||
|
141 |
+
(await processModel(modelConfig.parse(JSON.parse(TASK_MODEL))).then((m) =>
|
142 |
+
addEndpoint(m)
|
143 |
+
))) ??
|
144 |
+
defaultModel
|
145 |
: defaultModel;
|
146 |
|
147 |
+
export type BackendModel = Optional<typeof defaultModel, "preprompt" | "parameters">;
|
|
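Note on `getEndpoint` above: the endpoint is chosen by weighted random selection, carried over from the removed `modelEndpoint`. A number is drawn in `[0, totalWeight)`, then each endpoint's weight is subtracted in turn until the draw falls inside one endpoint's share. The following is a minimal sketch of that selection on its own. `pickWeighted` is a hypothetical helper, and the injectable `rand` parameter is an assumption added here so the behaviour can be checked deterministically.

```typescript
// Pick one item with probability proportional to its weight.
function pickWeighted<T extends { weight: number }>(
	items: T[],
	rand: () => number = Math.random
): T {
	const totalWeight = items.reduce((acc, item) => acc + item.weight, 0);
	let random = rand() * totalWeight;
	for (const item of items) {
		if (random < item.weight) {
			return item;
		}
		// The draw lies beyond this item's share; move on to the next one.
		random -= item.weight;
	}
	// Reached only for an empty list (or non-positive weights).
	throw new Error("Failed to select endpoint");
}
```

With weights 1 and 3, the first item is returned for draws in the first quarter of the range and the second item otherwise.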
@@ -1,6 +1,5 @@
|
|
1 |
import { LLM_SUMMERIZATION } from "$env/static/private";
|
2 |
import { generateFromDefaultEndpoint } from "$lib/server/generateFromDefaultEndpoint";
|
3 |
-
import { smallModel } from "$lib/server/models";
|
4 |
import type { Message } from "$lib/types/Message";
|
5 |
|
6 |
export async function summarize(prompt: string) {
|
@@ -23,17 +22,13 @@ export async function summarize(prompt: string) {
|
|
23 |
{ from: "assistant", content: "🎥 Favorite movie" },
|
24 |
{ from: "user", content: "Explain the concept of artificial intelligence in one sentence" },
|
25 |
{ from: "assistant", content: "🤖 AI definition" },
|
26 |
-
{ from: "user", content: "Answer all my questions like chewbacca from now ok?" },
|
27 |
-
{ from: "assistant", content: "🐒 Answer as Chewbacca" },
|
28 |
{ from: "user", content: prompt },
|
29 |
];
|
30 |
|
31 |
-
|
32 |
messages,
|
33 |
preprompt: `You are a summarization AI. You'll never answer a user's question directly, but instead summarize the user's request into a single short sentence of four words or less. Always start your answer with an emoji relevant to the summary.`,
|
34 |
-
})
|
35 |
-
|
36 |
-
return await generateFromDefaultEndpoint(summaryPrompt)
|
37 |
.then((summary) => {
|
38 |
// add an emoji if none is found in the first three characters
|
39 |
if (!/\p{Emoji}/u.test(summary.slice(0, 3))) {
|
|
|
1 |
import { LLM_SUMMERIZATION } from "$env/static/private";
|
2 |
import { generateFromDefaultEndpoint } from "$lib/server/generateFromDefaultEndpoint";
|
|
|
3 |
import type { Message } from "$lib/types/Message";
|
4 |
|
5 |
export async function summarize(prompt: string) {
|
|
|
22 |
{ from: "assistant", content: "🎥 Favorite movie" },
|
23 |
{ from: "user", content: "Explain the concept of artificial intelligence in one sentence" },
|
24 |
{ from: "assistant", content: "🤖 AI definition" },
|
|
|
|
|
25 |
{ from: "user", content: prompt },
|
26 |
];
|
27 |
|
28 |
+
return await generateFromDefaultEndpoint({
|
29 |
messages,
|
30 |
preprompt: `You are a summarization AI. You'll never answer a user's question directly, but instead summarize the user's request into a single short sentence of four words or less. Always start your answer with an emoji relevant to the summary.`,
|
31 |
+
})
|
|
|
|
|
32 |
.then((summary) => {
|
33 |
// add an emoji if none is found in the first three characters
|
34 |
if (!/\p{Emoji}/u.test(summary.slice(0, 3))) {
|
@@ -1,7 +1,6 @@
|
|
1 |
import type { Message } from "$lib/types/Message";
|
2 |
import { format } from "date-fns";
|
3 |
import { generateFromDefaultEndpoint } from "../generateFromDefaultEndpoint";
|
4 |
-
import { smallModel } from "../models";
|
5 |
|
6 |
export async function generateQuery(messages: Message[]) {
|
7 |
const currentDate = format(new Date(), "MMMM d, yyyy");
|
@@ -62,10 +61,8 @@ Current Question: Where is it being hosted ?`,
|
|
62 |
},
|
63 |
];
|
64 |
|
65 |
-
|
66 |
-
preprompt: `You are tasked with generating web search queries. Give me an appropriate query to answer my question for google search. Answer with only the query. Today is ${currentDate}`,
|
67 |
messages: convQuery,
|
|
|
68 |
});
|
69 |
-
|
70 |
-
return await generateFromDefaultEndpoint(promptQuery);
|
71 |
}
|
|
|
1 |
import type { Message } from "$lib/types/Message";
|
2 |
import { format } from "date-fns";
|
3 |
import { generateFromDefaultEndpoint } from "../generateFromDefaultEndpoint";
|
|
|
4 |
|
5 |
export async function generateQuery(messages: Message[]) {
|
6 |
const currentDate = format(new Date(), "MMMM d, yyyy");
|
|
|
61 |
},
|
62 |
];
|
63 |
|
64 |
+
return await generateFromDefaultEndpoint({
|
|
|
65 |
messages: convQuery,
|
66 |
+
preprompt: `You are tasked with generating web search queries. Give me an appropriate query to answer my question for google search. Answer with only the query. Today is ${currentDate}`,
|
67 |
});
|
|
|
|
|
68 |
}
|
@@ -1,6 +0,0 @@
|
|
1 |
-
export function trimPrefix(input: string, prefix: string) {
|
2 |
-
if (input.startsWith(prefix)) {
|
3 |
-
return input.slice(prefix.length);
|
4 |
-
}
|
5 |
-
return input;
|
6 |
-
}
|
@@ -1,6 +0,0 @@
|
|
1 |
-
export function trimSuffix(input: string, end: string): string {
|
2 |
-
if (input.endsWith(end)) {
|
3 |
-
return input.slice(0, input.length - end.length);
|
4 |
-
}
|
5 |
-
return input;
|
6 |
-
}
|
@@ -171,6 +171,8 @@
|
|
171 |
convId: $page.params.id,
|
172 |
};
|
173 |
}
|
|
|
|
|
174 |
}
|
175 |
}
|
176 |
} catch (parseError) {
|
|
|
171 |
convId: $page.params.id,
|
172 |
};
|
173 |
}
|
174 |
+
} else if (update.status === "error") {
|
175 |
+
$error = update.message ?? "An error has occurred";
|
176 |
}
|
177 |
}
|
178 |
} catch (parseError) {
|
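The two added lines route a server-side failure into the page's `$error` store. A rough sketch of that branch as a pure function is below; the `StatusUpdate` type is a simplified stand-in for the app's `MessageUpdate` (its exact fields are an assumption), and `errorTextFrom` is a hypothetical name.

```typescript
// Simplified stand-in for the status variant of MessageUpdate (assumed shape).
type StatusUpdate = {
  type: "status";
  status: "started" | "pending" | "finished" | "error";
  message?: string;
};

// Returns the text to surface to the user when an update reports an error,
// or undefined for any other status.
function errorTextFrom(update: StatusUpdate): string | undefined {
  if (update.status === "error") {
    // Same fallback string as in the diff.
    return update.message ?? "An error has occurred";
  }
  return undefined;
}
```

Keeping the fallback on the client means an error update without a `message` still produces a visible notice.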
@@ -1,26 +1,19 @@
|
|
1 |
-
import {
|
2 |
-
import { buildPrompt } from "$lib/buildPrompt";
|
3 |
-
import { PUBLIC_SEP_TOKEN } from "$lib/constants/publicSepToken";
|
4 |
import { authCondition, requiresUser } from "$lib/server/auth";
|
5 |
import { collections } from "$lib/server/database";
|
6 |
-
import { modelEndpoint } from "$lib/server/modelEndpoint";
|
7 |
import { models } from "$lib/server/models";
|
8 |
import { ERROR_MESSAGES } from "$lib/stores/errors";
|
9 |
import type { Message } from "$lib/types/Message";
|
10 |
-
import { trimPrefix } from "$lib/utils/trimPrefix";
|
11 |
-
import { trimSuffix } from "$lib/utils/trimSuffix";
|
12 |
-
import { textGenerationStream } from "@huggingface/inference";
|
13 |
import { error } from "@sveltejs/kit";
|
14 |
import { ObjectId } from "mongodb";
|
15 |
import { z } from "zod";
|
16 |
-
import { AwsClient } from "aws4fetch";
|
17 |
import type { MessageUpdate } from "$lib/types/MessageUpdate";
|
18 |
import { runWebSearch } from "$lib/server/websearch/runWebSearch";
|
19 |
import type { WebSearch } from "$lib/types/WebSearch";
|
20 |
import { abortedGenerations } from "$lib/server/abortedGenerations";
|
21 |
import { summarize } from "$lib/server/summarize";
|
22 |
|
23 |
-
export async function POST({ request, fetch, locals, params, getClientAddress }) {
|
24 |
const id = z.string().parse(params.id);
|
25 |
const convId = new ObjectId(id);
|
26 |
const promptedAt = new Date();
|
@@ -191,138 +184,90 @@ export async function POST({ request, fetch, locals, params, getClientAddress })
|
|
191 |
webSearchResults = await runWebSearch(conv, newPrompt, update);
|
192 |
}
|
193 |
|
194-240 |
-
(47 removed lines; their content did not survive extraction)
|
|
|
|
|
241 |
}
|
242 |
-
}
|
243 |
-
|
244 |
-
|
245 |
-
|
246 |
-
|
247 |
-
|
248 |
-
|
249 |
-
|
250 |
-
$set: {
|
251 |
-
messages,
|
252 |
-
title: conv.title,
|
253 |
updatedAt: new Date(),
|
254 |
},
|
255 |
-
|
256 |
-
|
257 |
-
|
258 |
-
update({
|
259 |
-
type: "finalAnswer",
|
260 |
-
text: generated_text,
|
261 |
-
});
|
262 |
}
|
|
|
|
|
|
|
263 |
}
|
264 |
-
|
265 |
-
const tokenStream = textGenerationStream(
|
266 |
{
|
267 |
-
|
268 |
-
...models.find((m) => m.id === conv.model)?.parameters,
|
269 |
-
return_full_text: false,
|
270 |
-
},
|
271 |
-
model: randomEndpoint.url,
|
272 |
-
inputs: prompt,
|
273 |
-
accessToken: randomEndpoint.host === "sagemaker" ? undefined : HF_ACCESS_TOKEN,
|
274 |
},
|
275 |
{
|
276 |
-
|
277 |
-
|
|
|
|
|
|
|
278 |
}
|
279 |
);
|
280 |
|
281 |
-
|
282 |
-
|
283 |
-
|
284 |
-
|
285 |
-
if (!output.token.special) {
|
286 |
-
const lastMessage = messages[messages.length - 1];
|
287 |
-
update({
|
288 |
-
type: "stream",
|
289 |
-
token: output.token.text,
|
290 |
-
});
|
291 |
-
|
292 |
-
// if the last message is not from assistant, it means this is the first token
|
293 |
-
if (lastMessage?.from !== "assistant") {
|
294 |
-
// so we create a new message
|
295 |
-
messages = [
|
296 |
-
...messages,
|
297 |
-
// id doesn't match the backend id but it's not important for assistant messages
|
298 |
-
// First token has a space at the beginning, trim it
|
299 |
-
{
|
300 |
-
from: "assistant",
|
301 |
-
content: output.token.text.trimStart(),
|
302 |
-
webSearch: webSearchResults,
|
303 |
-
updates: updates,
|
304 |
-
id: (responseId as Message["id"]) || crypto.randomUUID(),
|
305 |
-
createdAt: new Date(),
|
306 |
-
updatedAt: new Date(),
|
307 |
-
},
|
308 |
-
];
|
309 |
-
} else {
|
310 |
-
const date = abortedGenerations.get(convId.toString());
|
311 |
-
if (date && date > promptedAt) {
|
312 |
-
saveLast(lastMessage.content);
|
313 |
-
}
|
314 |
-
if (!output) {
|
315 |
-
break;
|
316 |
-
}
|
317 |
-
|
318 |
-
// otherwise we just concatenate tokens
|
319 |
-
lastMessage.content += output.token.text;
|
320 |
-
}
|
321 |
-
}
|
322 |
-
} else {
|
323 |
-
saveLast(output.generated_text);
|
324 |
-
}
|
325 |
-
}
|
326 |
},
|
327 |
async cancel() {
|
328 |
await collections.conversations.updateOne(
|
|
|
1 |
+
import { MESSAGES_BEFORE_LOGIN, RATE_LIMIT } from "$env/static/private";
|
|
|
|
|
2 |
import { authCondition, requiresUser } from "$lib/server/auth";
|
3 |
import { collections } from "$lib/server/database";
|
|
|
4 |
import { models } from "$lib/server/models";
|
5 |
import { ERROR_MESSAGES } from "$lib/stores/errors";
|
6 |
import type { Message } from "$lib/types/Message";
|
|
|
|
|
|
|
7 |
import { error } from "@sveltejs/kit";
|
8 |
import { ObjectId } from "mongodb";
|
9 |
import { z } from "zod";
|
|
|
10 |
import type { MessageUpdate } from "$lib/types/MessageUpdate";
|
11 |
import { runWebSearch } from "$lib/server/websearch/runWebSearch";
|
12 |
import type { WebSearch } from "$lib/types/WebSearch";
|
13 |
import { abortedGenerations } from "$lib/server/abortedGenerations";
|
14 |
import { summarize } from "$lib/server/summarize";
|
15 |
|
16 |
+
export async function POST({ request, locals, params, getClientAddress }) {
|
17 |
const id = z.string().parse(params.id);
|
18 |
const convId = new ObjectId(id);
|
19 |
const promptedAt = new Date();
|
|
|
184 |
webSearchResults = await runWebSearch(conv, newPrompt, update);
|
185 |
}
|
186 |
|
187 |
+
messages[messages.length - 1].webSearch = webSearchResults;
|
188 |
+
|
189 |
+
conv.messages = messages;
|
190 |
+
|
191 |
+
try {
|
192 |
+
const endpoint = await model.getEndpoint();
|
193 |
+
for await (const output of await endpoint({ conversation: conv })) {
|
194 |
+
// if generated_text is not here, it means the generation is not done
|
195 |
+
if (!output.generated_text) {
|
196 |
+
// else we get the next token
|
197 |
+
if (!output.token.special) {
|
198 |
+
update({
|
199 |
+
type: "stream",
|
200 |
+
token: output.token.text,
|
201 |
+
});
|
202 |
+
|
203 |
+
// if the last message is not from assistant, it means this is the first token
|
204 |
+
const lastMessage = messages[messages.length - 1];
|
205 |
+
|
206 |
+
if (lastMessage?.from !== "assistant") {
|
207 |
+
// so we create a new message
|
208 |
+
messages = [
|
209 |
+
...messages,
|
210 |
+
// id doesn't match the backend id but it's not important for assistant messages
|
211 |
+
// First token has a space at the beginning, trim it
|
212 |
+
{
|
213 |
+
from: "assistant",
|
214 |
+
content: output.token.text.trimStart(),
|
215 |
+
webSearch: webSearchResults,
|
216 |
+
updates: updates,
|
217 |
+
id: (responseId as Message["id"]) || crypto.randomUUID(),
|
218 |
+
createdAt: new Date(),
|
219 |
+
updatedAt: new Date(),
|
220 |
+
},
|
221 |
+
];
|
222 |
+
} else {
|
223 |
+
// abort check
|
224 |
+
const date = abortedGenerations.get(convId.toString());
|
225 |
+
if (date && date > promptedAt) {
|
226 |
+
break;
|
227 |
+
}
|
228 |
+
|
229 |
+
if (!output) {
|
230 |
+
break;
|
231 |
+
}
|
232 |
+
|
233 |
+
// otherwise we just concatenate tokens
|
234 |
+
lastMessage.content += output.token.text;
|
235 |
+
}
|
236 |
}
|
237 |
+
} else {
|
238 |
+
// write output.generated_text to the last message
|
239 |
+
messages = [
|
240 |
+
...messages.slice(0, -1),
|
241 |
+
{
|
242 |
+
...messages[messages.length - 1],
|
243 |
+
content: output.generated_text,
|
244 |
+
updates: updates,
|
|
|
|
|
|
|
245 |
updatedAt: new Date(),
|
246 |
},
|
247 |
+
];
|
248 |
+
}
|
|
|
|
|
|
|
|
|
|
|
249 |
}
|
250 |
+
} catch (e) {
|
251 |
+
console.error(e);
|
252 |
+
update({ type: "status", status: "error", message: (e as Error).message });
|
253 |
}
|
254 |
+
await collections.conversations.updateOne(
|
|
|
255 |
{
|
256 |
+
_id: convId,
|
|
|
|
|
|
|
|
|
|
|
|
|
257 |
},
|
258 |
{
|
259 |
+
$set: {
|
260 |
+
messages,
|
261 |
+
title: conv?.title,
|
262 |
+
updatedAt: new Date(),
|
263 |
+
},
|
264 |
}
|
265 |
);
|
266 |
|
267 |
+
update({
|
268 |
+
type: "finalAnswer",
|
269 |
+
text: messages[messages.length - 1].content,
|
270 |
+
});
|
|
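The new loop consumes the endpoint as an async iterable: intermediate items carry a `token`, and the last item carries the full `generated_text`. The sketch below reproduces that consumption pattern on its own. `StreamOutput` is a reduced, assumed shape based only on the fields the loop reads, `fakeEndpoint` is a hypothetical stand-in for `model.getEndpoint()`, and the abort check and database write are deliberately left out.

```typescript
// Reduced output shape, limited to the fields the loop in the diff reads (assumption).
interface StreamOutput {
  token: { text: string; special: boolean };
  generated_text: string | null;
}

// Hypothetical endpoint: yields one item per token, then a final item with the full text.
async function* fakeEndpoint(tokens: string[]): AsyncGenerator<StreamOutput> {
  for (const text of tokens) {
    yield { token: { text, special: false }, generated_text: null };
  }
  yield { token: { text: "", special: true }, generated_text: tokens.join("").trimStart() };
}

// Accumulates streamed tokens and lets the final generated_text replace the running content.
async function collect(
  stream: AsyncIterable<StreamOutput>,
  onToken: (text: string) => void
): Promise<string> {
  let content = "";
  let started = false;
  for await (const output of stream) {
    if (!output.generated_text) {
      // Generation is still running: forward non-special tokens only.
      if (output.token.special) continue;
      onToken(output.token.text);
      // The first token usually starts with a space, so it is trimmed as in the diff.
      content = started ? content + output.token.text : output.token.text.trimStart();
      started = true;
    } else {
      // Final item: the complete text overrides whatever was accumulated.
      content = output.generated_text;
    }
  }
  return content;
}
```

Because every backend only has to return an async iterable of this shape, the route handler no longer needs to know whether the tokens come from TGI, OpenAI, or an AWS endpoint.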
271 |
},
|
272 |
async cancel() {
|
273 |
await collections.conversations.updateOne(
|