Update README.md
README.md
@@ -11,11 +11,37 @@ inference: false
  thumbnail: https://h2o.ai/etc.clientlibs/h2o/clientlibs/clientlib-site/resources/images/favicon.ico
  ---
  # Model Card
  ## Summary

  This model was trained using [H2O LLM Studio](https://github.com/h2oai/h2o-llmstudio).
  - Base model: [h2oai/h2o-danube2-1.8b-chat](https://huggingface.co/h2oai/h2o-danube2-1.8b-chat)


  ## Usage

@@ -25,14 +51,6 @@ To use the model with the `transformers` library on a machine with GPUs, first m
  pip install transformers==4.38.2
  ```

- Also make sure you are providing your huggingface token to the pipeline if the model is lying in a private repo.
- - Either leave `token=True` in the `pipeline` and login to hugginface_hub by running
- ```python
- import huggingface_hub
- huggingface_hub.login(<ACCESS_TOKEN>)
- ```
- - Or directly pass your <ACCESS_TOKEN> to `token` in the `pipeline`
-
  ```python
  from transformers import pipeline

@@ -58,16 +76,6 @@ res = generate_text(
  print(res[0]["generated_text"])
  ```

- You can print a sample prompt after the preprocessing step to see how it is feed to the tokenizer:
-
- ```python
- print(generate_text.preprocess("Why is drinking water so healthy?")["prompt_text"])
- ```
-
- ```bash
- <|prompt|>Why is drinking water so healthy?</s><|answer|>
- ```
-
  Alternatively, you can download [h2oai_pipeline.py](h2oai_pipeline.py), store it alongside your notebook, and construct the pipeline yourself from the loaded model and tokenizer. If the model and the tokenizer are fully supported in the `transformers` package, this will allow you to set `trust_remote_code=False`.

  ```python
@@ -144,40 +152,15 @@ answer = tokenizer.decode(tokens, skip_special_tokens=True)
  print(answer)
  ```

- ##
-
-
-
- ## Model Architecture

-
-
-
-
-     (layers): ModuleList(
-       (0-23): 24 x MistralDecoderLayer(
-         (self_attn): MistralSdpaAttention(
-           (q_proj): Linear(in_features=2560, out_features=2560, bias=False)
-           (k_proj): Linear(in_features=2560, out_features=640, bias=False)
-           (v_proj): Linear(in_features=2560, out_features=640, bias=False)
-           (o_proj): Linear(in_features=2560, out_features=2560, bias=False)
-           (rotary_emb): MistralRotaryEmbedding()
-         )
-         (mlp): MistralMLP(
-           (gate_proj): Linear(in_features=2560, out_features=6912, bias=False)
-           (up_proj): Linear(in_features=2560, out_features=6912, bias=False)
-           (down_proj): Linear(in_features=6912, out_features=2560, bias=False)
-           (act_fn): SiLU()
-         )
-         (input_layernorm): MistralRMSNorm()
-         (post_attention_layernorm): MistralRMSNorm()
-       )
-     )
-     (norm): MistralRMSNorm()
-   )
-   (lm_head): Linear(in_features=2560, out_features=32000, bias=False)
- )
- ```

  ## Model Configuration

@@ -11,11 +11,37 @@ inference: false
  thumbnail: https://h2o.ai/etc.clientlibs/h2o/clientlibs/clientlib-site/resources/images/favicon.ico
  ---
  # Model Card
+ # Making a custom Singapore-specific GPT using the H2O Gen AI Stack
+ <img src=https://cdn-uploads.huggingface.co/production/uploads/65672b0c54c6bc01e55e592f/_nqcMgjd3PBlj_EbjBjoe.png width=60% height=40%>
+
  ## Summary

  This model was trained using [H2O LLM Studio](https://github.com/h2oai/h2o-llmstudio).
  - Base model: [h2oai/h2o-danube2-1.8b-chat](https://huggingface.co/h2oai/h2o-danube2-1.8b-chat)

+ ## Examples
+ Sample response comparison between the base model and the current tuned model.
+
+ Sample question:
+ ```
+ What’s hawker centres ah?
+ ```
+
+ Our model's response:
+
+ ```
+ Hawker centers, also call food stalls or street vendors, are those small and shiok places in cities where you can find all kind of local delicacies. They sell cheap but very good eats like noodles, bak chor mee (noodle soup), sotan goh (deep-fried taro cake) and more!
+ These marketplaces got many different types of traders selling their own original dishes. Aiyoh, so must try the famous ones leh! Many people come here to eat for a quick hit of flavors during lunchtime or after work hours.
+ Hawker center usually located near important areas with lots of foot traffic. Some popular ones include Chinatown, Little India, Bugis Street Food Centre, Maxwell Road Market, and Cantonment Complex. Each one have its specialty dish that make them stand out from others.
+ You know what? I tell you already lah, go check these places out when you get chance to visit Asian countries! You never know which delicious treat you might discover there. So much fun…just imagine how social life will be if every city has something similar only spread around with this great concept of sharing culture through eating together at affordable rates. Shiok sia!
+ ```
+
+ Base model's response:
+
+ ```
+ Hawker centers, also known as food courts or open-air markets in some countries, are bustling hubs of street food vendors and small restaurants that offer a wide range of affordable meals to locals and tourists alike. They typically feature various stalls selling different types of dishes such as rice dishes, noodles, stir fries, grilled meats, sandwiches, desserts, and more.
+ Hawker centers can be found throughout Singapore, Malaysia, Thailand, Indonesia, Hong Kong, Taiwan, Japan, South Korea, Vietnam, China, the Philippines, Myanmar (Burma), Cambodia, Laos, and other Southeast Asian countries with significant populations of ethnic Chinese people who enjoy these unique culinary experiences. Visiting one is an excellent way to experience local flavors while supporting the community’s vibrant culture through casual dining at reasonable prices!
+ ```

  ## Usage

@@ -25,14 +51,6 @@ To use the model with the `transformers` library on a machine with GPUs, first m
  pip install transformers==4.38.2
  ```

  ```python
  from transformers import pipeline

@@ -58,16 +76,6 @@ res = generate_text(
  print(res[0]["generated_text"])
  ```

  Alternatively, you can download [h2oai_pipeline.py](h2oai_pipeline.py), store it alongside your notebook, and construct the pipeline yourself from the loaded model and tokenizer. If the model and the tokenizer are fully supported in the `transformers` package, this will allow you to set `trust_remote_code=False`.

  ```python
@@ -144,40 +152,15 @@ answer = tokenizer.decode(tokens, skip_special_tokens=True)
  print(answer)
  ```

+ ## Evaluations
+ We evaluated our model using two Singlish translated benchmarks in 0-shot (validation sets):
+ - Singlish translated [ARC-easy](https://huggingface.co/datasets/allenai/ai2_arc) (510 rows)
+ - Singlish translated [PiQA](https://huggingface.co/datasets/piqa) (1677 rows)

+ | Benchmark               | Base model (acc_n) | Tuned model (acc_n) |
+ |:------------------------|:------------------:|:-------------------:|
+ | ARC-easy-translated     | 0.6157             | **0.6314**          |
+ | PiQA-translated         | 0.6601             | **0.6959**          |

  ## Model Configuration

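As a quick sanity check on the evaluation table added in this change, the sketch below recomputes the gain of the tuned model over the base model and the number of correct answers each score implies. It assumes `acc_n` is a per-example accuracy (correct answers divided by validation rows), which the README does not spell out. The names `ROWS`, `SCORES`, and `summarize` are illustrative only and are not part of the model card.

```python
# Validation-set sizes and acc_n scores copied from the Evaluations section.
ROWS = {"ARC-easy-translated": 510, "PiQA-translated": 1677}
SCORES = {
    # benchmark: (base model acc_n, tuned model acc_n)
    "ARC-easy-translated": (0.6157, 0.6314),
    "PiQA-translated": (0.6601, 0.6959),
}


def summarize(scores, rows):
    """Return the gain in percentage points and the implied correct counts.

    Assumption: acc_n = correct / rows, so correct ~= round(acc_n * rows).
    """
    summary = {}
    for name, (base, tuned) in scores.items():
        n = rows[name]
        summary[name] = {
            "gain_points": round((tuned - base) * 100, 2),
            "base_correct": round(base * n),
            "tuned_correct": round(tuned * n),
        }
    return summary


summary = summarize(SCORES, ROWS)
for name, stats in summary.items():
    print(
        f"{name}: +{stats['gain_points']} points "
        f"({stats['base_correct']} -> {stats['tuned_correct']} of {ROWS[name]})"
    )
```

Under that assumption, the reported scores correspond to whole numbers of correct answers, so the table is at least internally consistent with the stated row counts. The gains are small enough that a confidence interval would be worth computing before drawing strong conclusions.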