FrancescoPeriti committed on
Commit
8954644
·
verified ·
1 Parent(s): 02223d3

Update README.md

  1. README.md +123 -2
README.md CHANGED
@@ -1,15 +1,22 @@
1
  ---
2
  library_name: transformers
3
- tags: []
4
  ---
5
 
6
- # Model Card for Model ID
7
 
8
  <!-- Provide a quick summary of what the model is/does. -->
 
 
9
 
 
10
 
11
 
12
  ## Model Details
 
 
 
 
13
 
14
  ### Model Description
15
 
@@ -49,6 +56,120 @@ This is the model card of a 🤗 transformers model that has been pushed on the
49
 
50
  [More Information Needed]
51
 
52
  ### Out-of-Scope Use
53
 
54
  <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
 
1
  ---
2
  library_name: transformers
3
+ tags: [Llama2Dictionary]
4
  ---
5
 
6
+ # Llama2Dictionary
7
 
8
  <!-- Provide a quick summary of what the model is/does. -->
9
+ ```FrancescoPeriti/Llama2Dictionary``` is a fine-tuned version of ```meta-llama/Llama-2-7b-chat-hf```.
10
+ Because it builds on Llama 2, you must first visit the AI at Meta website, accept the Meta License, and submit the [form](https://llama.meta.com/llama-downloads/).
11
 
12
+ To use ```FrancescoPeriti/Llama2Dictionary```, you will need to log in with your Hugging Face token (hereafter, ```[HF-TOKEN]```).
13
 
14
 
15
  ## Model Details
16
+ This model is fine-tuned on English datasets of sense definitions. Given a target word and a usage example, the model generates a definition of the sense the target word has in that context.
17
+
18
+ You can find more details in the paper [Automatically Generated Definitions and their utility for Modeling Word Meaning](link) by Francesco Periti, David Alfter, and Nina Tahmasebi.
19
+
20
 
21
  ### Model Description
22
 
 
56
 
57
  [More Information Needed]
58
 
+ ```python
+ import torch
+ import warnings
+ from peft import PeftModel  # parameter-efficient fine-tuning
+ from datasets import Dataset
+ from huggingface_hub import login
+ from typing import Literal, Sequence, TypedDict
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ login("[HF-TOKEN]")  # replace with your own token, e.g., hf_aGPI...ELal
+
+ model_name = "meta-llama/Llama-2-7b-chat-hf"  # chat model
+ ft_model_name = "FrancescoPeriti/Llama2Dictionary"  # fine-tuned model
+
+ # load models: the base chat model first, then the fine-tuned adapter on top of it
+ chat_model = AutoModelForCausalLM.from_pretrained(model_name, device_map='auto')
+ lama2dictionary = PeftModel.from_pretrained(chat_model, ft_model_name)
+ lama2dictionary.eval()
+
+ # load tokenizer
+ tokenizer = AutoTokenizer.from_pretrained(
+     model_name,
+     padding_side="left",
+     add_eos_token=True,
+     add_bos_token=True,
+ )
+ tokenizer.pad_token = tokenizer.eos_token
+
+ # end of sequence for stop condition: stop at the first ';' or '.' as well as at EOS
+ eos_tokens = [tokenizer.encode(token, add_special_tokens=False)[0]
+               for token in [';', ' ;', '.', ' .']]
+ eos_tokens.append(tokenizer.eos_token_id)
+
+ # chat format
+ Role = Literal["system", "user"]
+
+ class Message(TypedDict):
+     role: Role
+     content: str
+
+ Dialog = Sequence[Message]
+
+ # load dataset
+ examples = [{'target': 'jam', 'example': 'The traffic jam on the highway made everyone late for work.'},
+             {'target': 'jam', 'example': 'I spread a generous layer of strawberry jam on my toast this morning'}]
+ dataset = Dataset.from_list(examples)
+
+ # apply template
+ def apply_chat_template(tokenizer, dataset):
+     system_message = "You are a lexicographer familiar with providing concise definitions of word meanings."
+     template = 'Please provide a concise definition for the meaning of the word "{}" in the following sentence: {}'
+
+     def apply_chat_template_func(record):
+         dialog: Dialog = (Message(role='system', content=system_message),
+                           Message(role='user', content=template.format(record['target'], record['example'])))
+         prompt = tokenizer.decode(tokenizer.apply_chat_template(dialog, add_generation_prompt=True))
+         return {'text': prompt}
+
+     return dataset.map(apply_chat_template_func)
+
+ dataset = apply_chat_template(tokenizer, dataset)
+
+ # tokenization
+ max_length = 512
+
+ def formatting_func(record):
+     return record['text']
+
+ def tokenization(record):
+     # the prompt already contains the special tokens added by the chat template
+     result = tokenizer(formatting_func(record),
+                        truncation=True,
+                        max_length=max_length,
+                        padding="max_length",
+                        add_special_tokens=False)
+     return result
+
+ tokenized_dataset = dataset.map(tokenization)
+
+ # definition generation
+ batch_size = 32
+ max_time = 4.5  # sec
+
+ sense_definitions = list()
+ with torch.no_grad():
+     for i in range(0, len(tokenized_dataset), batch_size):
+         batch = tokenized_dataset[i:i + batch_size]
+
+         model_input = dict()
+         for k in ['input_ids', 'attention_mask']:
+             model_input[k] = torch.tensor(batch[k]).to('cuda')
+
+         output_ids = lama2dictionary.generate(**model_input,
+                                               max_length = max_length * batch_size,
+                                               forced_eos_token_id = eos_tokens,
+                                               max_time = max_time * batch_size,
+                                               eos_token_id = eos_tokens,
+                                               temperature = 0.00001,
+                                               pad_token_id = tokenizer.eos_token_id)
+
+         answers = tokenizer.batch_decode(output_ids, skip_special_tokens=True)
+
+         for j, answer in enumerate(answers):
+             # keep only the generated text that follows the prompt
+             answer = answer.split('[/INST]')[-1].strip(" .,;:")
+             if 'SYS>>' in answer:
+                 answer = ''
+                 warnings.warn("Something went wrong. The input example might be too long; try reducing it.")
+             sense_definitions.append(answer.replace('\n', ' ') + '\n')
+
+ # output
+ dataset = dataset.add_column('definition', sense_definitions)
+ for row in dataset:
+     print(f"Target: {row['target']}\nExample: {row['example']}\nSense definition: {row['definition']}")
+ ```
172
+
173
  ### Out-of-Scope Use
174
 
175
  <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
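
The snippet added in this commit fills a prompt template with a target word and its usage example, then keeps only the text after the last `[/INST]` marker in the decoded output. Those two string-handling steps can be tried without loading the model. The following is a minimal sketch, not part of the commit: the helper names `build_user_message` and `clean_answer` are hypothetical, while the template string, the stripped characters, and the `SYS>>` check are copied from the snippet above.

```python
import warnings

# Template string taken from the README snippet.
TEMPLATE = ('Please provide a concise definition for the meaning of the word "{}" '
            'in the following sentence: {}')


def build_user_message(target: str, example: str) -> str:
    """Fill the prompt template with a target word and its usage example."""
    return TEMPLATE.format(target, example)


def clean_answer(decoded: str) -> str:
    """Keep the text after the last [/INST] marker and trim surrounding punctuation."""
    answer = decoded.split('[/INST]')[-1].strip(" .,;:")
    if 'SYS>>' in answer:
        # The prompt leaked into the answer, so no definition can be recovered.
        warnings.warn("Something went wrong. The input example might be too long; try reducing it.")
        answer = ''
    return answer.replace('\n', ' ')


if __name__ == "__main__":
    print(build_user_message("jam", "The traffic jam on the highway made everyone late for work."))
    print(clean_answer("[INST] prompt text [/INST] a traffic blockage."))
```

Separating these helpers from the generation loop makes it easier to check how a decoded output is cleaned before running the model on a full dataset.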