Text Generation
Transformers
GGUF
reranker
Inference Endpoints
conversational
aashish1904 committed on
Commit 0716ff1 · verified · 1 Parent(s): 16cdeac

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +315 -0
README.md ADDED
@@ -0,0 +1,315 @@
1
+
2
+ ---
3
+
4
+ library_name: transformers
5
+ license: apache-2.0
6
+ language:
7
+ - en
8
+ - zh
9
+ - es
10
+ - de
11
+ - ar
12
+ - ru
13
+ - ja
14
+ - ko
15
+ - hi
16
+ - sk
17
+ - vi
18
+ - tr
19
+ - fi
20
+ - id
21
+ - fa
22
+ - 'no'
23
+ - th
24
+ - sv
25
+ - pt
26
+ - da
27
+ - bn
28
+ - te
29
+ - ro
30
+ - it
31
+ - fr
32
+ - nl
33
+ - sw
34
+ - pl
35
+ - hu
36
+ - cs
37
+ - el
38
+ - uk
39
+ - mr
40
+ - ta
41
+ - tl
42
+ - bg
43
+ - lt
44
+ - ur
45
+ - he
46
+ - gu
47
+ - kn
48
+ - am
49
+ - kk
50
+ - hr
51
+ - uz
52
+ - jv
53
+ - ca
54
+ - az
55
+ - ms
56
+ - sr
57
+ - sl
58
+ - yo
59
+ - lv
60
+ - is
61
+ - ha
62
+ - ka
63
+ - et
64
+ - bs
65
+ - hy
66
+ - ml
67
+ - pa
68
+ - mt
69
+ - km
70
+ - sq
71
+ - or
72
+ - as
73
+ - my
74
+ - mn
75
+ - af
76
+ - be
77
+ - ga
78
+ - mk
79
+ - cy
80
+ - gl
81
+ - ceb
82
+ - la
83
+ - yi
84
+ - lb
85
+ - tg
86
+ - gd
87
+ - ne
88
+ - ps
89
+ - eu
90
+ - ky
91
+ - ku
92
+ - si
93
+ - ht
94
+ - eo
95
+ - lo
96
+ - fy
97
+ - sd
98
+ - mg
99
+ - so
100
+ - ckb
101
+ - su
102
+ - nn
103
+ datasets:
104
+ - lightblue/reranker_continuous_filt_max7_train
105
+ base_model:
106
+ - Qwen/Qwen2.5-0.5B-Instruct
107
+ pipeline_tag: text-generation
108
+ tags:
109
+ - reranker
110
+
111
+ ---
112
+
113
+ [![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory)
114
+
115
+
116
+ # QuantFactory/lb-reranker-0.5B-v1.0-GGUF
117
+ This is a quantized version of [lightblue/lb-reranker-0.5B-v1.0](https://huggingface.co/lightblue/lb-reranker-0.5B-v1.0) created using llama.cpp.
118
+
119
+ # Original Model Card
120
+
121
+
122
+ # LB Reranker v1.0
123
+
124
+ <div style="width: 100%; height: 160px;
125
+ display: flex; align-items: center;
126
+ justify-content: center;
127
+ border: 8px solid black;
128
+ font-size: 120px; font-weight: bold;
129
+ text-align: center;
130
+ color: #438db8;
131
+ font-family: 'Helvetica Neue', sans-serif;">
132
+ LBR
133
+ </div>
134
+
135
+ The LB Reranker has been trained to determine the relatedness of a given query to a piece of text, therefore allowing it to be used as a ranker or reranker in various retrieval-based tasks.
136
+
137
+ This model is fine-tuned from a [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct) model checkpoint and was trained for roughly 5.5 hours using the 8 x L20 instance ([ecs.gn8is-8x.32xlarge](https://www.alibabacloud.com/help/en/ecs/user-guide/gpu-accelerated-compute-optimized-and-vgpu-accelerated-instance-families-1)) on [Alibaba Cloud](https://www.alibabacloud.com/).
138
+
139
+ The training data for this model can be found at [lightblue/reranker_continuous_filt_max7_train](https://huggingface.co/datasets/lightblue/reranker_continuous_filt_max7_train), and the code for generating this data and for training the model can be found in [our GitHub repo](https://github.com/lightblue-tech/lb-reranker).
140
+
141
+ Trained on data in over 95 languages, this model is applicable to a broad range of use cases.
142
+
143
+ This model has three main benefits over comparable rerankers.
144
+ 1. It has shown slightly higher performance on evaluation benchmarks.
145
+ 2. It has been trained on more languages than any previous model.
146
+ 3. It is a simple Causal LM model trained to output a string between "1" and "7".
147
+
148
+ This last point means that this model can be used natively with many widely available inference packages, including vLLM and LMDeploy.
149
+ This in turn allows our reranker to benefit from improvements to inference as and when these packages release them.
150
+
151
+ Update: We have also found that this model works pretty well as a code snippet reranker too (P@1 of 96%)! See our [Colab](https://colab.research.google.com/drive/1ABL1xaarekLIlVJKbniYhXgYu6ZNwfBm?usp=sharing) for more details.
152
+
153
+ # How to use
154
+
155
+ The model was trained to expect an input such as:
156
+
157
+ ```
+ <<<Query>>>
+ {your_query_here}
+
+ <<<Context>>>
+ {your_context_here}
+ ```
164
+
165
+ It is trained to output a single number from 1 to 7 as a string.
166
+
167
+ To obtain a continuous score for reranking query-context pairs (i.e. one that produces few ties), we calculate the expected value of the score over the model's probabilities for the tokens "1" to "7".
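As an illustration of that expectation, here is a minimal NumPy sketch. The probability values are invented for the example (they are not model outputs), and `expected_score` is a hypothetical helper name; the scripts in this card obtain the probabilities from the logprobs of the tokens "1" to "7".

```python
import numpy as np

def expected_score(probs):
    """Return the expectation of a 1-7 score given per-score probabilities."""
    probs = np.asarray(probs, dtype=float)
    scores = np.arange(1, probs.shape[-1] + 1)  # 1, 2, ..., 7
    return (probs * scores).sum(axis=-1)

# Hypothetical distribution (assumption): most mass on "7", a little on "6".
example = [0.0, 0.0, 0.0, 0.0, 0.0, 0.25, 0.75]
print(expected_score(example))  # 6.75
```

Two pairs whose most likely score is the same can still receive different expected values, which is what keeps ties rare.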
168
+
169
+ We include scripts to do this in both vLLM and LMDeploy:
170
+
171
+ #### vLLM
172
+
173
+ Install [vLLM](https://github.com/vllm-project/vllm/) using `pip install vllm`.
174
+
175
+ ```python
+ from vllm import LLM, SamplingParams
+ import numpy as np
+
+ def make_reranker_input(t, q):
+     return f"<<<Query>>>\n{q}\n\n<<<Context>>>\n{t}"
+
+ def make_reranker_training_datum(context, question):
+     system_message = "Given a query and a piece of text, output a score of 1-7 based on how related the query is to the text. 1 means least related and 7 is most related."
+
+     return [
+         {"role": "system", "content": system_message},
+         {"role": "user", "content": make_reranker_input(context, question)},
+     ]
+
+ def get_prob(logprob_dict, tok_id):
+     # Probability of tok_id, or 0 if it is not among the returned top logprobs
+     return np.exp(logprob_dict[tok_id].logprob) if tok_id in logprob_dict else 0
+
+ llm = LLM("lightblue/lb-reranker-v1.0")
+ # Generate a single token and return the top 14 logprobs for it
+ sampling_params = SamplingParams(temperature=0.0, logprobs=14, max_tokens=1)
+ tok = llm.llm_engine.tokenizer.tokenizer
+ # Token ids of the strings "1" to "7"
+ idx_tokens = [tok.encode(str(i))[0] for i in range(1, 8)]
+
+ # (query, context) pairs
+ query_texts = [
+     ("What is the scientific name of apples?", "An apple is a round, edible fruit produced by an apple tree (Malus spp., among them the domestic or orchard apple; Malus domestica)."),
+     ("What is the Chinese word for 'apple'?", "An apple is a round, edible fruit produced by an apple tree (Malus spp., among them the domestic or orchard apple; Malus domestica)."),
+     ("What is the square root of 999?", "An apple is a round, edible fruit produced by an apple tree (Malus spp., among them the domestic or orchard apple; Malus domestica)."),
+ ]
+
+ chats = [make_reranker_training_datum(c, q) for q, c in query_texts]
+ responses = llm.chat(chats, sampling_params)
+ # One row per pair, one column per score token "1" to "7"
+ probs = np.array([[get_prob(r.outputs[0].logprobs[0], y) for y in idx_tokens] for r in responses])
+
+ N = probs.shape[1]
+ M = probs.shape[0]
+ idxs = np.tile(np.arange(1, N + 1), M).reshape(M, N)
+
+ # Expected score per pair: sum over scores of probability * score
+ expected_vals = (probs * idxs).sum(axis=1)
+ print(expected_vals)
+ # [6.66570732 1.86686378 1.01102923]
+ ```
216
+
217
+ #### LMDeploy
218
+
219
+ Install [LMDeploy](https://github.com/InternLM/lmdeploy) using `pip install lmdeploy`.
220
+
221
+ ```python
+ # Un-comment this if running in a Jupyter notebook, Colab etc.
+ # import nest_asyncio
+ # nest_asyncio.apply()
+
+ from lmdeploy import GenerationConfig, ChatTemplateConfig, pipeline
+ import numpy as np
+
+ def make_reranker_input(t, q):
+     return f"<<<Query>>>\n{q}\n\n<<<Context>>>\n{t}"
+
+ def make_reranker_training_datum(context, question):
+     system_message = "Given a query and a piece of text, output a score of 1-7 based on how related the query is to the text. 1 means least related and 7 is most related."
+
+     return [
+         {"role": "system", "content": system_message},
+         {"role": "user", "content": make_reranker_input(context, question)},
+     ]
+
+ def get_prob(logprob_dict, tok_id):
+     # Probability of tok_id, or 0 if it is not among the returned top logprobs
+     return np.exp(logprob_dict[tok_id]) if tok_id in logprob_dict else 0
+
+ pipe = pipeline(
+     "lightblue/lb-reranker-v1.0",
+     chat_template_config=ChatTemplateConfig(
+         model_name='qwen2d5',
+         capability='chat'
+     )
+ )
+ tok = pipe.tokenizer.model
+ # Token ids of the strings "1" to "7"
+ idx_tokens = [tok.encode(str(i))[0] for i in range(1, 8)]
+
+ # (query, context) pairs
+ query_texts = [
+     ("What is the scientific name of apples?", "An apple is a round, edible fruit produced by an apple tree (Malus spp., among them the domestic or orchard apple; Malus domestica)."),
+     ("What is the Chinese word for 'apple'?", "An apple is a round, edible fruit produced by an apple tree (Malus spp., among them the domestic or orchard apple; Malus domestica)."),
+     ("What is the square root of 999?", "An apple is a round, edible fruit produced by an apple tree (Malus spp., among them the domestic or orchard apple; Malus domestica)."),
+ ]
+
+ chats = [make_reranker_training_datum(c, q) for q, c in query_texts]
+ responses = pipe(
+     chats,
+     gen_config=GenerationConfig(temperature=1.0, logprobs=14, max_new_tokens=1, do_sample=True)
+ )
+ # One row per pair, one column per score token "1" to "7"
+ probs = np.array([[get_prob(r.logprobs[0], y) for y in idx_tokens] for r in responses])
+
+ N = probs.shape[1]
+ M = probs.shape[0]
+ idxs = np.tile(np.arange(1, N + 1), M).reshape(M, N)
+
+ # Expected score per pair: sum over scores of probability * score
+ expected_vals = (probs * idxs).sum(axis=1)
+ print(expected_vals)
+ # [6.66415229 1.84342025 1.01133205]
+ ```
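Once expected scores are available from either script, reranking amounts to sorting the candidates by score. The following is a minimal sketch rather than part of the original scripts: `rerank` is a hypothetical helper, and the passages and scores are invented placeholders.

```python
import numpy as np

def rerank(candidates, scores):
    """Sort candidates from most to least related according to their scores."""
    # Negate the scores so that argsort yields descending order;
    # a stable sort keeps the input order for exact ties.
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    return [(candidates[i], float(scores[i])) for i in order]

# Hypothetical candidates and expected scores (assumptions, not model outputs).
candidates = ["passage A", "passage B", "passage C"]
scores = [1.9, 6.7, 1.0]

for text, score in rerank(candidates, scores):
    print(f"{score:.2f}  {text}")
```

In a retrieval pipeline, `candidates` would typically be the top results from a first-stage retriever for a single query.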
274
+
275
+ # Evaluation
276
+
277
+ We perform an evaluation on 9 datasets from the [BEIR benchmark](https://github.com/beir-cellar/beir) that none of the evaluated models have been trained upon (to our knowledge).
278
+
279
+ * Arguana
280
+ * Dbpedia-entity
281
+ * Fiqa
282
+ * NFcorpus
283
+ * Scidocs
284
+ * Scifact
285
+ * Trec-covid-v2
286
+ * Vihealthqa
287
+ * Webis-touche2020
288
+
289
+ We evaluate on a subset of all queries (the first 250) to save evaluation time.
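A minimal sketch of this kind of subsetting, assuming BEIR-style dictionaries (`queries` mapping query id to text, `qrels` mapping query id to relevance judgments). The helper name and the toy data are hypothetical and are not taken from the authors' evaluation code.

```python
from itertools import islice

def first_n_queries(queries, qrels, n=250):
    """Keep the first n queries (in insertion order) and their judgments."""
    kept = dict(islice(queries.items(), n))
    kept_qrels = {qid: qrels[qid] for qid in kept if qid in qrels}
    return kept, kept_qrels

# Toy data standing in for a real dataset (assumption for illustration).
queries = {f"q{i}": f"query text {i}" for i in range(1000)}
qrels = {f"q{i}": {f"d{i}": 1} for i in range(1000)}

sub_queries, sub_qrels = first_n_queries(queries, qrels)
print(len(sub_queries), len(sub_qrels))  # 250 250
```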
290
+
291
+ We find that our model performs similarly to or better than many of the state-of-the-art reranker models in our evaluation, without compromising on inference speed.
292
+
293
+ We make our evaluation code and results available [on our GitHub](https://github.com/lightblue-tech/lb-reranker/blob/main/run_bier.ipynb).
294
+
295
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64b63f8ad57e02621dc93c8b/xkNzCABFUmU7UmDXUduiz.png)
296
+
297
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64b63f8ad57e02621dc93c8b/P-XCA3TGHqDSX8k6c4hCE.png)
298
+
299
+ As we can see, this reranker attains higher IR evaluation metrics than the two baseline models we include at all positions apart from @1.
300
+
301
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64b63f8ad57e02621dc93c8b/puhhWseBOcIyOEdW4L-B0.png)
302
+
303
+ We also show that our model is, on average, faster than the BGE reranker v2.
304
+
305
+ # License
306
+
307
+ We share this model under an Apache 2.0 license.
308
+
309
+ # Developed by
310
+
311
+ <a href="https://www.lightblue-tech.com">
312
+ <img src="https://www.lightblue-tech.com/wp-content/uploads/2023/08/color_%E6%A8%AA%E5%9E%8B-1536x469.png" alt="Lightblue technology logo" width="400"/>
313
+ </a>
314
+
315
+ This model was trained by Peter Devine ([ptrdvn](https://huggingface.co/ptrdvn)) for Lightblue