pattonma committed
Commit b7bb480 · verified · 1 parent: a36238e

Add new SentenceTransformer model
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
{
  "word_embedding_dimension": 1024,
  "pooling_mode_cls_token": true,
  "pooling_mode_mean_tokens": false,
  "pooling_mode_max_tokens": false,
  "pooling_mode_mean_sqrt_len_tokens": false,
  "pooling_mode_weightedmean_tokens": false,
  "pooling_mode_lasttoken": false,
  "include_prompt": true
}
README.md ADDED
@@ -0,0 +1,721 @@
---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:600
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
base_model: Snowflake/snowflake-arctic-embed-l
widget:
- source_sentence: What is the date of the Gallup report regarding employer care for
    employee wellbeing?
  sentences:
  - sense of purpose Defining work wellbeing
  - What constitutes meaningful conversations between managers and employees? Gallup
    found they include recognition and discussion about collaboration, goals, and
    priorities, and the employee’s strengths. These conversations prevent employees
    from feeling disconnected from the organization because managers stay in touch
    with what each employee contributes and can then articulate how that work affects
    the larger organization. The conversations ensure that expectations can be adjusted
    as the business needs change and in what ways those changing expectations interact
    with coworker roles.
  - March 18, 2022 Gallup https://www.gallup.com/workplace/390776/percent-feel-employer-cares-wellbeing-plummets.aspx
    Gallup World Headquarters, 901 F Street, Washington, D.C., 20001, U.S.A +1 202.715.3030
- source_sentence: What services does Evernorth Health Services provide?
  sentences:
  - 'Focusing on employee wellbeing and acknowledging the whole person. Since work
    and life are blended for many, consider the demands of life inside and out of
    the workplace. Consider career, social, financial, physical, and community wellbeing
    impacts and resources.


    Tailoring communication to reach their team where they are. Transparent and creative
    omnichannel communication to employees and customers is more likely to reach and
    resonate with a wide variety of people in many different work-life situations.'
  - 'Investor Relations


    Careers


    Bottom FB - column 3


    COVID Resource Center


    Health and Wellness


    Member Resources


    Bottom FB - column 4


    The Cigna Group


    Cigna Healthcare


    Evernorth Health Services


    International'
  - 1. The evolution of the disease burden. While McKinsey & Company employs many
    medical experts and scientists, we are not a disease forecasting firm. We rely
    on disease-burden forecasts globally and for the United States provided by IHME,
    which maintains the most comprehensive database of the global disease burden and
    for the United States as whole. Forecasts of the global and US disease burden
    are inherently uncertain and health shocks such as the COVID-19 pandemic may affect
    forecasts.
- source_sentence: How does the theme of "Wellbeing" relate to employees' perceptions
    of their work-life balance?
  sentences:
  - "engagement as an extremely important priority—are effectively using metrics and\
    \ shared some best practices for tying engagement to business performance. \n\
    Copyright © 2013 Harvard Business School Publishing. All rights reserved.The Impact\
    \ of \nEmployee Engagement on Performance\nhighlights\n71%\nof respondents rank\
    \ \nemployee engagement as \nvery important to achieving \noverall organizational\
    \ success.\n72%\nof respondents rank recognition \ngiven for high performers\
    \ as \nhaving a significant impact on \nemployee engagement.\n24% \nof respondents\
    \ say employees \nin their organization are \nhighly engaged."
  - 'figure 10


    Senior managers were far more likely to be optimistic than their middle-management
    colleagues were in their perceptions of engagement levels. Since middle managers
    are tasked with handling more day-to-day employee issues, their assessment is
    likely the more accurate. This implies that in many firms senior man-agers may
    need to take off the rose-colored glasses and take a closer look at the barriers
    to engagement that may be present, and then find more effective ways of overcoming
    them.'
  - 'Gallup analysts identified individuals in its database who have declined in clarity
    of expectations from 2020 to 2023. Among this group, across job types and work
    locations, the largest areas of decline fit into five themes:


    Feedback and Performance Focus


    Received meaningful feedback in the last week


    Performance managed to motivate outstanding performance


    Manager keeps me informed on what is going on


    Pride in quality of products/services


    Freedom to make decisions needed to do my job well


    Goals/Priorities


    Manager includes me in goal setting


    Feel prepared to do my job


    Wellbeing


    Organization cares about my wellbeing


    Able to maintain a healthy balance between work and personal life


    Team


    Feel like part of the team'
- source_sentence: What impact does having one meaningful conversation per week with
    each team member have on high-performance relationships according to Gallup?
  sentences:
  - 'Fewer than one in four U.S. employees feel strongly that their organization cares
    about their wellbeing -- the lowest percentage in nearly a decade.


    This finding has significant implications, as work and life have never been more
    blended and employee wellbeing matters more than ever-- to employees and the resiliency
    of organizations. The discovery is based on a random sample of 15,001 full and
    part-time U.S. employees who were surveyed in February 2022.'
  - has developed an open-access dashboard for more than 80 measures at the county,
    state, and national levels. This data has highlighted, for example, the disproportionate
    impact of COVID-19 on communities of color as well as physical health and behavioral
    health vulnerability to COVID-19.
  - Gallup finds that a manager having one meaningful conversation per week with each
    team member develops high-performance relationships more than any other leadership
    activity. Gallup analytics have found managers can be quickly upskilled to have
    these ongoing strengths-based conversations that bring purpose and clear expectations
    to work, which is now deteriorating in U.S. organizations.
- source_sentence: How does Alexis Krivkovich's perspective as a mother influence
    her optimism about the future of women in the workplace?
  sentences:
  - 'Author(s)


    Jim Harter, Ph.D., is Chief Scientist, Workplace for Gallup and bestselling author
    of Culture Shock, Wellbeing at Work, It''s the Manager, 12: The Elements of Great
    Managing and Wellbeing: The Five Essential Elements. His research is also featured
    in the groundbreaking New York Times bestseller, First, Break All the Rules. Dr.
    Harter has led more than 1,000 studies of workplace effectiveness, including the
    largest ongoing meta-analysis of human potential and business-unit performance.
    His work has also appeared in many publications, including Harvard Business Review,
    The New York Times and The Wall Street Journal, and in many prominent academic
    journals.


    Sangeeta Agrawal contributed analysis to this article.


    Survey Methods'
  - "Learn more about the \nWork Happiness Score at: \ngo.indeed.com/happiness"
  - 'Lucia Rahilly: Sometimes, I feel that we’ve been talking about these issues since
    I was in college, and that can feel discouraging. What are you most optimistic
    about going into 2022, coming out of this Women in the Workplace report?


    Alexis Krivkovich: I’m most optimistic about the fact that we’re having an honest
    conversation, and now with a real fact base. We’re not talking about these things
    as perception but as real and measured experiences that companies can’t hide from—and
    they don’t want to.


    As a mother of three young daughters, it gives me real hope because I’ve been
    thinking about this question for 20 years. But in 20 years, when they’re fully
    in the workplace, maybe we’ll have a totally different paradigm.'
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
- dot_accuracy@1
- dot_accuracy@3
- dot_accuracy@5
- dot_accuracy@10
- dot_precision@1
- dot_precision@3
- dot_precision@5
- dot_precision@10
- dot_recall@1
- dot_recall@3
- dot_recall@5
- dot_recall@10
- dot_ndcg@10
- dot_mrr@10
- dot_map@100
model-index:
- name: SentenceTransformer based on Snowflake/snowflake-arctic-embed-l
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: Unknown
      type: unknown
    metrics:
    - type: cosine_accuracy@1
      value: 0.81
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.93
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.97
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.98
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.81
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.30999999999999994
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.19399999999999995
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09799999999999998
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.81
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.93
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.97
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.98
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.9036533710134148
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.8780952380952383
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.8798376623376624
      name: Cosine Map@100
    - type: dot_accuracy@1
      value: 0.81
      name: Dot Accuracy@1
    - type: dot_accuracy@3
      value: 0.93
      name: Dot Accuracy@3
    - type: dot_accuracy@5
      value: 0.97
      name: Dot Accuracy@5
    - type: dot_accuracy@10
      value: 0.98
      name: Dot Accuracy@10
    - type: dot_precision@1
      value: 0.81
      name: Dot Precision@1
    - type: dot_precision@3
      value: 0.30999999999999994
      name: Dot Precision@3
    - type: dot_precision@5
      value: 0.19399999999999995
      name: Dot Precision@5
    - type: dot_precision@10
      value: 0.09799999999999998
      name: Dot Precision@10
    - type: dot_recall@1
      value: 0.81
      name: Dot Recall@1
    - type: dot_recall@3
      value: 0.93
      name: Dot Recall@3
    - type: dot_recall@5
      value: 0.97
      name: Dot Recall@5
    - type: dot_recall@10
      value: 0.98
      name: Dot Recall@10
    - type: dot_ndcg@10
      value: 0.9036533710134148
      name: Dot Ndcg@10
    - type: dot_mrr@10
      value: 0.8780952380952383
      name: Dot Mrr@10
    - type: dot_map@100
      value: 0.8798376623376624
      name: Dot Map@100
---

# SentenceTransformer based on Snowflake/snowflake-arctic-embed-l

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [Snowflake/snowflake-arctic-embed-l](https://huggingface.co/Snowflake/snowflake-arctic-embed-l). It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [Snowflake/snowflake-arctic-embed-l](https://huggingface.co/Snowflake/snowflake-arctic-embed-l) <!-- at revision 9a9e5834d2e89cdd8bb72b64111dde496e4fe78c -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 1024 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
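
In other words, the BERT backbone produces per-token embeddings, the pooling layer keeps only the `[CLS]` token's vector as the sentence embedding (per `1_Pooling/config.json` in this commit), and the final module L2-normalizes it. The snippet below is a minimal illustrative sketch of those two post-processing steps in PyTorch, not the library's actual implementation; the tensor shapes are assumed for the example:

```python
import torch

def cls_pool_and_normalize(token_embeddings: torch.Tensor) -> torch.Tensor:
    """Illustrative only: CLS pooling followed by L2 normalization.

    token_embeddings: (batch, seq_len, 1024) hidden states from the transformer.
    Returns: (batch, 1024) unit-length sentence embeddings.
    """
    cls = token_embeddings[:, 0]  # pooling_mode_cls_token: keep the first token's vector
    return torch.nn.functional.normalize(cls, p=2, dim=1)  # Normalize() module: unit L2 norm

# Toy check: a batch of 3 "sentences", 16 tokens each.
emb = cls_pool_and_normalize(torch.randn(3, 16, 1024))
print(emb.shape, emb.norm(dim=1))  # torch.Size([3, 1024]), norms all ~1.0
```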

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("CoExperiences/snowflake-l-marketing-tuned")
# Run inference
sentences = [
    "How does Alexis Krivkovich's perspective as a mother influence her optimism about the future of women in the workplace?",
    'Lucia Rahilly: Sometimes, I feel that we’ve been talking about these issues since I was in college, and that can feel discouraging. What are you most optimistic about going into 2022, coming out of this Women in the Workplace report?\n\nAlexis Krivkovich: I’m most optimistic about the fact that we’re having an honest conversation, and now with a real fact base. We’re not talking about these things as perception but as real and measured experiences that companies can’t hide from—and they don’t want to.\n\nAs a mother of three young daughters, it gives me real hope because I’ve been thinking about this question for 20 years. But in 20 years, when they’re fully in the workplace, maybe we’ll have a totally different paradigm.',
    'Learn more about the \nWork Happiness Score at: \ngo.indeed.com/happiness',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
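
For retrieval-style use, note that `config_sentence_transformers.json` in this commit defines a query prompt ("Represent this sentence for searching relevant passages: "), following the arctic-embed convention of prompting queries but not passages. A small usage sketch, assuming the same model id as above and placeholder passages:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("CoExperiences/snowflake-l-marketing-tuned")

# Passages are encoded as-is; queries get the "query" prompt from
# config_sentence_transformers.json prepended automatically via prompt_name.
passages = [
    "Gallup finds that a manager having one meaningful conversation per week with each team member develops high-performance relationships more than any other leadership activity.",
    "Learn more about the Work Happiness Score at: go.indeed.com/happiness",
]
query = "What leadership activity best builds high-performance relationships?"

passage_emb = model.encode(passages)
query_emb = model.encode([query], prompt_name="query")

scores = model.similarity(query_emb, passage_emb)  # cosine similarity (the model's default)
print(scores)  # the first passage should score higher
```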

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Information Retrieval

* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
* The cosine and dot-product rows are identical because the model L2-normalizes its embeddings (the `Normalize` module above), which makes the dot product equal to cosine similarity.

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.81       |
| cosine_accuracy@3   | 0.93       |
| cosine_accuracy@5   | 0.97       |
| cosine_accuracy@10  | 0.98       |
| cosine_precision@1  | 0.81       |
| cosine_precision@3  | 0.31       |
| cosine_precision@5  | 0.194      |
| cosine_precision@10 | 0.098      |
| cosine_recall@1     | 0.81       |
| cosine_recall@3     | 0.93       |
| cosine_recall@5     | 0.97       |
| cosine_recall@10    | 0.98       |
| cosine_ndcg@10      | 0.9037     |
| cosine_mrr@10       | 0.8781     |
| **cosine_map@100**  | **0.8798** |
| dot_accuracy@1      | 0.81       |
| dot_accuracy@3      | 0.93       |
| dot_accuracy@5      | 0.97       |
| dot_accuracy@10     | 0.98       |
| dot_precision@1     | 0.81       |
| dot_precision@3     | 0.31       |
| dot_precision@5     | 0.194      |
| dot_precision@10    | 0.098      |
| dot_recall@1        | 0.81       |
| dot_recall@3        | 0.93       |
| dot_recall@5        | 0.97       |
| dot_recall@10       | 0.98       |
| dot_ndcg@10         | 0.9037     |
| dot_mrr@10          | 0.8781     |
| dot_map@100         | 0.8798     |

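The evaluation script itself is not part of this commit; the sketch below shows how such an evaluation is typically wired up with `InformationRetrievalEvaluator`, with placeholder queries, corpus, and relevance judgments standing in for the actual held-out split:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("CoExperiences/snowflake-l-marketing-tuned")

# Placeholder data: the real evaluation uses the held-out query/passage pairs.
queries = {"q1": "What is the date of the Gallup report regarding employer care for employee wellbeing?"}
corpus = {
    "d1": "March 18, 2022 Gallup. Percent Who Feel Employer Cares About Their Wellbeing Plummets.",
    "d2": "Learn more about the Work Happiness Score at: go.indeed.com/happiness",
}
relevant_docs = {"q1": {"d1"}}  # which corpus ids answer each query

evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs, name="ir-eval")
results = evaluator(model)
print(results)  # accuracy@k, precision@k, recall@k, ndcg@10, mrr@10, map@100
```
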
<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 600 training samples
* Columns: <code>sentence_0</code> and <code>sentence_1</code>
* Approximate statistics based on the first 600 samples:
  |         | sentence_0                                                                         | sentence_1                                                                           |
  |:--------|:-----------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------|
  | type    | string                                                                             | string                                                                               |
  | details | <ul><li>min: 9 tokens</li><li>mean: 20.08 tokens</li><li>max: 39 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 110.85 tokens</li><li>max: 187 tokens</li></ul> |
* Samples:
  | sentence_0 | sentence_1 |
  |:-----------|:-----------|
  | <code>What significant change occurred in employees' perceptions of their employer's care for their wellbeing during the pandemic?</code> | <code>Workplace<br><br>Percent Who Feel Employer Cares About Their Wellbeing Plummets<br><br>Share on LinkedIn<br><br>Share on Twitter<br><br>Share on Facebook<br><br>Share via Email<br><br>Print<br><br>Share on LinkedIn<br><br>Share on Twitter<br><br>Share on Facebook<br><br>Share via Email<br><br>Print<br><br>Workplace<br><br>March 18, 2022<br><br>Percent Who Feel Employer Cares About Their Wellbeing Plummets<br><br>by Jim Harter<br><br>Story Highlights<br><br>Employees' perceptions of their organization caring about their wellbeing drops<br><br>During the onset of the pandemic, employees felt employers had more care and concern<br><br>Employees who feel their employer cares about their wellbeing are 69% less likely to actively search for a job</code> |
  | <code>How does feeling cared for by an employer impact employees' job search behavior?</code> | <code>Workplace<br><br>Percent Who Feel Employer Cares About Their Wellbeing Plummets<br><br>Share on LinkedIn<br><br>Share on Twitter<br><br>Share on Facebook<br><br>Share via Email<br><br>Print<br><br>Share on LinkedIn<br><br>Share on Twitter<br><br>Share on Facebook<br><br>Share via Email<br><br>Print<br><br>Workplace<br><br>March 18, 2022<br><br>Percent Who Feel Employer Cares About Their Wellbeing Plummets<br><br>by Jim Harter<br><br>Story Highlights<br><br>Employees' perceptions of their organization caring about their wellbeing drops<br><br>During the onset of the pandemic, employees felt employers had more care and concern<br><br>Employees who feel their employer cares about their wellbeing are 69% less likely to actively search for a job</code> |
  | <code>What percentage of U.S. employees feel strongly that their organization cares about their wellbeing?</code> | <code>Fewer than one in four U.S. employees feel strongly that their organization cares about their wellbeing -- the lowest percentage in nearly a decade.<br><br>This finding has significant implications, as work and life have never been more blended and employee wellbeing matters more than ever-- to employees and the resiliency of organizations. The discovery is based on a random sample of 15,001 full and part-time U.S. employees who were surveyed in February 2022.</code> |
* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
  ```json
  {
      "loss": "MultipleNegativesRankingLoss",
      "matryoshka_dims": [
          1024,
          768,
          512,
          256,
          128,
          64
      ],
      "matryoshka_weights": [
          1,
          1,
          1,
          1,
          1,
          1
      ],
      "n_dims_per_step": -1
  }
  ```
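
Because training used `MatryoshkaLoss` over dimensions 1024 down to 64, embeddings can plausibly be truncated to a shorter prefix for cheaper storage and search, at some cost in quality. A minimal sketch using the `truncate_dim` argument available in recent sentence-transformers releases:

```python
from sentence_transformers import SentenceTransformer

# Load the model but keep only the first 256 of the 1024 Matryoshka dimensions.
model = SentenceTransformer("CoExperiences/snowflake-l-marketing-tuned", truncate_dim=256)

emb = model.encode(["Fewer than one in four U.S. employees feel strongly that their organization cares about their wellbeing."])
print(emb.shape)  # (1, 256)
```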

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 20
- `per_device_eval_batch_size`: 20
- `num_train_epochs`: 5
- `multi_dataset_batch_sampler`: round_robin

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 20
- `per_device_eval_batch_size`: 20
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 5
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: round_robin

</details>

### Training Logs
| Epoch  | Step | cosine_map@100 |
|:------:|:----:|:--------------:|
| 1.0    | 30   | 0.8782         |
| 1.6667 | 50   | 0.8878         |
| 2.0    | 60   | 0.8854         |
| 3.0    | 90   | 0.8853         |
| 3.3333 | 100  | 0.8845         |
| 4.0    | 120  | 0.8793         |
| 5.0    | 150  | 0.8798         |

### Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.2.0
- Transformers: 4.45.2
- PyTorch: 2.5.0+cu124
- Accelerate: 0.34.2
- Datasets: 3.0.1
- Tokenizers: 0.20.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MatryoshkaLoss
```bibtex
@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
@@ -0,0 +1,25 @@
{
  "_name_or_path": "Snowflake/snowflake-arctic-embed-l",
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
  "transformers_version": "4.45.2",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}
config_sentence_transformers.json ADDED
@@ -0,0 +1,12 @@
{
  "__version__": {
    "sentence_transformers": "3.2.0",
    "transformers": "4.45.2",
    "pytorch": "2.5.0+cu124"
  },
  "prompts": {
    "query": "Represent this sentence for searching relevant passages: "
  },
  "default_prompt_name": null,
  "similarity_fn_name": null
}
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:48bcf8d6bf186737b484b67530fe6e44584e5c969b1e502f5ab2010e00d3383e
size 1336413848
modules.json ADDED
@@ -0,0 +1,20 @@
[
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "sentence_transformers.models.Transformer"
  },
  {
    "idx": 1,
    "name": "1",
    "path": "1_Pooling",
    "type": "sentence_transformers.models.Pooling"
  },
  {
    "idx": 2,
    "name": "2",
    "path": "2_Normalize",
    "type": "sentence_transformers.models.Normalize"
  }
]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
{
  "max_seq_length": 512,
  "do_lower_case": false
}
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
{
  "cls_token": {
    "content": "[CLS]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "mask_token": {
    "content": "[MASK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "[PAD]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "sep_token": {
    "content": "[SEP]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "[UNK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,62 @@
{
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "100": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "101": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "102": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "103": {
      "content": "[MASK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "clean_up_tokenization_spaces": true,
  "cls_token": "[CLS]",
  "do_lower_case": true,
  "mask_token": "[MASK]",
  "max_length": 512,
  "model_max_length": 512,
  "pad_to_multiple_of": null,
  "pad_token": "[PAD]",
  "pad_token_type_id": 0,
  "padding_side": "right",
  "sep_token": "[SEP]",
  "stride": 0,
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "truncation_side": "right",
  "truncation_strategy": "longest_first",
  "unk_token": "[UNK]"
}
vocab.txt ADDED
The diff for this file is too large to render. See raw diff