JoshELambert committed
Commit e4db5cc · verified · 1 parent: 546c232

Add SetFit model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
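The pooling config above enables only `pooling_mode_mean_tokens`, so each sentence embedding is the average of its token embeddings. The following is a minimal pure-Python sketch of that idea, assuming padded positions are excluded through an attention mask. The function name `mean_pool` and the toy vectors are hypothetical; the real library operates on 768-dimensional tensors rather than lists.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average token vectors, ignoring padded positions (mask == 0)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    kept = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            kept += 1
            for i, value in enumerate(vector):
                summed[i] += value
    kept = max(kept, 1)  # avoid division by zero for an all-padding input
    return [s / kept for s in summed]


# Toy input: two real tokens and one padding token (2 dimensions instead of 768).
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
sentence_embedding = mean_pool(tokens, mask)
print(sentence_embedding)
```

The padding vector does not influence the result because its mask entry is 0.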
README.md ADDED
@@ -0,0 +1,301 @@
+ ---
+ tags:
+ - setfit
+ - sentence-transformers
+ - text-classification
+ - generated_from_setfit_trainer
+ widget:
8
+ - text: State and federal lawmakers are promising to improve conditions for hundreds
9
+ of foreign fishermen working in Hawaii's commercial fleet, and at least one company
10
+ has already stopped buying fish from the boats following an Associated Press investigation
11
+ that found the men have been confined to vessels for years without basic labor
12
+ protections.Whole Foods halted buying seafood caught by foreign crew until it's
13
+ clear the men are treated fairly. On Sunday, the Hawaii Seafood Council said that
14
+ starting Oct. 1, the Honolulu Fish Auction will sell fish only from boats that
15
+ have adopted a new, standardized contract aimed at assuring no forced labor exists
16
+ on board.The AP report found commercial fishing boats in Honolulu were crewed
17
+ by men from impoverished Southeast Asia and Pacific Island nations who catch prized
18
+ swordfish, ahi tuna and other seafood sold at markets and upscale restaurants
19
+ across the country. A legal loophole allows them to work on the American-owned,
20
+ American-flagged boats without visas as long as they don't set foot on shore.
21
+ The system is facilitated by the U.S. Coast Guard and Customs and Border Protection.
22
+ While many men appreciate the jobs, which pay better than they could get back
23
+ home, the report revealed instances of human trafficking, tuberculosis and food
24
+ shortages. It also found some fishermen being forced to defecate
25
+ - text:  Trinidad and Tobago is a destination, transit, and possible source country
26
+ for adults and children subjected to sex trafficking and forced labor Women and
27
+ girls from the Dominican Republic, Guyana, Venezuela, and Colombia are subjected
28
+ to sex trafficking in brothels and clubs, with young women from Venezuela especially
29
+ vulnerable. Economic migrants from the Caribbean region, especially Guyana, and
30
+ from Asia are vulnerable to forced labor Victims have been subjected to forced
31
+ labor in domestic service and the retail sector Immigration officials note an
32
+ increase in international criminal organizations' involvement in trafficking,
33
+ and NGOs report young boys are coerced to sell drugs and guns. In a break with
34
+ common practice, some traffickers have recently allowed victims to keep their
35
+ passports, removing a common indicator of human trafficking in an attempt to avoid
36
+ detection. Many other traffickers continue to confiscate victims' passports and
37
+ travel documents. Economic migrants who lack legal status may be exposed to various
38
+ forms of exploitation and abuse indicative of trafficking. Trinidad and Tobago
39
+ experiences a steady flow of vessels transiting its territorial waters, some of
40
+ which may be engaged in illicit and illegal activities, including forced labor
41
+ in the fishing industry. Complicity by police and immigration officials in trafficking
42
+ crimes impeded anti-trafficking efforts. Law enforcement and civil society reported
43
+ - text: icked onto fishing boats. In early 2013, an organization that assists victims
44
+ in Cambodia assessed this form of trafficking was rising. Cambodian and Burmese
45
+ workers are increasingly unwilling to work in the Thai fishing industry due to
46
+ dangerous work conditions and isolation, which makes them more vulnerable to exploitation;
47
+ the Government of Thailand announced plans during the year to import Bangladeshi
48
+ workers to ill the labor shortage this has caused. During the year, therewere
49
+ reports that some Rohingya asylum seekers from Burma were smuggled into Thailand
50
+ en route to Malaysia and ultimately sold into forced labor, allegedly with the
51
+ assistance of Thai civilian and military officials.Observers noted that traffickers
52
+ (including labor brokers) who bring foreign victims into Thailand generally work
53
+ as individuals or in unorganized groups, while those who exploit Thai victims
54
+ abroad tend to be more organized. Labor brokers, largely unregulated, serve as
55
+ intermediaries between jobseekers and employers; some facilitate or engage in
56
+ human trafficking. Brokers are reportedly of both Thai and foreign origin and
57
+ work in networks, collaborating with employers and attimes with corrupt law enforcement
58
+ officials. Foreign migrants, members of ethnic minorities, and stateless persons
59
+ in Thailand are at the greatest risk of being trafficked, and they experience
60
+ the withholding of travel documents
61
+ - text: ' in their own villages by debt bondage or born ito slavery, work in construction,
62
+ textiles, brick-making, mines, fish and prawn processing and hospitality.Russia490,000
63
+ - 540,000Migrant workers endure extortion and physical abuse; anecdotal evidence
64
+ suggests that forced labour camps still operate in Siberia.China2,800,000 - 3,100,000Severe
65
+ forced labour in brick kilns in the north; forced labour in modern industries
66
+ including fashion and computer supply chains.Myanmar360,000 - 400,000Slavery includes
67
+ reports of deceptive recruitment of women for sale as brides in China, forced
68
+ labour of adults on plantations and in industry and forced labour of children
69
+ in tea shops, home industries and as beggars.Thailand450,000 - 500,000An explosion
70
+ in global demand for seafood has led to an increased need for cheap migrant labour,
71
+ including on fishing boats. High numbers of children are exploited, particulary
72
+ those from ethnic minorities and hill tribes.SOURCE: THE GLOBAL SLAVERY INDEX
73
+ 2013'
74
+ - text: The number of Cambodians recently found in Indonesia after being trafficked
75
+ onto Thai fishing vessels has risen to 230, the Ministry of Foreign Affairs said
76
+ in a press statement released yesterday. The ministry confirmed that, following
77
+ an investigation by Indonesian authorities along with Cambodian Embassy personnel,
78
+ an additional 31 fishermen were rescued from Ambon Island over the last week,
79
+ adding to the 199 discovered last Friday. The men were reportedly trafficked to
80
+ work on the Thai vessels for years before Indonesian authorities managed to rescue
81
+ them.Ministry spokesman Koy Kuong said Cambodian officials visited the island
82
+ from May 30 to June 3 to check on the men's conditions, adding that the owner
83
+ of the Thai fishing boats have paid the workers their salary and have agreed to
84
+ pay for a charter flight from Ambon to Phnom Penh.They have agreed in principle,
85
+ and now they are processing the procedure to ensure that these people to return
86
+ sometime this month, he said.International Organisation for Migration project
87
+ manager Paul Dillon said IOM staff had joined a small mission from the Ministry
88
+ of Fisheries and Oceans [Thursday] at the Indonesian government's request on a
89
+ fact-finding mission .â(EURO)°.â(EURO)°. to identify possible
90
+ metrics:
91
+ - accuracy
92
+ pipeline_tag: text-classification
93
+ library_name: setfit
94
+ inference: true
95
+ base_model: sentence-transformers/paraphrase-mpnet-base-v2
96
+ model-index:
97
+ - name: SetFit with sentence-transformers/paraphrase-mpnet-base-v2
98
+ results:
99
+ - task:
100
+ type: text-classification
101
+ name: Text Classification
102
+ dataset:
103
+ name: Unknown
104
+ type: unknown
105
+ split: test
106
+ metrics:
107
+ - type: accuracy
108
+ value: 1.0
109
+ name: Accuracy
110
+ ---
111
+
112
+ # SetFit with sentence-transformers/paraphrase-mpnet-base-v2
113
+
114
+ This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.
115
+
116
+ The model has been trained using an efficient few-shot learning technique that involves:
117
+
118
+ 1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
119
+ 2. Training a classification head with features from the fine-tuned Sentence Transformer.
120
+
121
+ ## Model Details
122
+
123
+ ### Model Description
124
+ - **Model Type:** SetFit
125
+ - **Sentence Transformer body:** [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2)
126
+ - **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
127
+ - **Maximum Sequence Length:** 512 tokens
128
+ - **Number of Classes:** 2 classes
129
+ <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
130
+ <!-- - **Language:** Unknown -->
131
+ <!-- - **License:** Unknown -->
132
+
133
+ ### Model Sources
134
+
135
+ - **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
136
+ - **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
137
+ - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
138
+
139
+ ### Model Labels
140
+ | Label | Examples |
141
+ |:------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+ | 1 | <ul><li>' due diligence and weak monitoring control, surveillance and enforcement systems by coastal, flag, and port states," it noted.Human rights abuses are also driven by an array of other factors, it added. The Thai fishing industry is structurally dependent on unskilled workers, a result of a failure to invest in technology to increase labour productivity as well as an abundance of cheap migrant workers from the less-developed neighbours.At the same time, vessel operators face a chronic shortage of workers -- a deficit estimated by the National Fishing Association of Thailand (NFAT) to be as high as 50,000."Combined with economic pressure arising from the degradation of marine resources in the Thai exclusive economic zone (EEZ), these factors shape the prevalence of labour abuses and the use of trafficking, forced and bonded labour in the Thai fishing industry."As migrant workers from Myanmar, Cambodia, Laos and undeveloped rural regions of Thailand, particularly the Northeast, are trafficked through Indonesia, both countries must collaborate further to address abuses, said Mark Dia, Greenpeace\'s Regional Oceans Campaign coordinator for Southeast Asia.Greenpeace has worked with the Indonesian government on the matter while the Thai military government has already taken action to address chronic problems facing the fishing industry for many years'</li><li>' for all forms of trafficking, including forced and bonded labour, respecting due process.Forced labour constitutes India\'s largest trafficking problem; men, women, and children in debt bondage- sometimes inherited from previous generations- are forced to work in brick kilns, rice mills, agriculture, and embroidery units, it said.The majority of India\'s trafficking problem is internal, and those from the most disadvantaged social strata- Dalits, members of tribal communities, religious minorities, and women and girls from excluded groups- are most vulnerable, it added."Within India, some are subjected to forced labour in sectors such as construction, steel, and textile industries; wire manufacturing for underground cables; biscuit factories; pickling; floriculture; fish farms; and ship breaking," said the State Department.Thousands of unregulated work placement agencies reportedly lure adults and children under false promises of employment for sex trafficking or forced labour, including domestic servitude.In addition to bonded labour, some children are subjected to forced labour as factory and agricultural workers, domestic servants, and beggars. Begging ringleaders sometimes maim children to earn more money."Some NGOs and media report girls are sold and forced to conceive and deliver babies for sale. Conditions amounting to'</li><li>" supplies.\xa0Americans buying Hawaiian seafood are almost certainly eating fish caught by one of these workers.'We want the same standards as the other workers in America, but we are just small people working there,' said fisherman Syamsul Maarif, who didn't get paid for four months.\xa0He was sent back to his Indonesian village after nearly dying at sea when his Hawaiian boat sank earlier this year.Because they have no visas, the men can't fly into Hawaii, so they're brought by boat.\xa0And since they are not technically in the country, they're at the mercy of their American captains on American-flagged, American-owned vessels, catching choice swordfish and ahi tuna that can fetch more than $1,000 apiece.\xa0The entire system contradicts other state and federal laws, yet operates with the blessing of U.S. officials and law enforcement.'People say these fishermen can't leave their boats, they're like captives,' said U.S. Attorney Florence Nakakuni in Hawaii.\xa0'But they don't have visas, so they can't leave their boat, really.'Each of the roughly 140 boats in the fleet docks about once every three weeks, occasionally at ports"</li></ul> |
+ | 0 | <ul><li>' forced marriage and other such exploitation. In a DW interview, Fiona David, executive director of global research at the Walk Free Foundation, calls on the countries in the region to step up their efforts to combat the problem and put in place the mechanisms that would require businesses to focus on the issues of slavery and forced labor throughout their supply chains. DW: According to the 2016 Global Slavery Index, nearly two-thirds of modern slaves are living in Asian countries. What are the reasons behind the high prevalence of this phenomenon in the region? Fiona David: The Asia-Pacific is the most populous region in the world, and it is also well integrated into the global supply chains. We do estimate that about two-thirds of the nearly 46 million people trapped in slavery are in Asia. And we see all forms of modern slavery in the region, such as forced labor in brick kilns, child beggars in Afghanistan and India, bonded labor in the agricultural as well as garment sectors. Given its population size and integration into global value chains, the Asia-Pacific is a region where a lot of low cost labor is made available to produce the goods and services that we all consume. What kind of living and working conditions do these people find themselves in? They experience miserable'</li><li>" going to cook those up for\xa0\xa0\xa0\xa0\xa0 dinner?' I said, no, that's bait for fishing'. He thought we were\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0 going to cook and eat those frozen prawns!\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0Signs to look for when fish are biting, as well as learning about how to catch and name fish, are among other activities, as is counting and comparing the yield, talking about how many fish should be cooked on the spot, or taken home to family members. In addition, there is the preparation and cooking of the fish once caught: gutting and singeing it when required, making a coal fire, building a bush oven, timing the bake, turning the fish, and finding leaves and bark from local trees to plate cooked fish (Toussaint 2010; Toussaint et al. 2005; Yu 2006). The eating and sharing of fish, enjoying its tastiness, or comparing the quality of one fish to another, marks the culmination of a good family day, a point I develop below. All of these activities, whether seen on their own as individual parts, or brought together as a whole, reveal their experiential importance to those families who are directly involved, as well"</li><li>"8 million in modern slaveryPakistan2,000,000 - 2,200,000Bonded labour affects men, women and children largely from rural areas who travel to cities to find work, and has been reported in many industries, primarily brick kilns, but also in agriculture, fisheries and mining.Ethiopia620,000 - 680,000Domestic workers travelling under illegal private employment agencies are particularly vulnerable as are girls who can be subjected to child marriage.Nigeria670,000 - 740,000An estimated 15.88% of the estimated total 29.6 million people in modern slavery are in Sub-Saharan Africa.Bangladesh330,000 - 360,000Large numbers of women and girls are reportedly trafficked to India and Pakistan annually and children, including boys, are exploited and trafficked for sex and labour.Democratic Republic of Congo440,000 - 490,000One of the world's poorest countries, despite a wealth of resources; 90% of men working in mines in eastern DRC are trapped by debt bondage.India13,300,000 - 14,700,000Men, women and children, many enslaved"</li></ul> |
+
+ ## Evaluation
+
+ ### Metrics
+ | Label   | Accuracy |
+ |:--------|:---------|
+ | **all** | 1.0      |
+
+ ## Uses
+
+ ### Direct Use for Inference
+
+ First install the SetFit library:
+
+ ```bash
+ pip install setfit
+ ```
+
+ Then you can load this model and run inference.
+
+ ```python
+ from setfit import SetFitModel
+
+ # Download from the 🤗 Hub
+ model = SetFitModel.from_pretrained("JoshELambert/forced-labor")
+ # Run inference
+ preds = model(" in their own villages by debt bondage or born ito slavery, work in construction, textiles, brick-making, mines, fish and prawn processing and hospitality.Russia490,000 - 540,000Migrant workers endure extortion and physical abuse; anecdotal evidence suggests that forced labour camps still operate in Siberia.China2,800,000 - 3,100,000Severe forced labour in brick kilns in the north; forced labour in modern industries including fashion and computer supply chains.Myanmar360,000 - 400,000Slavery includes reports of deceptive recruitment of women for sale as brides in China, forced labour of adults on plantations and in industry and forced labour of children in tea shops, home industries and as beggars.Thailand450,000 - 500,000An explosion in global demand for seafood has led to an increased need for cheap migrant labour, including on fishing boats. High numbers of children are exploited, particulary those from ethnic minorities and hill tribes.SOURCE: THE GLOBAL SLAVERY INDEX 2013")
+ ```
+
+ <!--
+ ### Downstream Use
+
+ *List how someone could finetune this model on their own dataset.*
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Set Metrics
+ | Training set | Min | Median   | Max |
+ |:-------------|:----|:---------|:----|
+ | Word count   | 50  | 189.8442 | 221 |
+
+ | Label | Training Sample Count |
+ |:------|:----------------------|
+ | 0     | 8                     |
+ | 1     | 69                    |
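The label counts above are strongly skewed toward label 1, which matters when reading the reported accuracy. As a rough sketch, the arithmetic below computes the share of the majority class in the training set. The test-set distribution is not reported in the model card, so treating this share as a baseline for the test split is only an assumption.

```python
# Label counts copied from the "Training Sample Count" table.
label_counts = {0: 8, 1: 69}

total = sum(label_counts.values())
majority_label = max(label_counts, key=label_counts.get)
majority_share = label_counts[majority_label] / total
imbalance_ratio = max(label_counts.values()) / min(label_counts.values())

print(f"total training samples: {total}")
print(f"majority label: {majority_label} ({majority_share:.1%} of samples)")
print(f"imbalance ratio: {imbalance_ratio:.3f} to 1")
```

A classifier that always predicted label 1 would already be right on about nine in ten training samples, so per-class metrics would be more informative than accuracy alone if the test split is similarly skewed.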
+
+ ### Training Hyperparameters
+ - batch_size: (16, 16)
+ - num_epochs: (4, 4)
+ - max_steps: -1
+ - sampling_strategy: oversampling
+ - body_learning_rate: (2e-05, 1e-05)
+ - head_learning_rate: 0.01
+ - loss: CosineSimilarityLoss
+ - distance_metric: cosine_distance
+ - margin: 0.25
+ - end_to_end: False
+ - use_amp: False
+ - warmup_proportion: 0.1
+ - l2_weight: 0.01
+ - seed: 42
+ - eval_max_steps: -1
+ - load_best_model_at_end: True
+
+ ### Training Results
+ | Epoch  | Step | Training Loss | Validation Loss |
+ |:------:|:----:|:-------------:|:---------------:|
+ | 0.0033 | 1    | 0.1808        | -               |
+ | 0.1629 | 50   | 0.1363        | -               |
+ | 0.3257 | 100  | 0.0103        | -               |
+ | 0.4886 | 150  | 0.0019        | -               |
+ | 0.6515 | 200  | 0.0005        | -               |
+ | 0.8143 | 250  | 0.0001        | -               |
+ | 0.9772 | 300  | 0.0           | -               |
+ | 1.0    | 307  | -             | 0.0407          |
+ | 1.1401 | 350  | 0.0001        | -               |
+ | 1.3029 | 400  | 0.0           | -               |
+ | 1.4658 | 450  | 0.0           | -               |
+ | 1.6287 | 500  | 0.0           | -               |
+ | 1.7915 | 550  | 0.0           | -               |
+ | 1.9544 | 600  | 0.0           | -               |
+ | 2.0    | 614  | -             | 0.0272          |
+ | 2.1173 | 650  | 0.0           | -               |
+ | 2.2801 | 700  | 0.0           | -               |
+ | 2.4430 | 750  | 0.0           | -               |
+ | 2.6059 | 800  | 0.0           | -               |
+ | 2.7687 | 850  | 0.0           | -               |
+ | 2.9316 | 900  | 0.0           | -               |
+ | 3.0    | 921  | -             | 0.0238          |
+ | 3.0945 | 950  | 0.0           | -               |
+ | 3.2573 | 1000 | 0.0           | -               |
+ | 3.4202 | 1050 | 0.0           | -               |
+ | 3.5831 | 1100 | 0.0           | -               |
+ | 3.7459 | 1150 | 0.0           | -               |
+ | 3.9088 | 1200 | 0.0           | -               |
+ | 4.0    | 1228 | -             | 0.0227          |
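The Epoch column in the table above can be reproduced from the Step column, since epoch 1.0 is logged at step 307. The short check below is a sketch of that relationship. The helper name `epoch_at` is hypothetical, and the pair-count bound assumes every step consumes one full batch of 16, which the model card does not state explicitly.

```python
STEPS_PER_EPOCH = 307  # step at which epoch 1.0 is logged in the table
BATCH_SIZE = 16        # embedding-phase batch size from the hyperparameters


def epoch_at(step: int, steps_per_epoch: int = STEPS_PER_EPOCH) -> float:
    """Convert a global step into a fractional epoch, rounded as in the table."""
    return round(step / steps_per_epoch, 4)


for step in (1, 50, 100, 150, 307, 1228):
    print(step, epoch_at(step))

# Upper bound on training pairs seen per epoch, assuming full batches.
max_pairs_per_epoch = STEPS_PER_EPOCH * BATCH_SIZE
print(max_pairs_per_epoch)
```

The training loss falls to 0.0 before the first epoch ends while the validation loss keeps shrinking slowly (0.0407 to 0.0227), so the later epochs change little on the training pairs themselves.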
+
+ ### Framework Versions
+ - Python: 3.10.12
+ - SetFit: 1.1.0
+ - Sentence Transformers: 3.3.1
+ - Transformers: 4.42.2
+ - PyTorch: 2.5.1+cu121
+ - Datasets: 3.2.0
+ - Tokenizers: 0.19.1
+
+ ## Citation
+
+ ### BibTeX
+ ```bibtex
+ @article{https://doi.org/10.48550/arxiv.2209.11055,
+     doi = {10.48550/ARXIV.2209.11055},
+     url = {https://arxiv.org/abs/2209.11055},
+     author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
+     keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
+     title = {Efficient Few-Shot Learning Without Prompts},
+     publisher = {arXiv},
+     year = {2022},
+     copyright = {Creative Commons Attribution 4.0 International}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "_name_or_path": "sentence-transformers/paraphrase-mpnet-base-v2",
+   "architectures": [
+     "MPNetModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "eos_token_id": 2,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "mpnet",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 1,
+   "relative_attention_num_buckets": 32,
+   "torch_dtype": "float32",
+   "transformers_version": "4.42.2",
+   "vocab_size": 30527
+ }
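A few derived quantities follow directly from the config above. The sketch below computes them with plain arithmetic. The position-offset line assumes the RoBERTa-style convention in which position indices start after `pad_token_id`; that convention is an assumption here, not something stated in this commit, though it would explain why `max_position_embeddings` is 514 while the sentence-transformers sequence limit is 512.

```python
# Values copied from config.json above.
config = {
    "hidden_size": 768,
    "num_attention_heads": 12,
    "intermediate_size": 3072,
    "max_position_embeddings": 514,
    "pad_token_id": 1,
}

# Each attention head works on an equal slice of the hidden vector.
head_dim = config["hidden_size"] // config["num_attention_heads"]

# Feed-forward expansion factor relative to the hidden size.
ffn_ratio = config["intermediate_size"] // config["hidden_size"]

# Assumption: positions 0..pad_token_id are reserved, leaving the rest usable.
usable_positions = config["max_position_embeddings"] - (config["pad_token_id"] + 1)

print(head_dim, ffn_ratio, usable_positions)
```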
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.3.1",
+     "transformers": "4.42.2",
+     "pytorch": "2.5.1+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
config_setfit.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "normalize_embeddings": false,
+   "labels": null
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:35dd22aab581d37d351bff3ac96d54a2cc54c4a05b7aeca71654fbf5b38e130a
+ size 437967672
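The three lines above are a Git LFS pointer, not the weights themselves: they record the spec version, a SHA-256 digest, and the byte size of the real file. The sketch below parses such a pointer and uses the size for a rough parameter estimate. The helper `parse_lfs_pointer` is hypothetical, and the estimate assumes every stored value is a 4-byte float32 (as `torch_dtype` in config.json suggests) and ignores the safetensors header, so it is only approximate.

```python
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:35dd22aab581d37d351bff3ac96d54a2cc54c4a05b7aeca71654fbf5b38e130a
size 437967672
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into its key/value fields."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer = parse_lfs_pointer(POINTER)

# Rough estimate: 4 bytes per float32 value, header overhead ignored.
approx_params = pointer["size_bytes"] // 4

print(pointer["hash_algorithm"], pointer["size_bytes"])
print(f"about {approx_params / 1e6:.1f} million float32 values")
```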
model_head.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:41aec1f7602d3f89ae5e5a655952c4bf778331f81d203fd6aa1c55f4d18b1be6
+ size 7007
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
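modules.json lists the pipeline stages in order: a Transformer module stored at the repository root (empty `path`) followed by a Pooling module stored in `1_Pooling`. The sketch below only reads that JSON and prints the resulting order. It does not load any model, and the variable names are illustrative rather than part of the sentence-transformers API.

```python
import json

MODULES_JSON = """
[
  {"idx": 0, "name": "0", "path": "", "type": "sentence_transformers.models.Transformer"},
  {"idx": 1, "name": "1", "path": "1_Pooling", "type": "sentence_transformers.models.Pooling"}
]
"""

# Sort by idx so the stages are applied in the declared order.
modules = sorted(json.loads(MODULES_JSON), key=lambda module: module["idx"])

pipeline = []
for module in modules:
    class_name = module["type"].rsplit(".", 1)[-1]
    folder = module["path"] or "."  # an empty path means the repository root
    pipeline.append((class_name, folder))
    print(f"stage {module['idx']}: {class_name} (config in {folder})")
```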
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "cls_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "<mask>",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,59 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "104": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "30526": {
+       "content": "<mask>",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "<s>",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "eos_token": "</s>",
+   "mask_token": "<mask>",
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_token": "<pad>",
+   "sep_token": "</s>",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "MPNetTokenizer",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff