hbseong committed
Commit 8f8c7e2 · verified · 1 Parent(s): d5fc498

Create README.md

Files changed (1)
  1. README.md +27 -0
README.md ADDED
@@ -0,0 +1,27 @@
+ ---
+ tags:
+ - deberta-v3
+ - deberta
+ - deberta-v2
+ license: mit
+ base_model:
+ - microsoft/deberta-v3-large
+ pipeline_tag: text-classification
+ ---
+
+ # HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models
+
+
+
+ This model is a safety guard model, intended to classify the safety of conversations with LLMs and to protect against LLM jailbreak attacks.
+ It is fine-tuned from DeBERTa-v3-large using the method described in **HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models**.
+ Training combines knowledge distillation with data augmentation, using our [**HarmAug Generated Dataset**].
+
+
+ For more information, please refer to our [GitHub repository](https://github.com/imnotkind/HarmAug).
+
+
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/66f7bee63c7ffa79319b053b/bCNW62CvDpqbXUK4eZ4-b.png)
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/66f7bee63c7ffa79319b053b/REbNDOhT31bv_XRa6-VzE.png)
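
The README says the model is trained with knowledge distillation but does not spell out the objective. As a rough illustration only, the sketch below shows one common form of soft-label distillation for a binary safe/unsafe classifier: the student is penalized both against the ground-truth label and against a teacher's predicted probability. The function names, the `alpha` mixing weight, and the choice of cross-entropy for both terms are assumptions made for this example; they are not taken from the HarmAug code or paper.

```python
import math


def sigmoid(z: float) -> float:
    """Map a raw logit to a probability, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def binary_cross_entropy(p_target: float, p_pred: float, eps: float = 1e-12) -> float:
    """Cross-entropy between a target Bernoulli(p_target) and a predicted Bernoulli(p_pred)."""
    # Clamp the prediction so log() never receives 0.
    p_pred = min(max(p_pred, eps), 1.0 - eps)
    return -(p_target * math.log(p_pred) + (1.0 - p_target) * math.log(1.0 - p_pred))


def distillation_loss(
    student_logit: float,
    teacher_prob: float,
    hard_label: int,
    alpha: float = 0.5,  # hypothetical mixing weight, not a value from the source
) -> float:
    """Mix a hard-label term with a teacher soft-label term for one example.

    student_logit: raw "unsafe" logit from the small student model.
    teacher_prob:  probability of "unsafe" assigned by the larger teacher model.
    hard_label:    1 for unsafe, 0 for safe.
    """
    student_prob = sigmoid(student_logit)
    hard_term = binary_cross_entropy(float(hard_label), student_prob)
    soft_term = binary_cross_entropy(teacher_prob, student_prob)
    return alpha * hard_term + (1.0 - alpha) * soft_term


if __name__ == "__main__":
    # A student that agrees with the teacher should incur a smaller loss
    # than one that confidently disagrees.
    agree = distillation_loss(student_logit=3.0, teacher_prob=0.95, hard_label=1)
    disagree = distillation_loss(student_logit=-3.0, teacher_prob=0.95, hard_label=1)
    print(f"agree={agree:.4f} disagree={disagree:.4f}")
```

In this sketch, augmented prompts would simply be additional training examples whose `teacher_prob` comes from the teacher model; how the actual pipeline labels and weights them is described in the linked repository, not here.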