hbseong Seanie-lee committed on
Commit 2af2c7a · verified · 1 Parent(s): d7efe3a

Upload README.md (#2)

- Upload README.md (7121616c019e3f9fbf2bfc582635ac3be87ca677)


Co-authored-by: Seanie Lee <[email protected]>

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -16,7 +16,7 @@ library_name: transformers
 
 Our model functions as a Guard Model, intended to classify the safety of conversations with LLMs and protect against LLM jailbreak attacks.
 It is fine-tuned from DeBERTa-v3-large and trained using **HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models**.
-The training process involves knowledge distillation paired with data augmentation, using our [**HarmAug Generated Dataset**](https://drive.google.com/drive/folders/1oLUMPauXYtEBP7rvbULXL4hHp9Ck_yqg?usp=drive_link).
+The training process involves knowledge distillation paired with data augmentation, using our [**HarmAug Generated Dataset**].
 
 
 For more information, please refer to our [github](https://github.com/imnotkind/HarmAug)
@@ -44,7 +44,7 @@ model.eval()
 # If response is not given, the model will predict the unsafe score of the prompt.
 # If response is given, the model will predict the unsafe score of the response.
 def predict(model, prompt, response=None):
-    device = model.device()
+    device = model.device
     if response == None:
         inputs = tokenizer(prompt, return_tensors="pt")
     else:
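
Note on the second hunk: the change from `model.device()` to `model.device` matters because `device` on a transformers model is exposed as a property, so reading it already yields the device object, and calling that object fails. The sketch below illustrates the same behaviour with a hypothetical stand-in class (`StandInModel` and `describe_device_access` are illustrative names, not part of the README or of transformers), so it runs without any model weights.

```python
class StandInModel:
    """Hypothetical stand-in that exposes `device` as a read-only property,
    mirroring how the README's model is accessed after this commit."""

    def __init__(self, device_name="cpu"):
        self._device_name = device_name

    @property
    def device(self):
        # Attribute-style access returns the value directly.
        return self._device_name


def describe_device_access(model):
    """Return the property value and what happens when it is called like a method."""
    attribute_value = model.device  # the form used after the fix
    try:
        model.device()  # the form used before the fix
        call_outcome = "call succeeded"
    except TypeError as err:
        # The property's value is not callable, so the call raises TypeError.
        call_outcome = f"TypeError: {err}"
    return attribute_value, call_outcome


value, outcome = describe_device_access(StandInModel())
print(value)
print(outcome)
```

With the stand-in, attribute access returns the device name, while the call form raises a `TypeError`, which is the kind of failure the old line would be expected to produce inside `predict`.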