RuBERTConv Toxic Editor

Model description

A token-tagging model for text detoxification, based on rubert-base-cased-conversational.

The model assigns each token one of four classes:

  • Equal = keep the tokens unchanged
  • Replace = replace the tokens with a mask
  • Delete = remove the tokens
  • Insert = insert a mask before the tokens

Use it together with a mask filler, which fills in the masks produced by the Replace and Insert tags.
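The four classes can be read as edit operations over a token sequence. The following is a minimal sketch of that reading, not the author's implementation: the function name `apply_tags`, the lowercase tag strings, and the `[MASK]` token are all assumptions made for illustration. Check the model's own label names and the mask filler's mask token before relying on it.

```python
MASK = "[MASK]"  # assumed mask token; use the one your mask filler expects


def apply_tags(tokens, tags, mask=MASK):
    """Apply one edit tag per token and return the masked token list.

    Tag names are assumed to match the four classes described above.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    result = []
    for token, tag in zip(tokens, tags):
        tag = tag.lower()
        if tag == "equal":
            result.append(token)          # keep the token unchanged
        elif tag == "replace":
            result.append(mask)           # token becomes a mask
        elif tag == "delete":
            continue                      # token is dropped
        elif tag == "insert":
            result.extend([mask, token])  # mask goes before the kept token
        else:
            raise ValueError(f"unknown tag: {tag!r}")
    return result


# Placeholder tokens, one tag of each kind.
masked = apply_tags(
    ["w1", "w2", "w3", "w4"],
    ["equal", "replace", "delete", "insert"],
)
print(" ".join(masked))  # w1 [MASK] [MASK] w4
```

The resulting template is what a mask filler would then complete.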

Intended uses & limitations

How to use

Colab: link

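import torch
from transformers import pipeline

tagger_model_name = "IlyaGusev/rubertconv_toxic_editor"

# Use the first GPU if one is available, otherwise run on CPU (-1).
device = "cuda" if torch.cuda.is_available() else "cpu"
device_num = 0 if device == "cuda" else -1

# Token-classification pipeline; "max" aggregation merges sub-word pieces
# into words and labels each word with its highest-scoring class.
tagger_pipe = pipeline(
    "token-classification",
    model=tagger_model_name,
    tokenizer=tagger_model_name,
    framework="pt",
    device=device_num,
    aggregation_strategy="max",
)

text = "..."
# The pipeline takes a list of texts and returns one prediction list per text.
tagger_predictions = tagger_pipe([text], batch_size=1)
sample_predictions = tagger_predictions[0]
# Each prediction is a dict with entity_group, score, word, start and end.
print(sample_predictions)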
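To pass the tagger output to a mask filler, the predicted spans have to be turned back into a string. The sketch below does this with the `entity_group`, `start` and `end` fields that the transformers token-classification pipeline returns when aggregation is enabled. The helper name `mask_text`, the lowercase label names, and the `[MASK]` token are assumptions, and the code expects character offsets to be present (they can be missing with a non-fast tokenizer).

```python
def mask_text(text, predictions, mask="[MASK]"):
    """Rebuild `text` as a masked template from aggregated tagger spans.

    `predictions` is a list of dicts with `entity_group`, `start`, `end`.
    Label names and the mask token are assumptions, not confirmed values.
    """
    pieces = []
    cursor = 0
    for pred in sorted(predictions, key=lambda p: p["start"]):
        start, end = pred["start"], pred["end"]
        label = pred["entity_group"].lower()
        pieces.append(text[cursor:start])  # untagged gap, usually whitespace
        span = text[start:end]
        if label == "equal":
            pieces.append(span)
        elif label == "replace":
            pieces.append(mask)
        elif label == "delete":
            pass  # drop the span
        elif label == "insert":
            pieces.append(mask + " " + span)
        else:
            raise ValueError(f"unknown label: {label!r}")
        cursor = end
    pieces.append(text[cursor:])  # trailing text after the last span
    # Collapse the double spaces left behind by deleted spans.
    return " ".join("".join(pieces).split())


# Hand-written spans in the pipeline's output shape, for illustration only.
example_text = "aa bb cc dd"
example_predictions = [
    {"entity_group": "equal", "start": 0, "end": 2},
    {"entity_group": "replace", "start": 3, "end": 5},
    {"entity_group": "delete", "start": 6, "end": 8},
    {"entity_group": "insert", "start": 9, "end": 11},
]
print(mask_text(example_text, example_predictions))  # aa [MASK] [MASK] dd
```

With real output, the call would be `mask_text(text, sample_predictions)`, after confirming the label strings the model actually emits.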

Training data

Training procedure

Eval results

TBA
