license: mit
language:
  - ru
metrics:
  - f1
library_name: transformers
tags:
  - russian
  - conversation
  - chats
  - embeddings
  - coherence

Model Card

This model predicts whether two messages from a multi-party group chat can be linked by a reply_to relationship, i.e. whether the second message could be a reply to the first.

Training details

The model is based on Conversational RuBERT (cased, 12-layer, 768-hidden, 12-heads, 180M parameters), which was trained on several social media datasets. We fine-tuned it on data from several Telegram chats. Positive reply_to examples come from the users' own reply annotations in those chats; negative examples were obtained by shuffling the messages. Because the task maps directly onto Next Sentence Prediction (NSP), we fine-tuned the model with the NSP objective.
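
The pair construction described above can be sketched roughly as follows. This is a minimal illustration, not the authors' actual pipeline: the message layout (`id`, `text`, `reply_to`) and the helper name `build_pairs` are hypothetical, and the exact shuffling scheme used for training is not specified in this card. Labels follow the NSP convention, with 0 for a real reply pair and 1 for a shuffled one.

```python
import random

def build_pairs(messages, seed=0):
    """Build (first_text, second_text, label) triples from chat messages.

    `messages` is a list of dicts with hypothetical keys `id`, `text`, and
    `reply_to` (the id of the message being answered, or None).
    Label 0 = real reply pair, label 1 = shuffled (negative) pair.
    """
    rng = random.Random(seed)
    by_id = {m["id"]: m for m in messages}

    # Positive pairs come straight from the users' own reply_to annotations.
    positives = [
        (by_id[m["reply_to"]]["text"], m["text"], 0)
        for m in messages
        if m["reply_to"] in by_id
    ]

    # Negative pairs: keep the parent message, swap in a random other message
    # that is not one of its annotated replies.
    negatives = []
    for m in messages:
        if m["reply_to"] not in by_id:
            continue
        parent = by_id[m["reply_to"]]
        candidates = [
            c for c in messages
            if c["id"] not in (m["id"], parent["id"])
            and c["reply_to"] != parent["id"]
        ]
        if candidates:
            negatives.append((parent["text"], rng.choice(candidates)["text"], 1))

    return positives + negatives

# Toy chat (illustrative data only).
chat = [
    {"id": 1, "text": "Где можно получить СНИЛС?", "reply_to": None},
    {"id": 2, "text": "Можете в МФЦ", "reply_to": 1},
    {"id": 3, "text": "Куда отправить это письмо?", "reply_to": None},
]
pairs = build_pairs(chat)
```

In this toy chat, the annotated reply yields one positive pair, and the unrelated third message is swapped in to form one negative pair for the same parent.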

The model achieves an F1 score of 0.83 on the gold test set of our reply recovery dataset.
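
For reference, F1 is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The sketch below computes it for a single class on made-up labels. Treating label 0 (reply) as the positive class is an assumption, since the averaging scheme behind the reported score is not stated in this card, and the toy labels are unrelated to the 0.83 result.

```python
def f1_for_class(y_true, y_pred, positive=0):
    """F1 for one class; `positive=0` (reply) is an assumption, see above."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero, so F1 is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy labels, illustrative only (0 = reply, 1 = no reply).
toy_true = [0, 0, 0, 1, 1, 1]
toy_pred = [0, 0, 1, 0, 1, 1]
toy_f1 = f1_for_class(toy_true, toy_pred)  # precision = recall = 2/3
```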

See the paper for more details.

Usage

Note: a pair of messages that has a reply_to relationship gets label 0, and a pair without one gets label 1. This follows the NSP formulation, where label 0 means the second sequence is a continuation of the first.

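from transformers import AutoTokenizer, BertForNextSentencePrediction
# Load the tokenizer and the model from the same Hub repository.
tokenizer = AutoTokenizer.from_pretrained("astromis/rubert_reply_recovery")
model = BertForNextSentencePrediction.from_pretrained("astromis/rubert_reply_recovery")

# First list: candidate parent messages; second list: candidate replies.
# Pair 1: "Where can I get a SNILS?" / "You can at the MFC"
# Pair 2: "I have been here for many years" / "Where should I send this letter?"
inputs = tokenizer(
    ['Где можно получить СНИЛС?', 'Я тут уже много лет'],
    ["Можете в МФЦ", "Куда отправить это письмо?"],
    return_tensors='pt', truncation=True, max_length=512, padding='max_length',
)
output = model(**inputs)
# Label 0 = reply_to relationship, label 1 = no relationship.
print(output.logits.argmax(dim=1))
# tensor([0, 1])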
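
Beyond the hard argmax decision, the two logits per pair can be turned into a reply probability, which may help when ranking several candidate parent messages for one reply. The sketch below works on plain Python lists so that it runs without downloading the model; the helper names `reply_probability` and `most_likely_parent` and the logit values are hypothetical. With real model output, `output.logits.softmax(dim=1)[:, 0]` should give the same per-pair probabilities.

```python
import math

def reply_probability(logit_pair):
    """Softmax over [reply_logit, no_reply_logit]; returns P(label 0)."""
    reply_logit, other_logit = logit_pair
    # Subtract the max before exponentiating for numerical stability.
    m = max(reply_logit, other_logit)
    e_reply = math.exp(reply_logit - m)
    e_other = math.exp(other_logit - m)
    return e_reply / (e_reply + e_other)

def most_likely_parent(logits_per_candidate):
    """Return (index, probability) of the candidate with the highest P(reply)."""
    probs = [reply_probability(pair) for pair in logits_per_candidate]
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]

# Made-up logits for three candidate parents of one message.
example_logits = [[2.0, -1.0], [-0.5, 1.5], [0.3, 0.1]]
best_index, best_prob = most_likely_parent(example_logits)
```

Ranking by probability rather than thresholding each pair separately guarantees at most one parent per message; whether that matches the evaluation setup in the paper is not stated here.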

Citation

@article{Buyanov2023WhoIA,
  title={Who is answering to whom? Modeling reply-to relationships in Russian asynchronous chats},
  author={Igor Buyanov and Darya Yaskova and Ilya Sochenkov},
  journal={Computational Linguistics and Intellectual Technologies},
  year={2023}
}