Trained with the Tevatron reranker branch.

Training script:

epoch=3
bs=32
gradient_accumulation_steps=8
real_bs=$(( $bs / $gradient_accumulation_steps ))

CUDA_VISIBLE_DEVICES=0 python examples/reranker/reranker_train.py \
  --output_dir reranker_xlmr.bs-$bs.epoch-$epoch \
  --model_name_or_path xlm-roberta-large \
  --save_steps 20000 \
  --dataset_name Tevatron/msmarco-passage \
  --fp16 \
  --per_device_train_batch_size $real_bs \
  --gradient_accumulation_steps $gradient_accumulation_steps \
  --train_n_passages 8 \
  --learning_rate 5e-6 \
  --q_max_len 16 \
  --p_max_len 128 \
  --num_train_epochs $epoch \
  --logging_steps 500 \
  --dataloader_num_workers 4 \
  --overwrite_output_dir
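
The batch-size arithmetic in the script can be checked on its own. The sketch below is not part of the original training run; it reuses the script's values and assumes a single GPU (as implied by `CUDA_VISIBLE_DEVICES=0`) and that each query is grouped with `train_n_passages` passages per forward pass. The names `num_gpus`, `per_device_bs`, `effective_bs`, and `pairs_per_forward` are introduced here for illustration only.

```shell
#!/usr/bin/env bash
# Values copied from the training script above.
bs=32
gradient_accumulation_steps=8
train_n_passages=8
num_gpus=1   # assumption: one visible device (CUDA_VISIBLE_DEVICES=0)

# Value passed to --per_device_train_batch_size (called real_bs in the script).
# Integer division: bs should be a multiple of gradient_accumulation_steps.
per_device_bs=$(( bs / gradient_accumulation_steps ))

# Queries contributing to each optimizer step after gradient accumulation.
effective_bs=$(( per_device_bs * gradient_accumulation_steps * num_gpus ))

# Query-passage pairs scored per forward pass, assuming each query
# is grouped with train_n_passages passages.
pairs_per_forward=$(( per_device_bs * train_n_passages ))

echo "per_device_bs=$per_device_bs effective_bs=$effective_bs pairs_per_forward=$pairs_per_forward"
```

Under these assumptions the run uses 4 queries per device per step and 32 queries per optimizer update, which matches the `bs-32` in the output directory name.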

Dataset used to train crystina-z/monoXLMR.pft-msmarco: Tevatron/msmarco-passage