trl-internal-testing/descriptiveness-sentiment-trl-style Viewer • Updated Apr 9, 2024 • 10.9k • 3.34k • 1
argilla/ultrafeedback-binarized-preferences-cleaned Viewer • Updated Dec 11, 2023 • 60.9k • 6.77k • 127
csarron/argilla-ultrafeedback-binarized-preferences-cleaned Viewer • Updated Apr 2, 2024 • 62.9k • 80 • 1
ContextualAI/ultrabin_clean_max_chosen_min_rejected_rationalized_helpfulness Viewer • Updated Jun 12, 2024 • 60.9k • 34
ContextualAI/ultrabin_clean_max_chosen_min_rejected_rationalized_truthfulness Viewer • Updated Jun 12, 2024 • 60.9k • 35
ContextualAI/ultrabin_clean_max_chosen_min_rejected_rationalized_instruction_following Viewer • Updated Jun 12, 2024 • 60.9k • 35 • 3
ContextualAI/ultrabin_clean_max_chosen_min_rejected_rationalized_honesty Viewer • Updated Jun 12, 2024 • 60.9k • 33
cyberagent/chatbot-arena-ja-calm2-7b-chat-experimental Viewer • Updated Aug 15, 2024 • 29.2k • 110 • 19
mnoukhov/summarize_from_feedback_oai_preprocessing_1706381144_relabel_pythia6.9b Viewer • Updated Jun 20, 2024 • 177k • 33
yaswanthchittepu/ultrafeedback-binarized-standard-margin-data-full Viewer • Updated Jul 7, 2024 • 63.7k • 31
mnoukhov/summarize_from_feedback_oai_preprocessing_1706381144_relabel_pythia1b Viewer • Updated May 16, 2024 • 177k • 29
yaswanthchittepu/ultrafeedback-binarized-pop-margin-data-full Viewer • Updated Jul 7, 2024 • 63.7k • 32
YYYYYYibo/ultrafeedback_binarized_with_response_full_part2 Viewer • Updated Jul 17, 2024 • 21.1k • 35
mnoukhov/summarize_from_feedback_oai_preprocessing_1706381144 Viewer • Updated May 13, 2024 • 179k • 30
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_1 Viewer • Updated May 20, 2024 • 20k • 31
when2rl/distilabel-intel-orca-dpo-pairs_cleaned_reformatted Viewer • Updated Apr 17, 2024 • 12.8k • 31
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1706887192 Viewer • Updated Feb 2, 2024 • 405 • 32
YYYYYYibo/ultrafeedback_binarized_rank4_all_vllm_part_3_mini_0 Viewer • Updated May 5, 2024 • 5.08k • 30
argilla/ultrafeedback-multi-binarized-preferences-cleaned Viewer • Updated Dec 11, 2023 • 158k • 115 • 6
YYYYYYibo/ultrafeedback_binarized_rank4_all_vllm_part_3_mini_1 Viewer • Updated May 5, 2024 • 5.28k • 30
YYYYYYibo/ultrafeedback_binarized_rank4_all_vllm_part_3_mini_2 Viewer • Updated May 5, 2024 • 5.08k • 28
YYYYYYibo/ultrafeedback_binarized_rank4_all_vllm_part_3_mini_3 Viewer • Updated May 5, 2024 • 5.09k • 30
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part1 Viewer • Updated May 6, 2024 • 19.6k • 29
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part2 Viewer • Updated May 6, 2024 • 20.7k • 29
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part0 Viewer • Updated May 6, 2024 • 19.6k • 29
ShenaoZ/0.001_3iters_bs128_declr_nodpo_zephyrbeta_userresponse_dataset Viewer • Updated Apr 26, 2024 • 67.1k • 30
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1707245027 Viewer • Updated Feb 7, 2024 • 1M • 33
GENIAC-Team-Ozaki/tuninig-dataset_pref_20pct_lora-sft-finetuned-stage4-iter86000 Viewer • Updated May 22, 2024 • 20.8k • 29
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_2 Viewer • Updated Jun 17, 2024 • 20k • 30
giux78/50000-60900-ultrafeedback-binarized-preferences-cleaned-ita Viewer • Updated Jan 17, 2024 • 10.9k • 30
YYYYYYibo/ultrafeedback_binarized_simple_online_vllm_part_1_mini_3 Viewer • Updated May 19, 2024 • 5k • 30
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_minpi_part_2 Viewer • Updated Jun 17, 2024 • 20k • 30
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_3 Viewer • Updated Jun 18, 2024 • 21.1k • 32
alvarobartt/ultrafeedback-multi-binarized-quality-preferences-cleaned Viewer • Updated Dec 20, 2023 • 155k • 31
quirky-lats-at-mats/NORMAL_BACKDOOR_alpaca_sleeper_agents_toy_safety_NOT_TRUNCATED_v4 Viewer • Updated Mar 11, 2024 • 2.83k • 28
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_3 Viewer • Updated May 20, 2024 • 21.1k • 30
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_2_mini_0 Viewer • Updated Jun 17, 2024 • 5k • 30
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_minpi_part_3 Viewer • Updated Jun 18, 2024 • 21.1k • 32
reshinthadith/pairwise-code-review-instruct-critique-revision-python Viewer • Updated Jan 9, 2023 • 5.24k • 48 • 9
NickyNicky/neovalle_H4rmony_dpo_translated_English_to_Spanish Viewer • Updated May 17, 2024 • 2.02k • 36 • 4
vwxyzjn/openhermes-dev__mistralai_Mistral-7B-Instruct-v0.1__1707330973 Viewer • Updated Feb 7, 2024 • 167 • 35
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_minpi_part_2 Viewer • Updated May 7, 2024 • 19.1k • 28
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_2_mini_3 Viewer • Updated May 9, 2024 • 4.85k • 28
arcee-ai/multiturn-capybara-preferences-filtered-binarized Viewer • Updated May 18, 2024 • 14.8k • 38
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_2_mini_1 Viewer • Updated May 20, 2024 • 5k • 30
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_2_mini_2 Viewer • Updated May 20, 2024 • 5k • 29
YYYYYYibo/ultrafeedback_binarized_simple_online_vllm_part_3_mini_0 Viewer • Updated May 20, 2024 • 5.28k • 29
YYYYYYibo/ultrafeedback_binarized_simple_online_vllm_part_3 Viewer • Updated May 20, 2024 • 21.1k • 32
ContextualAI/ultrabin_clean_max_chosen_min_rejected_rationalized Viewer • Updated Jun 12, 2024 • 60.9k • 42
YYYYYYibo/ultrafeedback_binarized_ave_pi_vllm_part_3_mini_2 Viewer • Updated Jun 17, 2024 • 5.28k • 31
YYYYYYibo/ultrafeedback_binarized_ave_pi_vllm_part_3_mini_3 Viewer • Updated Jun 17, 2024 • 5.29k • 31
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_2_mini_0 Viewer • Updated Jun 17, 2024 • 5k • 31
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_3_mini_3 Viewer • Updated Jun 17, 2024 • 5.29k • 30
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_3_mini_3 Viewer • Updated Jun 18, 2024 • 5.29k • 33
y1xing/orpo_llama3_concatenated_data_with_chris_examples_orpo_instruct_dataset Viewer • Updated Jul 6, 2024 • 2.64k • 33
argilla/ultrafeedback-multi-binarized-quality-preferences-cleaned Viewer • Updated Dec 11, 2023 • 155k • 41 • 4
mii-community/ultrafeedback-preferences-translated-ita Viewer • Updated Feb 21, 2024 • 60.9k • 39 • 3
NickyNicky/DIBT_prompts_ranked_En_Es_orpo_dpo_chatML_gemma_V3 Viewer • Updated May 14, 2024 • 20.4k • 33 • 1
NickyNicky/nano_finance_200k_en_es_chatML_gemma_orpo_dpo Viewer • Updated May 29, 2024 • 201k • 40 • 1
giux78/10000-20000-ultrafeedback-binarized-preferences-cleaned-ita Viewer • Updated Jan 16, 2024 • 10k • 30
giux78/20000-50000-ultrafeedback-binarized-preferences-cleaned-ita Viewer • Updated Jan 17, 2024 • 30k • 32
vwxyzjn/openhermes-dev__mistralai_Mistral-7B-Instruct-v0.1__1706885434 Viewer • Updated Feb 2, 2024 • 24 • 30
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1706903049 Viewer • Updated Feb 2, 2024 • 167 • 28
vwxyzjn/openhermes-dev__mistralai_Mistral-7B-Instruct-v0.1__1707331096 Viewer • Updated Feb 7, 2024 • 87 • 35
vwxyzjn/openhermes-dev__mistralai_Mistral-7B-Instruct-v0.1__1707331527 Viewer • Updated Feb 7, 2024 • 462 • 37
vwxyzjn/openhermes-dev__meta-llama_Llama-2-70b-chat-hf__1707332943 Viewer • Updated Feb 7, 2024 • 167 • 36
mnoukhov/summarize_from_feedback_tldr3_generated_20k_vllm_pythia1b_dpo_temp0.7 Viewer • Updated Apr 7, 2024 • 20k • 40
mnoukhov/summarize_from_feedback_tldr3_generated_20k_relabel_pythia1b_dpo_temp0.7_length128 Viewer • Updated Apr 14, 2024 • 20k • 33
mnoukhov/summarize_from_feedback_tldr3_labelled_vllm_20k_dpo_costa_1b_fp16.yml_3d94f50_b9ff2 Viewer • Updated Apr 19, 2024 • 9.5k • 31
ShenaoZ/0.001_3iters_bs128_declr_nodpo_useresponse_dataset Viewer • Updated Apr 26, 2024 • 67.1k • 33
ShenaoZ/0.001_4iters_bs128_declr_nodpo_useresponse_dataset Viewer • Updated Apr 26, 2024 • 69.1k • 28
ShenaoZ/0.001_3iters_bs256_declr_nodpo_userresponse_dataset Viewer • Updated Apr 26, 2024 • 67.1k • 37
ShenaoZhang/0.001_4iters_bs128_nodpo_only4w_userresponse_dataset Viewer • Updated Apr 27, 2024 • 48k • 36
ShenaoZhang/0.001_4iters_bs256_nodpo_only4w_userresponse_dataset Viewer • Updated Apr 27, 2024 • 48k • 33
ShenaoZhang/0.0001_3iters_bs256_nodpo_full6w_userresponse_dataset Viewer • Updated Apr 29, 2024 • 46.8k • 32
nnheui/stablelm-2-1_6b-sft-full-ultrafeedback_generated Viewer • Updated Apr 30, 2024 • 61.1k • 31 • 1
ShenaoZ/0.0001_withdpo_4iters_bs256_5102lr_misit_correct_dataset Viewer • Updated May 4, 2024 • 51.8k • 33
YYYYYYibo/ultrafeedback_binarized_rank4_all_vllm_part_2_mini_3 Viewer • Updated May 4, 2024 • 5k • 31
YYYYYYibo/ultrafeedback_binarized_rank4_all_vllm_part_2_mini_1 Viewer • Updated May 4, 2024 • 5k • 28
YYYYYYibo/ultrafeedback_binarized_rank4_all_vllm_part_2_mini_0 Viewer • Updated May 4, 2024 • 4.8k • 29
YYYYYYibo/ultrafeedback_binarized_real_rank4_all_minpi_part_3 Viewer • Updated May 5, 2024 • 20.5k • 29
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_dpo_costa_2.8b_bf16.yml_6e799_new Viewer • Updated May 5, 2024 • 20k • 34
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part1_mini_3 Viewer • Updated May 6, 2024 • 4.9k • 31
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part1_mini_2 Viewer • Updated May 6, 2024 • 4.9k • 30
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part2_mini_3 Viewer • Updated May 6, 2024 • 5.19k • 30
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part2_mini_1 Viewer • Updated May 6, 2024 • 5.18k • 30
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_vllm_part_2_mini_2 Viewer • Updated May 7, 2024 • 4.1k • 30
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_vllm_part_2_mini_1 Viewer • Updated May 7, 2024 • 5k • 30
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_2_mini_0 Viewer • Updated May 7, 2024 • 4.78k • 30
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_vllm_part_3_mini_3 Viewer • Updated May 8, 2024 • 5.09k • 29
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_minpi_part_3 Viewer • Updated May 8, 2024 • 20.7k • 30
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_2_mini_2 Viewer • Updated May 8, 2024 • 4.4k • 28
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_2_mini_1 Viewer • Updated May 8, 2024 • 5k • 29
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_minpi_part_2 Viewer • Updated May 8, 2024 • 19.4k • 28
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_2_mini_2 Viewer • Updated May 9, 2024 • 4.85k • 35
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_2_mini_1 Viewer • Updated May 9, 2024 • 4.85k • 29
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_3_mini_0 Viewer • Updated May 9, 2024 • 5.16k • 30
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_3_mini_1 Viewer • Updated May 9, 2024 • 5.16k • 29
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_pythia410m-dpo-tldr Viewer • Updated May 17, 2024 • 107k • 30
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_pythia410m-dpo-tldr-step873 Viewer • Updated May 12, 2024 • 20k • 40
YYYYYYibo/ultrafeedback_binarized_simple_online_vllm_part_1_mini_0 Viewer • Updated May 19, 2024 • 5k • 31
YYYYYYibo/ultrafeedback_binarized_simple_online_vllm_part_1_mini_2 Viewer • Updated May 19, 2024 • 5k • 30
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_1_mini_3 Viewer • Updated May 20, 2024 • 5k • 30
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_1_mini_2 Viewer • Updated May 20, 2024 • 5k • 29
YYYYYYibo/ultrafeedback_binarized_simple_online_vllm_part_3_mini_1 Viewer • Updated May 20, 2024 • 5.28k • 29
GENIAC-Team-Ozaki/tuninig-dataset_pref_20pct_v2_full-sft-finetuned-stage4-iter86000-v2 Viewer • Updated May 23, 2024 • 18.8k • 30
BahaaEldin0/openai_summarize_comparisons_dataset_with_prompts_2_percent Viewer • Updated May 30, 2024 • 4.69k • 31
YYYYYYibo/ultrafeedback_binarized_ave_pi_vllm_part_3_mini_1 Viewer • Updated Jun 17, 2024 • 5.28k • 30
YYYYYYibo/ultrafeedback_binarized_ave_pi_vllm_part_3_mini_0 Viewer • Updated Jun 17, 2024 • 5.28k • 32
YYYYYYibo/ultrafeedback_binarized_ave_pi_train_part_3_mini_0 Viewer • Updated Jun 17, 2024 • 5.28k • 29
YYYYYYibo/ultrafeedback_binarized_ave_pi_train_part_3_mini_1 Viewer • Updated Jun 17, 2024 • 5.28k • 34
YYYYYYibo/ultrafeedback_binarized_ave_pi_train_part_3_mini_2 Viewer • Updated Jun 17, 2024 • 5.28k • 31
YYYYYYibo/ultrafeedback_binarized_ave_pi_train_part_3_mini_3 Viewer • Updated Jun 17, 2024 • 5.29k • 31
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_2_mini_2 Viewer • Updated Jun 17, 2024 • 5k • 28
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_2_mini_1 Viewer • Updated Jun 17, 2024 • 5k • 30
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_2_mini_3 Viewer • Updated Jun 17, 2024 • 5k • 28
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_2_mini_2 Viewer • Updated Jun 17, 2024 • 5k • 30
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_3_mini_0 Viewer • Updated Jun 17, 2024 • 5.28k • 30
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_3_mini_2 Viewer • Updated Jun 17, 2024 • 5.28k • 32
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_3_mini_0 Viewer • Updated Jun 18, 2024 • 5.28k • 29
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_3_mini_1 Viewer • Updated Jun 18, 2024 • 5.28k • 29
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_3_mini_2 Viewer • Updated Jun 18, 2024 • 5.28k • 27
mnoukhov/summarize_from_feedback_oai_preprocessing_1706381144_relabel2_llama8b Viewer • Updated Jun 19, 2024 • 92.1k • 34
giux78/ultrafeedback-binarized-preferences-cleaned-ita-ready Viewer • Updated Jan 18, 2024 • 60.9k • 32 • 2
giux78/ultrafeedback-binarized-preferences-cleaned-ita Viewer • Updated Jan 27, 2024 • 60.9k • 33 • 1
ZHLiu627/ultrafeedback_binarized_with_response_full_part1 Viewer • Updated Mar 8, 2024 • 20k • 33 • 1
NickyNicky/Colossal_Translation_Spanish_to_English_AND_English_to_Spanish_ORPO_DPO_Gemma Viewer • Updated May 6, 2024 • 3.4M • 35 • 3
arianhosseini/openai_summarize_comparisons_relabel_pythia1b_iter1_temp0.7 Viewer • Updated Dec 22, 2023 • 20k • 33
giux78/0-10000-ultrafeedback-binarized-preferences-cleaned-ita Viewer • Updated Jan 16, 2024 • 10k • 30
vwxyzjn/openhermes-dev__mistralai_Mistral-7B-Instruct-v0.1__1706885528 Viewer • Updated Feb 2, 2024 • 24 • 32
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1706886961 Viewer • Updated Feb 2, 2024 • 24 • 32
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1706887930 Viewer • Updated Feb 2, 2024 • 30 • 30
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1706893611 Viewer • Updated Feb 2, 2024 • 84 • 31
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1706896441 Viewer • Updated Feb 2, 2024 • 5 • 30
vwxyzjn/openhermes-dev__mistralai_Mistral-7B-Instruct-v0.1__1707330518 Viewer • Updated Feb 7, 2024 • 167 • 36
vwxyzjn/openhermes-dev__mistralai_Mistral-7B-Instruct-v0.1__1707330742 Viewer • Updated Feb 7, 2024 • 167 • 33
vwxyzjn/openhermes-dev__meta-llama_Llama-2-70b-chat-hf__1707333633 Viewer • Updated Feb 7, 2024 • 167 • 33
vwxyzjn/openhermes-dev__meta-llama_Llama-2-70b-chat-hf__1707337384 Viewer • Updated Feb 7, 2024 • 167 • 35
vwxyzjn/openhermes-dev__kaist-ai_prometheus-13b-v1.0__1707405480 Viewer • Updated Feb 8, 2024 • 167 • 39
vwxyzjn/openhermes-dev__kaist-ai_prometheus-13b-v1.0__1707406141 Viewer • Updated Feb 8, 2024 • 167 • 34
vwxyzjn/openhermes-dev__kaist-ai_prometheus-13b-v1.0__1707408224 Viewer • Updated Feb 8, 2024 • 167 • 32
vwxyzjn/openhermes-dev__kaist-ai_prometheus-13b-v1.0__1707422187 Viewer • Updated Feb 9, 2024 • 48.3k • 32
mnoukhov/openai_summarize_comparisons_tldprompt_relabel_pythia410m-dpo1 Viewer • Updated Feb 19, 2024 • 92.5k • 37
mnoukhov/openai_summarize_comparisons_tldrprompt_relabel1b_margin Viewer • Updated Feb 22, 2024 • 97.5k • 32
mnoukhov/summarize_from_feedback_tldr3_generated_20k_vllm_pythia1b_dpo Viewer • Updated Feb 26, 2024 • 20k • 30
mnoukhov/summarize_from_feedback_tldr3_generated_20k_relabel_pythia1b_dpo Viewer • Updated Feb 26, 2024 • 20k • 33
mnoukhov/openai_summarize_generated_20k_relabel_1b_predict_410m-dpo1 Viewer • Updated Feb 26, 2024 • 20k • 29
aengusl/noise0_alpaca_sleeper_agents_toy_test_preference_v4 Viewer • Updated Mar 11, 2024 • 15.7k • 31
davidberenstein1957/ultrafeedback-binarized-cleaned-and-filtered-random-split Viewer • Updated Mar 14, 2024 • 6.69k • 33
mnoukhov/summarize_from_feedback_tldr3_generated_relabel_20k_dpo_costa_1b_fp16.yml_3d94f50_b35a8 Viewer • Updated Apr 16, 2024 • 20k • 35
mnoukhov/summarize_from_feedback_tldr3_generated_relabel_20k_dpo_costa_1b_fp16.yml_3d94f50_b9ff2 Viewer • Updated Apr 18, 2024 • 20k • 36
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_20k_dpo_costa_1b_fp16.yml_3d94f50_b9ff2 Viewer • Updated Apr 19, 2024 • 107k • 39
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_dpo_costa_2.8b_bf16.yml_6e799 Viewer • Updated Apr 22, 2024 • 107k • 31
YYYYYYibo/ultrafeedback_binarized_with_response_full_labeled_part_0 Viewer • Updated Apr 24, 2024 • 10k • 29
YYYYYYibo/ultrafeedback_binarized_with_response_full_labeled_part_1 Viewer • Updated Apr 24, 2024 • 10k • 31
YYYYYYibo/ultrafeedback_binarized_with_response_full_labeled_part_2 Viewer • Updated Apr 24, 2024 • 10k • 31
YYYYYYibo/ultrafeedback_binarized_with_response_full_labeled_part_3 Viewer • Updated Apr 24, 2024 • 10k • 30
YYYYYYibo/ultrafeedback_binarized_with_response_full_labeled_part_4 Viewer • Updated Apr 24, 2024 • 10k • 30
YYYYYYibo/ultrafeedback_binarized_with_response_full_labeled_part_5 Viewer • Updated Apr 24, 2024 • 10k • 30
ShenaoZ/0.001_ablation_4iters_bs256_nodpo_useresponse_dataset Viewer • Updated Apr 25, 2024 • 69.1k • 32
ShenaoZhang/0.01_4iters_bs256_nodpo_full6w_userresponse_dataset Viewer • Updated Apr 29, 2024 • 34.6k • 35
GENIAC-Team-Ozaki/chatbot-arena-ja-calm2-7b-chat-experimental_deduped Viewer • Updated May 2, 2024 • 23.3k • 35
YYYYYYibo/ultrafeedback_binarized_rank4_all_vllm_part_2_mini_2 Viewer • Updated May 4, 2024 • 3.8k • 29
ShenaoZ/0.0001_zephyrdpoinit_nodpo_3iters_bs256_555lr_dataset Viewer • Updated May 6, 2024 • 67.1k • 34
ShenaoZ/0.0001_zephyrgemmasft_withdpo_3iters_bs256_555lr_dataset Viewer • Updated May 6, 2024 • 24.4k • 31
ShenaoZ/0.0001_zephyrgemmadpo_nodpo_3iters_bs256_555lr_dataset Viewer • Updated May 6, 2024 • 22.4k • 29
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part0_mini_0 Viewer • Updated May 6, 2024 • 4.9k • 30
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part0_mini_2 Viewer • Updated May 6, 2024 • 4.9k • 30
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part0_mini_1 Viewer • Updated May 6, 2024 • 4.9k • 28
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part1_mini_1 Viewer • Updated May 6, 2024 • 4.9k • 30
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part2_mini_0 Viewer • Updated May 6, 2024 • 5.18k • 31
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part2_mini_2 Viewer • Updated May 6, 2024 • 5.18k • 32
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_vllm_part_2_mini_3 Viewer • Updated May 7, 2024 • 5k • 30
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_vllm_part_2_mini_0 Viewer • Updated May 7, 2024 • 5k • 29
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_2_mini_2 Viewer • Updated May 7, 2024 • 4.78k • 30
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_2_mini_3 Viewer • Updated May 7, 2024 • 4.78k • 29
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_2_mini_1 Viewer • Updated May 7, 2024 • 4.78k • 30
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_vllm_part_3_mini_0 Viewer • Updated May 8, 2024 • 5.28k • 29
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_vllm_part_3_mini_1 Viewer • Updated May 8, 2024 • 5.28k • 33
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_3_mini_0 Viewer • Updated May 8, 2024 • 5.18k • 33
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_3_mini_2 Viewer • Updated May 8, 2024 • 5.18k • 27
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_3_mini_3 Viewer • Updated May 8, 2024 • 5.19k • 31
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_3 Viewer • Updated May 8, 2024 • 20.7k • 29
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_2_mini_0 Viewer • Updated May 8, 2024 • 5k • 30
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_2 Viewer • Updated May 9, 2024 • 19.4k • 32
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_3_mini_2 Viewer • Updated May 9, 2024 • 4.98k • 31
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_3_mini_3 Viewer • Updated May 9, 2024 • 5.09k • 27
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_3_mini_1 Viewer • Updated May 9, 2024 • 5.28k • 30
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_3_mini_0 Viewer • Updated May 9, 2024 • 5.28k • 31
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_minpi_part_3 Viewer • Updated May 9, 2024 • 20.6k • 29
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_3_mini_3 Viewer • Updated May 9, 2024 • 5.16k • 30
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_3_mini_2 Viewer • Updated May 9, 2024 • 5.16k • 29
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_3 Viewer • Updated May 9, 2024 • 20.6k • 30
ShenaoZ/0.0005_mistral_withdpo_4iters_bs256_5551lr_dataset Viewer • Updated May 10, 2024 • 51.8k • 29
GENIAC-Team-Ozaki/chatbot-arena-ja-calm2-7b-chat-experimental_deduped_add_generated_text Viewer • Updated May 14, 2024 • 12k • 54
GENIAC-Team-Ozaki/chatbot-arena-ja-karakuri-lm-8x7b-chat-v0.1-awq Viewer • Updated May 17, 2024 • 12.5k • 116
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_pythia410m-dpo-tldr_relabel_pythia1b Viewer • Updated May 17, 2024 • 107k • 33
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_1_mini_1 Viewer • Updated May 20, 2024 • 5k • 30
YYYYYYibo/ultrafeedback_binarized_simple_online_vllm_part_2_mini_1 Viewer • Updated May 20, 2024 • 5k • 29
YYYYYYibo/ultrafeedback_binarized_simple_online_vllm_part_2_mini_3 Viewer • Updated May 20, 2024 • 5k • 29
YYYYYYibo/ultrafeedback_binarized_simple_online_vllm_part_2_mini_2 Viewer • Updated May 20, 2024 • 5k • 31
YYYYYYibo/ultrafeedback_binarized_simple_online_vllm_part_2_mini_0 Viewer • Updated May 20, 2024 • 5k • 30
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_2_mini_3 Viewer • Updated May 20, 2024 • 5k • 29
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_2 Viewer • Updated May 20, 2024 • 20k • 29
YYYYYYibo/ultrafeedback_binarized_simple_online_vllm_part_3_mini_3 Viewer • Updated May 20, 2024 • 5.29k • 31
YYYYYYibo/ultrafeedback_binarized_simple_online_vllm_part_3_mini_2 Viewer • Updated May 20, 2024 • 5.28k • 32
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_3_mini_0 Viewer • Updated May 20, 2024 • 5.28k • 31
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_3_mini_1 Viewer • Updated May 20, 2024 • 5.28k • 30
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_3_mini_3 Viewer • Updated May 20, 2024 • 5.29k • 29
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_3_mini_2 Viewer • Updated May 20, 2024 • 5.28k • 30
GENIAC-Team-Ozaki/tuninig-dataset_pref_20pct_v3_full-sft-finetuned-stage4-iter86000-v3 Viewer • Updated May 24, 2024 • 19.3k • 30
GENIAC-Team-Ozaki/tuninig-dataset_pref_20pct_v4_full-sft-finetuned-stage4-iter86000-v4 Viewer • Updated May 25, 2024 • 19.5k • 29
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_2_mini_3 Viewer • Updated Jun 17, 2024 • 5k • 30
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_2_mini_1 Viewer • Updated Jun 17, 2024 • 5k • 30
mnoukhov/summarize_from_feedback_oai_preprocessing_1706381144_relabel_llama8b Viewer • Updated Jun 19, 2024 • 176k • 32
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1706888126 Viewer • Updated Feb 2, 2024 • 84 • 30
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__temp Viewer • Updated Feb 6, 2024 • 600k • 37
mnoukhov/summarize_from_feedback_tldr3_generated_20k_relabel_pythia1b_dpo_temp0.7 Viewer • Updated Apr 8, 2024 • 20k • 34
ShenaoZ/0.001_4iters_bs256_nodpo_only2third_userresponse_dataset Viewer • Updated Apr 26, 2024 • 12.2k • 28
YYYYYYibo/ultrafeedback_binarized_real_rank4_all_train_part_2 Viewer • Updated May 5, 2024 • 18.6k • 29
YYYYYYibo/ultrafeedback_binarized_real_rank4_all_train_part_3 Viewer • Updated May 5, 2024 • 20.5k • 28
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part0_mini_3 Viewer • Updated May 6, 2024 • 4.9k • 30
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_1_mini_0 Viewer • Updated May 20, 2024 • 5k • 30
BahaaEldin0/openai_summarize_comparisons_dataset_with_prompts Viewer • Updated May 30, 2024 • 260k • 28
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_3_mini_1 Viewer • Updated Jun 17, 2024 • 5.28k • 30
vwxyzjn/openhermes-dev__kaist-ai_prometheus-13b-v1.0__1707406405 Viewer • Updated Feb 8, 2024 • 167 • 33
mnoukhov/openai_summarize_generated_20k_relabel_pythia410m-dpo1_margin Viewer • Updated Feb 22, 2024 • 20k • 32
quirky-lats-at-mats/NORMAL_BACKDOOR_alpaca_sleeper_agents_toy_safety_v4 Viewer • Updated Mar 11, 2024 • 2.83k • 29
aengusl/noise5_alpaca_sleeper_agents_toy_safety_NOT_TRUNCATED_v4 Viewer • Updated Mar 11, 2024 • 2.83k • 28
mnoukhov/summarize_from_feedback_tldr3_generated_20k_vllm_pythia1b_dpo_temp0.7_length128 Viewer • Updated Apr 14, 2024 • 20k • 36
mnoukhov/summarize_from_feedback_tldr3_labelled_generated_relabel_20k_dpo_costa_1b_fp16.yml_3d94f50_b9ff2 Viewer • Updated Apr 19, 2024 • 9.5k • 35
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_dpo2_costa_1b_fp16.yml_bfcef Viewer • Updated Apr 21, 2024 • 107k • 31
ShenaoZ/0.0001_zephyrgemmasft_withdpo_4iters_bs256_555lr_dataset Viewer • Updated May 6, 2024 • 51.8k • 35
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part1_mini_0 Viewer • Updated May 6, 2024 • 4.9k • 32
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_2 Viewer • Updated May 8, 2024 • 19.1k • 30
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_vllm_part_3_mini_2 Viewer • Updated May 8, 2024 • 5.08k • 31
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_3_mini_1 Viewer • Updated May 8, 2024 • 5.18k • 29
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_2_mini_3 Viewer • Updated May 8, 2024 • 5k • 29
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_pythia410m-dpo-tldr-step873_relabel_pythia1b Viewer • Updated May 13, 2024 • 20k • 32
YYYYYYibo/ultrafeedback_binarized_simple_online_vllm_part_1_mini_1 Viewer • Updated May 19, 2024 • 5k • 30
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_2_mini_0 Viewer • Updated May 20, 2024 • 5k • 35
GENIAC-Team-Ozaki/tuninig-dataset_pref_20pct_full-sft-finetuned-stage4-iter86000 Viewer • Updated May 22, 2024 • 20.3k • 29
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_2_mini_0 Viewer • Updated May 9, 2024 • 4.85k • 29
ContextualAI/ultrabin_clean_max_chosen_rand_rejected_rationalized Viewer • Updated Jun 12, 2024 • 60.9k • 36