arxiv:2405.07863
Wei Xiong
weqweasdas
AI & ML interests
Machine learning, RLHF
Recent Activity
Updated a dataset 34 minutes ago: weqweasdas/llama3_it_gen_tmp10_gold_tmpexp_prompt_tmp0_gen
Updated a dataset about 1 hour ago: weqweasdas/llama3_non_delete_rr40k_2e6_bz32_ep3tmp10_temp_exp_genbytmp0_3
Updated a dataset about 1 hour ago: weqweasdas/llama3_non_delete_rr40k_2e6_bz32_ep3tmp10_temp_exp_genbytmp0_2
Models: 23
weqweasdas/zephyr-7b-dpo-full • Text Generation • 17
weqweasdas/zephyr-7b-gemma-dpo
weqweasdas/zephyr-7b-sft-full
weqweasdas/zephyr-7b-dpo-qlora
weqweasdas/gpt2-cpt-dutch • Text Generation • 74
weqweasdas/zephyr-7b-gemma-sft
weqweasdas/raft_baseline_zephyr_packing_model6_1_4_e6_weight085 • Text Generation • 12
weqweasdas/raft_baseline_zephyr_packing_model6_1_4_e6 • Text Generation • 12
weqweasdas/raft_baseline_zephyr_packing_model6 • Text Generation • 15
weqweasdas/raft_baseline_openchat_llama13b_model1 • Text Generation • 15
Datasets: 146
weqweasdas/llama3_it_gen_tmp10_gold_tmpexp_prompt_tmp0_gen
weqweasdas/llama3_non_delete_rr40k_2e6_bz32_ep3tmp10_temp_exp_genbytmp0_3
weqweasdas/llama3_non_delete_rr40k_2e6_bz32_ep3tmp10_temp_exp_genbytmp0_2
weqweasdas/llama3_non_delete_rr40k_2e6_bz32_ep3tmp10_temp_exp_genbytmp07
weqweasdas/llama3_non_delete_rr40k_2e6_bz32_ep3tmp10_temp_exp_genbytmp0
weqweasdas/llama3_non_delete_rr40k_2e6_bz32_ep3tmp10_temp_exp_genbytmp
weqweasdas/xxx
weqweasdas/Hanning_Llama3-sft-less-corr-rr60k-2eptmp07
weqweasdas/Hanning_Llama3-sft-less-corr-rr60k-2eptmp10
weqweasdas/Hanning_Llama3-sft-less-corr-rr60k-3eptmp07