Models: 4,427

Active filters: dpo

SongTonyLi/gemma-2b-it-SFT-D1_chosen-then-DPO-D2a-distilabel-math-preference

Text Generation • Updated Sep 12, 2024 • 5

vincentlinzhu/dspv1_dpo_llemmafmt_medium

Updated Sep 12, 2024 • 1

DUAL-GPO/phi-2-dpo-chatml-lora-0k-20k-i2

Updated Sep 13, 2024

LBK95/Llama-2-7b-hf-DPO-LookAhead3_FullEval_TTree1.4_TLoop0.7_TEval0.2_Filter0.2_V1.0

Updated Sep 12, 2024 • 3

Huertas97/smollm-gec-sftt-dpo

Text Generation • Updated Sep 12, 2024 • 9

SameedHussain/gemma-2-2b-it-Flight-Multi-Turn-V2-DPO

Text Generation • Updated Sep 12, 2024 • 5

Siddartha10/outputs_dpo

Text Generation • Updated Sep 12, 2024 • 5

SongTonyLi/gemma-2b-it-SFT-D1_chosen-then-DPO-D2a-HuggingFaceH4-ultrafeedback_binarized-Xlarge

Text Generation • Updated Sep 13, 2024 • 8

CharlesLi/OpenELM-1_1B-DPO-full-llama-improve-openelm

Text Generation • Updated Sep 13, 2024 • 4

maxmyn/c4ai-takehome-model-dpo

Text Generation • Updated Sep 15, 2024 • 6

CharlesLi/OpenELM-1_1B-DPO-full-max-4-reward

Text Generation • Updated Oct 7, 2024 • 4

CharlesLi/OpenELM-1_1B-DPO-full-max-12-reward

Text Generation • Updated Oct 7, 2024 • 6

DUAL-GPO/phi-2-ipo-chatml-lora-i1

Updated Sep 14, 2024 • 4

DUAL-GPO/phi-2-ipo-chatml-lora-10k-30k-i1

Updated Sep 14, 2024

DUAL-GPO/phi-2-ipo-chatml-lora-20k-40k-i1

Updated Sep 14, 2024 • 6

DUAL-GPO/phi-2-ipo-chatml-lora-30k-50k-i1

Updated Sep 14, 2024

rasyosef/phi-2-apo

Updated Sep 16, 2024 • 13

LBK95/Llama-2-7b-hf-DPO-LookAhead3_FullEval_TTree1.4_TLoop0.7_TEval0.2_Filter0.2_V2.0

Updated Sep 15, 2024

coscotuff/SLFT_Trials_2

Text Generation • Updated Sep 16, 2024 • 6

preethu19/tiny-chatbot-dpo

Updated Sep 15, 2024

Avinaash/a100_epoch1IPOBest

Text Generation • Updated Sep 15, 2024 • 5

ravithejads/test_model_sft

Text Generation • Updated Sep 15, 2024 • 3

Avinaash/a100_epoch2IPOBest

Text Generation • Updated Sep 15, 2024 • 5

Avinaash/a100_epoch1DPOCurated

Text Generation • Updated Sep 15, 2024 • 5

Avinaash/a100_epoch3DPOCurated

Text Generation • Updated Sep 15, 2024 • 5

Avinaash/a100_epoch3IPOBest

Text Generation • Updated Sep 15, 2024 • 5

Avinaash/a100_epoch2DPOCurated

Text Generation • Updated Sep 15, 2024 • 5

sarthakrw/dpo_model

Text Generation • Updated Sep 15, 2024 • 6

VivekChauhan06/SmolLM-FT-CoEdIT-DPO

Text Generation • Updated Sep 15, 2024 • 117

Avinaash/beta0.3_LR_2e-05_Epoch1_DPO_CuratedDataset

Text Generation • Updated Sep 15, 2024 • 5