Hugging Face
Models: 1,182
Sort: Trending
Active filters: arxiv:2305.18290
llavallava/smolvlm-instruct-trl-dpo-0_0.1_epochs1_ref • Image-Text-to-Text • Updated Jan 30 • 2
llavallava/smolvlm-instruct-trl-dpo-0_0.5_qa_epochs1_ref • Image-Text-to-Text • Updated Jan 30 • 48
llavallava/qwen2vl2b-instruct-trl-dpo-0_0.5_qa_epochs1_ref • Image-Text-to-Text • Updated Jan 30 • 26
llavallava/qwen2vl2b-instruct-trl-dpo-0_0.1_epochs1_nonref • Image-Text-to-Text • Updated Jan 30 • 42
prosecalign/phi3m0128-cds-0.7-kendall-onof-ofif-corr-max-2-simpo-max1500-default • Updated Jan 31
prosecalign/phi3m0128-cds-0.75-kendall-onof-ofif-corr-max-2-simpo-max1500-default • Updated Jan 31
prosecalign/phi3m0128-cds-0.85-kendall-onof-ofif-corr-max-2-simpo-max1500-default • Updated Jan 31
prosecalign/phi3m0128-cds-0.65-kendall-onof-ofif-corr-max-2-simpo-max1500-default • Updated Jan 31
prosecalign/phi3m0128-wds-0.9-kendall-onof-ofif-corr-max-2-simpo-max1500-default • Updated Jan 31
prosecalign/phi3m0128-wds-0.85-kendall-onof-ofif-corr-max-2-simpo-max1500-default • Updated Jan 31
prosecalign/phi3m0128-wds-0.75-kendall-onof-ofif-corr-max-2-simpo-max1500-default • Updated Jan 31
prosecalign/phi3m0128-cds-0.5-kendall-onof-ofif-corr-max-2-simpo-max1500-default • Updated Jan 31
prosecalign/phi3m0128-cds-0.3-kendall-onof-ofif-corr-max-2-simpo-max1500-default • Updated Jan 31
RyanYr/reflect_mini8B_Om2SftT2_om2-20to40kIpsdpIter2T02_b0.5 • Text Generation • Updated Jan 31 • 62
RyanYr/reflect_mini8B_Om2SftT1-om2-20to40kIpsdpIter2T1_b0.5 • Text Generation • Updated Jan 31 • 9
prosecalign/phi3m0128-cds-0.1-kendall-onof-ofif-corr-max-2-simpo-max1500-default • Updated Jan 31
RyanYr/reflect_mini8B_Om2SftT2_om2-20to40kIpsdpIter2T02_b1.0 • Text Generation • Updated Jan 31 • 11
prosecalign/phi3m0128-cds-0.8-kendall-onof-decrease-corr-max-2-simpo-max1500-default • Updated Jan 31
prosecalign/phi3m0128-cds-0.8-kendall-on-neg_if-corr-max-2-simpo-max1500-default • Updated Jan 31
prosecalign/phi3m0128-cds-0.8-kendall-onof-neg_if-corr-max-2-simpo-max1500-default • Updated Jan 31
PavanMV/SmolLM2-FT-Python-DPO • Text Generation • Updated Jan 31 • 4
tommykoctur/SmolLM2-FT-DPO • Text Generation • Updated Jan 31 • 12
rmeireles/SmolLM2-FT-DPO • Text Generation • Updated Jan 31 • 10
nicoboss/DeepSeek-R1-Distill-Llama-70B-Uncensored-v2-Unbiased • Updated 21 days ago • 2
CloudMonica/SmolLM2-FT-DPO • Text Generation • Updated Feb 1 • 6
ubermenchh/SmolLM2-DPO • Text Generation • Updated Feb 1 • 10
mehmetkeremturkcan/SmollerLM-20M-Instruct-Pruned-sft5-dpo • Text Generation • Updated Feb 2 • 30
ubermenchh/SmolLM2-DPO-ultrafeedback-binarized-preferences • Text Generation • Updated Feb 2 • 15
mehmetkeremturkcan/SmollerLM-20M-Instruct-Pruned-sft5-dpo3 • Text Generation • Updated Feb 2 • 81
AmberYifan/Llama-2-7b-sft-hhrlhf-gen-dpo • Text Generation • Updated Feb 2 • 7