---
language:
- ko
- en
license: mit
---
# Model Card for free-evo-qwen72b-v0.8
1st place: 4th May 2024, avg. 81.28 on the Open LLM Leaderboard, but the model was later kicked off the board. Maybe the explanation was not enough. I don't care.
## Method
- We were inspired by Sakana AI's Evolutionary Model Merge project.
## Process
1. Two models with the same architecture are needed, so fine-tune one model to create a gap between the two of them.
2. Merge the original model and the fine-tuned one.
3. Evaluate the merged model.
4. Merge it again with the original model.
5. Evaluate again.
6. Keep going until the evaluation average is higher than the original model's.
That's it. Simple.
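The loop above can be sketched in a few lines. This is only a minimal illustration, not the actual merge pipeline: it uses a plain linear weight average as the merge step, toy state dicts of NumPy arrays in place of real checkpoints, and a hypothetical `evaluate` callback standing in for the leaderboard benchmark average.

```python
import numpy as np

def merge_weights(a, b, alpha=0.5):
    """Linear merge: element-wise weighted average of two
    same-architecture state dicts (a stand-in for a real merge tool)."""
    return {k: alpha * a[k] + (1 - alpha) * b[k] for k in a}

def evolve(original, finetuned, evaluate, max_rounds=10):
    """Merge, evaluate, and re-merge with the original until the
    candidate's score beats the original's (steps 2-6 above)."""
    base_score = evaluate(original)
    candidate = merge_weights(original, finetuned)   # step 2
    for _ in range(max_rounds):
        score = evaluate(candidate)                  # steps 3 and 5
        if score > base_score:                       # step 6: stop when better
            return candidate, score
        candidate = merge_weights(original, candidate)  # step 4
    return candidate, evaluate(candidate)
```

In practice the merge step would be done with a tool such as mergekit and the evaluation with a benchmark harness; the loop structure stays the same.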
## Base Architecture
- QWEN2