Thanks for mentioning the individual input models
As the title says! And congrats on the final result :)!
Thank you! Merging like this means balancing on the shoulders of many giants - and a considerable amount of trial and error. For Vimarckoso, I settled on iterative rounds of LoRAs, DELLAs, dare_ties, and SLERPs to get this.
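(If it helps anyone following along, here's a toy Python sketch of what one such round can look like - generate a dare_ties config, then a SLERP pass over its output. The model names, layer ranges, and parameter values are placeholders rather than the actual Vimarckoso recipe, so check the field names against the mergekit docs before reusing them.)

```python
# Sketch only: placeholder models/params, not the actual Vimarckoso recipe.
import yaml

# Round N, step 1: a dare_ties merge of two fine-tunes onto a shared base.
dare_ties_cfg = {
    "merge_method": "dare_ties",
    "base_model": "org/base-model",            # placeholder
    "models": [
        {"model": "org/finetune-a", "parameters": {"density": 0.6, "weight": 0.5}},
        {"model": "org/finetune-b", "parameters": {"density": 0.6, "weight": 0.5}},
    ],
    "parameters": {"int8_mask": True},
    "dtype": "bfloat16",
}

# Round N, step 2: SLERP the dare_ties result against the base.
slerp_cfg = {
    "merge_method": "slerp",
    "base_model": "org/base-model",
    "slices": [{
        "sources": [
            {"model": "./merged-dare-ties", "layer_range": [0, 32]},
            {"model": "org/base-model", "layer_range": [0, 32]},
        ],
    }],
    "parameters": {"t": 0.5},   # interpolation factor between the two sources
    "dtype": "bfloat16",
}

for name, cfg in [("round_n_dare_ties.yml", dare_ties_cfg),
                  ("round_n_slerp.yml", slerp_cfg)]:
    with open(name, "w") as f:
        yaml.safe_dump(cfg, f, sort_keys=False)
# Then run mergekit-yaml on each config, evaluate, and feed the winner
# into the next round.
```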
Yours have been among the mergekit YAMLs I've studied to get this result. Good job on Broca! Are we going to give Qwen 32B a run for its money or what?
I made a Python script and am working on a small leaderboard comparison tool (see my space: tiny leaderboard). I think the heatmap especially is really nice, as is the option to scrape all the mergekit configurations :) It's still a work in progress, though!
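To give a feel for the heatmap side of it, here's a toy Python sketch with made-up scores - not the actual Space code, just the general idea:

```python
# Toy sketch with made-up numbers - not the actual Space code.
import pandas as pd
import matplotlib.pyplot as plt

# Rows = models, columns = benchmarks; all scores are placeholders.
scores = pd.DataFrame(
    {
        "IFEval":   [78.2, 77.9, 76.5],
        "BBH":      [44.1, 45.0, 43.2],
        "MATH":     [36.7, 35.9, 37.4],
        "GPQA":     [12.3, 13.1, 11.8],
        "MuSR":     [14.9, 15.2, 14.1],
        "MMLU-PRO": [38.8, 39.5, 38.1],
    },
    index=["merge-a", "merge-b", "merge-c"],
)

fig, ax = plt.subplots(figsize=(8, 3))
im = ax.imshow(scores.values, aspect="auto", cmap="viridis")
ax.set_xticks(range(len(scores.columns)))
ax.set_xticklabels(scores.columns, rotation=45, ha="right")
ax.set_yticks(range(len(scores.index)))
ax.set_yticklabels(scores.index)
fig.colorbar(im, ax=ax, label="score")
fig.tight_layout()
plt.show()
```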
At some point I want to work on della_linear as a basis for evolutionary merging. You've got great data from your dare_ties merges, and I think this will carry over!
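Roughly what I have in mind, as a very loose Python sketch: mutate the della_linear densities/weights, re-merge, and keep the best candidate. `evaluate_merge` below is just a placeholder for a real mergekit run plus a benchmark harness, and the parameter names are from memory of the mergekit docs, so double-check them:

```python
# Very rough sketch of evolutionary merging over della_linear parameters.
# evaluate_merge() is a stand-in for a real mergekit run + benchmark;
# parameter names (density / weight / epsilon) should be verified against
# the mergekit docs.
import random
import yaml

DONORS = ["org/finetune-a", "org/finetune-b"]   # placeholder model names
BASE = "org/base-model"

def make_config(genome):
    """Turn a flat parameter 'genome' into a della_linear mergekit config."""
    return {
        "merge_method": "della_linear",
        "base_model": BASE,
        "models": [
            {"model": m, "parameters": {"density": genome["density_%d" % i],
                                        "weight": genome["weight_%d" % i]}}
            for i, m in enumerate(DONORS)
        ],
        "parameters": {"epsilon": genome["epsilon"]},
        "dtype": "bfloat16",
    }

def random_genome():
    g = {"epsilon": random.uniform(0.05, 0.2)}
    for i in range(len(DONORS)):
        g["density_%d" % i] = random.uniform(0.3, 0.8)
        g["weight_%d" % i] = random.uniform(0.2, 0.8)
    return g

def mutate(genome, scale=0.05):
    return {k: min(max(v + random.gauss(0, scale), 0.01), 0.95)
            for k, v in genome.items()}

def evaluate_merge(config) -> float:
    # Placeholder so the sketch runs end to end: replace with an actual
    # mergekit merge of `config` followed by a benchmark score.
    return random.random()

best = random_genome()
best_score = evaluate_merge(make_config(best))
for generation in range(20):
    child = mutate(best)
    score = evaluate_merge(make_config(child))
    if score > best_score:
        best, best_score = child, score

with open("best_della_linear.yml", "w") as f:
    yaml.safe_dump(make_config(best), f, sort_keys=False)
```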
That leaderboard is brilliant. Thank you for setting it up! I'd like to confirm my hunches about what makes our most successful models tick - and I think @jeffmeloy's approaches with NER and minperplexity merging have been very successful on the 7B models and would likely offer a lot here, too.
(UPDATE) Surprise takeaway from the leaderboard - my impression of some merge results on the comparator led me to delete or take them offline, but your leaderboard is showing where they had merit. I need to tighten up my eval process. Also, you, @sthenno, and I are very close in our results, suggesting we are each using current methods well, with our own favored benchmarks and features. I still think there's optimization to do to get a model ready to perform with minimal finetuning.
Hats off to everyone whose models appear here. I'm sure another round of high scores is soon to come. It's wild how close they are now! https://shorturl.at/MhmwT