Jim Lai

grimjim

AI & ML interests

Experimenting primarily with 7B-12B parameter text completion models. Not all models are intended for direct use; some are aimed at educational and/or merge purposes.

Organizations

Social Post Explorers, Debased AI, Anthracite, Anthracite Core

Posts 17

Post
I'm (finally) releasing a Python script that trims excess weights from Gemma2 full-weight models that were bloated by ~1B parameters due to an early mergekit bug.
https://github.com/jim-plus/Gemma2-mergekit-remediation

I'd noticed something was off when merges of Gemma2 9B models ended up having ~10B parameters. The current mergekit package is fine, but there are still bloated models on HF that could stand to be fixed.

The script assumes it will be run from the same directory as the model weights; it trims the unnecessary lm_head.weight tensor and its corresponding index entry.
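The core of the fix can be sketched as follows. This is a minimal illustration, not the released script: it assumes a sharded checkpoint with a model.safetensors.index.json whose weight_map contains a redundant lm_head.weight entry (Gemma2 ties lm_head to the embedding matrix, so the tensor is unnecessary). The function name and the toy index are mine, for demonstration only.

```python
import json

def trim_lm_head_from_index(index: dict) -> dict:
    """Return a copy of a safetensors index mapping with the
    redundant lm_head.weight entry removed from weight_map.
    (The actual tensor would also need to be dropped from the
    shard file it points to; that step is omitted here.)"""
    weight_map = dict(index.get("weight_map", {}))
    weight_map.pop("lm_head.weight", None)  # no-op if already absent
    trimmed = dict(index)
    trimmed["weight_map"] = weight_map
    return trimmed

# Toy index mapping for illustration (not a real checkpoint):
index = {
    "metadata": {"total_size": 123},
    "weight_map": {
        "model.embed_tokens.weight": "model-00001-of-00002.safetensors",
        "lm_head.weight": "model-00002-of-00002.safetensors",
    },
}
trimmed = trim_lm_head_from_index(index)
print(json.dumps(trimmed["weight_map"], indent=2))
```

In practice the matching tensor must also be removed from the shard itself (e.g. by reloading and re-saving the shard without that key), or loaders will report a size mismatch against the index.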
Post
A reminder that literal base models are valid choices for the base model in task arithmetic merges. Each Instruct or fine-tuned model then becomes a vector relative to the base model. An example merge formula can be found on this model page:
grimjim/Magnolia-v3-12B
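The task arithmetic idea above can be sketched numerically. Each fine-tune contributes a task vector (fine-tune minus base), and the merge adds a weighted sum of those vectors back onto the base. This toy example uses small NumPy arrays in place of real model tensors; the function name and weights are illustrative, not taken from the linked merge config.

```python
import numpy as np

def task_arithmetic_merge(base, finetunes, weights):
    """Merge: base + sum_i w_i * (finetune_i - base)."""
    merged = base.copy()
    for ft, w in zip(finetunes, weights):
        merged += w * (ft - base)  # add the weighted task vector
    return merged

# Stand-ins for flattened model weights:
base = np.array([1.0, 2.0, 3.0])
ft_a = np.array([1.5, 2.0, 2.0])  # one fine-tune of the base
ft_b = np.array([1.0, 3.0, 3.5])  # another fine-tune of the base
merged = task_arithmetic_merge(base, [ft_a, ft_b], [0.5, 0.5])
print(merged)  # [1.25 2.5  2.75]
```

With the literal base model as the reference point, each Instruct or fine-tuned checkpoint subtracts cleanly into a task vector, which is why the base is a valid (and natural) anchor for this kind of merge.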