metadata
license: gemma
tags:
  - alignment-handbook
  - generated_from_trainer
datasets:
  - argilla/dpo-mix-7k
model-index:
  - name: DiscoPOP-zephyr-7b-gemma
    results: []
quantized_by: bartowski
pipeline_tag: text-generation
lm_studio:
  param_count: 8b
  use_case: general
  release_date: 13-06-2024
  model_creator: SakanaAI
  prompt_template: ChatML
  system_prompt: none
  base_model: gemma
  original_repo: SakanaAI/DiscoPOP-zephyr-7b-gemma
base_model: SakanaAI/DiscoPOP-zephyr-7b-gemma

💫 Community Model> DiscoPOP-zephyr-7b-gemma by Sakana AI

👾 LM Studio Community models highlights program. Highlighting new & noteworthy models by the community. Join the conversation on Discord.

Model creator: Sakana AI
Original model: DiscoPOP-zephyr-7b-gemma
GGUF quantization: provided by bartowski based on llama.cpp release b3145

Model Summary:

This model is based on the Zephyr 7b Gemma model and was trained with DiscoPOP, Sakana AI's Discovered Preference Optimization algorithm.
DiscoPOP is a new training method. It was not designed by hand: it was found by prompting an LLM to propose new preference optimization techniques and keeping the one that performed best.

Prompt template:

Choose the ChatML preset in LM Studio.

Under the hood, the model will see a prompt that's formatted like so:

<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant

This model is not tuned for a system prompt.
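When calling the model outside LM Studio, the same template can be assembled by hand. The following is a minimal sketch, assuming a single user turn and no system prompt; `format_chatml_prompt` is a hypothetical helper name and not part of LM Studio or llama.cpp.

```python
def format_chatml_prompt(prompt: str) -> str:
    """Wrap one user message in the ChatML markers shown above.

    Assumption: the string ends with a newline after the assistant marker,
    so the model's reply starts on its own line.
    """
    return (
        "<|im_start|>user\n"
        f"{prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


if __name__ == "__main__":
    print(format_chatml_prompt("Why is the sky blue?"))
```

Generation should typically be stopped at the next `<|im_end|>` token so the model does not continue into a new turn.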

Technical Details

DiscoPOP was discovered through a new method proposed by Sakana AI. In this method, an LLM is prompted to propose and implement new preference optimization loss functions based on previously evaluated performance metrics.
This process leads to the discovery of previously unknown preference optimization algorithms. DiscoPOP is the best-performing preference optimizer discovered this way.
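The propose-and-evaluate process described above can be pictured as a simple loop. This is only an illustrative sketch, not Sakana AI's implementation: `discover`, `propose`, and `evaluate` are hypothetical names, and the toy scores in the usage section are made-up placeholders rather than real results.

```python
from typing import Callable, List, Tuple

# (candidate loss function description, measured score) from earlier rounds
History = List[Tuple[str, float]]


def discover(
    propose: Callable[[History], str],
    evaluate: Callable[[str], float],
    rounds: int,
) -> Tuple[str, float]:
    """Run a propose -> evaluate loop and return the best candidate seen."""
    if rounds < 1:
        raise ValueError("rounds must be at least 1")
    history: History = []
    for _ in range(rounds):
        # Hypothetical LLM call: it sees every earlier candidate and its score.
        candidate = propose(history)
        # Hypothetical evaluation: e.g. train with the candidate, then benchmark.
        score = evaluate(candidate)
        history.append((candidate, score))
    return max(history, key=lambda item: item[1])


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs without a model or a training job.
    toy_scores = {"loss_v0": 0.41, "loss_v1": 0.57, "loss_v2": 0.49}
    best = discover(
        propose=lambda history: f"loss_v{len(history)}",
        evaluate=toy_scores.__getitem__,
        rounds=3,
    )
    print(best)
```

Feeding the full history back into `propose` is what lets each new proposal build on earlier performance measurements instead of starting from scratch.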

DiscoPOP achieves a higher score (Reward) while deviating less from the base model (KL Divergence), compared to existing state-of-the-art methods such as DPO.
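For context on the baseline in this comparison, DPO scores each (chosen, rejected) response pair with a logistic loss on how much more the trained policy prefers the chosen response than the reference model does. The sketch below shows only that standard DPO loss for a single pair; the DiscoPOP loss itself is not given in this README, so it is not reproduced here. The function name, argument names, and the default `beta` are illustrative assumptions.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # illustrative default, not taken from this model's training
) -> float:
    """Standard DPO loss for one (chosen, rejected) pair of responses."""
    # How much more the policy prefers "chosen" over "rejected"
    # than the reference model does.
    margin = (policy_chosen_logp - policy_rejected_logp) - (
        ref_chosen_logp - ref_rejected_logp
    )
    x = beta * margin
    # -log(sigmoid(x)), written in a form that does not overflow for large |x|.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


if __name__ == "__main__":
    # Policy agrees with the preference label more than the reference does.
    print(dpo_loss(-1.0, -3.0, -2.0, -2.0))
    # Policy disagrees with the preference label: the loss is larger.
    print(dpo_loss(-3.0, -1.0, -2.0, -2.0))
```

The loss falls as the policy's margin over the reference grows, which is why methods of this kind are compared on reward and on KL divergence from the base model together.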

For a deeper analysis and additional details, you can read their blog post here: https://sakana.ai/llm-squared/

Special thanks

πŸ™ Special thanks to Georgi Gerganov and the whole team working on llama.cpp

πŸ™ Special thanks to Kalomaze and Dampf for their work on the dataset (linked here) that was used for calculating the imatrix for all sizes.

Disclaimers

LM Studio is not the creator, originator, or owner of any Model featured in the Community Model Program. Each Community Model is created and provided by third parties. LM Studio does not endorse, support, represent or guarantee the completeness, truthfulness, accuracy, or reliability of any Community Model. You understand that Community Models can produce content that might be offensive, harmful, inaccurate or otherwise inappropriate, or deceptive. Each Community Model is the sole responsibility of the person or entity who originated such Model. LM Studio may not monitor or control the Community Models and cannot, and does not, take responsibility for any such Model. LM Studio disclaims all warranties or guarantees about the accuracy, reliability or benefits of the Community Models. LM Studio further disclaims any warranty that the Community Model will meet your requirements, be secure, uninterrupted or available at any time or location, or error-free, viruses-free, or that any errors will be corrected, or otherwise. You will be solely responsible for any damage resulting from your use of or access to the Community Models, your downloading of any Community Model, or use of any other Community Model provided by or through LM Studio.