DavidAU/Gemma-The-Writer-9B-GGUF

DavidAU committed on Oct 21, 2024

Commit d8ec365 (verified) · 1 Parent(s): 404f2cb

Update README.md

Files changed (1)

README.md +44 -0

@@ -45,6 +45,50 @@ Recommended Rep Pen of 1.05 or higher, temp range 0-5.
 Example outputs below.
+<B>Settings, Quants and Critical Operations Notes:</B>
+This model has been modified ("Brainstorm") to alter prose output, and generally outputs longer text than average.
+Changes in temp (e.g. .4, .8, 1.5, 2, 3) will drastically alter output.
+Rep pen settings will also alter output.
+This model needs "rep pen" of 1.02 or higher.
+For role play: Rep pen of 1.05 to 1.08 is suggested.
+Raise/lower rep pen SLOWLY, e.g. 1.011, 1.012 ...
+Rep pen will alter prose, word choice (a lower rep pen sometimes means smaller words or more small words) and creativity.
+To really push the model:
+Rep pen 1.05 or lower / Temp 3+ ... be ready to stop the output, because it may go on and on at these strong settings.
+You can also set a "hard stop" (a maximum number of generated tokens) to rein in output at lower rep pen / high creativity settings.
+Longer prompts vastly increase the quality of the model's output.
+QUANT CHOICE(S):
+Higher quants will have more detail, nuance and in some cases stronger "emotional" levels. Characters will also be
+more "fleshed out", and the sense of "there" will increase.
+Q4KM/Q4KS are good, strong quants; however, if you can run Q5, Q6 or Q8, go for the highest quant you can.
+This repo also has 3 "ARM" quants for computers that support this quant type. If you use these on a "non-ARM" machine, tokens per second will be very low.
+IQ4XS: Due to the unusual nature of this quant (mixture/processing), generations from it will differ from those of other quants.
+You may want to try it and compare its output to that of other quants.
+Special note on Q2k/Q3 quants:
+You may need to use temp 2 or lower with these quants (1 or lower for Q2K). There is simply too much compression at this level, which damages the model. I will see if Imatrix versions
+of these quants will function better.
+Rep pen adjustments may also be required to get the most out of this model at this/these quant level(s).
 <B>Other Versions of "Gemma The Writer": </B>
 The second version of this model is "Deadline" at 10B parameters. It is a specially modified version that changes
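
Reader's note on the added settings section: the numeric guidance in the hunk above (rep pen of at least 1.02, a 1.05 to 1.08 range for role play, temp caps for Q2K/Q3 quants, and a "hard stop" on generated tokens) can be sketched as a small helper. This is only an illustration under assumptions: the function name `suggest_settings`, the dictionary keys, and the default `max_tokens` of 2048 are hypothetical and not part of the README; map them to whatever your inference runtime actually calls these parameters.

```python
# Minimal sketch of the README's guidance; names and the default
# max_tokens value are hypothetical, chosen only for illustration.

MIN_REP_PEN = 1.02                  # "This model needs rep pen of 1.02 or higher"
ROLE_PLAY_REP_PEN = (1.05, 1.08)    # suggested range for role play


def temp_cap_for_quant(quant: str):
    """Return the README's suggested max temp for a quant, or None if no cap is given."""
    q = quant.upper().replace("_", "")  # "Q2_K" and "q2k" both become "Q2K"
    if q.startswith("Q2K"):
        return 1.0                      # "1 or lower for q2k"
    if q.startswith("Q3"):
        return 2.0                      # "temp 2 or lower with these quants"
    return None


def suggest_settings(quant, temp, rep_pen, role_play=False, max_tokens=2048):
    """Clamp requested sampler settings to the ranges the README suggests."""
    rep_pen = max(rep_pen, MIN_REP_PEN)
    if role_play:
        low, high = ROLE_PLAY_REP_PEN
        rep_pen = min(max(rep_pen, low), high)
    cap = temp_cap_for_quant(quant)
    if cap is not None:
        temp = min(temp, cap)
    # max_tokens acts as the "hard stop" for runaway generations.
    return {"temperature": temp, "repeat_penalty": rep_pen, "max_tokens": max_tokens}


if __name__ == "__main__":
    print(suggest_settings("Q2_K", temp=3.0, rep_pen=1.0))
    print(suggest_settings("Q4_K_M", temp=1.5, rep_pen=1.2, role_play=True))
```

The helper only clamps values; it does not pick a temp for you, since the README treats temp as a creative dial whose best value depends on the prompt.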