YaRN block required?
#5
by robbiemu
I noticed that the config.json here has a 128k context size (like you might have with the YaRN settings enabled for Qwen 2.5 models) but no YaRN-specific config like:
"rope_scaling": {
"factor": 4.0,
"original_max_position_embeddings": 32768,
"type": "yarn"
}
I imagine we should add these, because you did not in fact change the original max_position_embeddings, right?
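For reference, here is a quick sanity check of what a repo currently ships (the repo id below is just an example, not necessarily this one):

```python
from transformers import AutoConfig

# Example repo id; substitute the repo under discussion.
cfg = AutoConfig.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

print(cfg.max_position_embeddings)         # e.g. 32768 for the base Qwen 2.5 configs
print(getattr(cfg, "rope_scaling", None))  # None if no YaRN block is present
```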
Good question!
Also, please tell me: after quantization to GGUF, will the maximum context size be 32k, or the one specified in the max_position_embeddings parameter?
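For context, my current understanding is that the GGUF converter copies max_position_embeddings into the model's context_length metadata, and that YaRN can also be enabled at load time rather than baked into the file. A rough sketch with llama-cpp-python (the model path is a placeholder, and the parameter names should be double-checked against your installed version):

```python
from llama_cpp import Llama

llm = Llama(
    model_path="qwen2.5-7b-instruct-q4_k_m.gguf",  # placeholder path
    n_ctx=131072,          # context window requested at load time
    rope_scaling_type=2,   # 2 == YaRN in llama.cpp's rope-scaling enum
    rope_freq_scale=0.25,  # 1 / factor, matching "factor": 4.0 above
    yarn_orig_ctx=32768,   # pre-scaling training context
)
```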