Was this model trained on code at all, like DeepSeek Coder?
#3, opened by rombodawg
I'm wondering what the HumanEval score of this version is. Is it any good?
One more question: does this model only support a 4K context window, or does it go higher, like 8K, 16K, or 32K?
- This model uses a mixed dataset, including code data, but at a lower ratio compared to DeepSeek Coder.
- DeepSeek LLM 67B Chat achieves a HumanEval score of 73.8. More results are available here: https://github.com/deepseek-ai/deepseek-LLM#chat-model
- Currently, the context window is 4K, but we're actively working to extend it. Stay tuned for further updates!
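For anyone working within the 4K limit in the meantime, here is a minimal sketch of budgeting a prompt against the window. It assumes "4K" means 4096 tokens and uses whitespace splitting as a stand-in tokenizer. The helper names `fits_in_context` and `truncate_left` are hypothetical and not part of any DeepSeek tooling. For real counts you would pass in the model's own tokenizer instead.

```python
from typing import Callable, List

# Assumption: "4K" is read as 4096 tokens; adjust if the model card says otherwise.
CONTEXT_WINDOW = 4096


def fits_in_context(
    prompt: str,
    max_new_tokens: int,
    tokenize: Callable[[str], List[str]] = str.split,  # stand-in tokenizer (assumption)
    context_window: int = CONTEXT_WINDOW,
) -> bool:
    """Return True if the prompt plus the generation budget fits in the window."""
    return len(tokenize(prompt)) + max_new_tokens <= context_window


def truncate_left(
    prompt: str,
    max_new_tokens: int,
    tokenize: Callable[[str], List[str]] = str.split,
    detokenize: Callable[[List[str]], str] = " ".join,
    context_window: int = CONTEXT_WINDOW,
) -> str:
    """Drop the oldest tokens so that the prompt leaves room for generation."""
    budget = context_window - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens leaves no room for the prompt")
    tokens = tokenize(prompt)
    # Keep only the most recent `budget` tokens; shorter prompts pass through whole.
    return detokenize(tokens[-budget:])
```

Truncating from the left keeps the most recent text, which usually matters most in chat-style prompts. Whether that is the right choice depends on the use case.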
zdaxie changed discussion status to closed