huskylm-2.5-8b / README.md
AMDBartek's picture
Typo fix
f8ae3b1
|
raw
history blame
6.82 kB
metadata
language:
  - en
pipeline_tag: text-generation
license: llama3
base_model: meta-llama/Meta-Llama-3-8B-Instruct
tags:
  - llama-3
  - huskylm
  - darkcloudai
datasets:
  - darkcloudai-smallmodel-frontieredition
  - darkcloudai-webdriver-redditcrawl-2023
  - darkcloudai-unalignment-truthfulness
  - darkcloudai-generaldpo
  - ai2_arc
  - allenai/ultrafeedback_binarized_cleaned
  - argilla/distilabel-intel-orca-dpo-pairs
  - jondurbin/airoboros-3.2
  - codeparrot/apps
  - facebook/belebele
  - bluemoon-fandom-1-1-rp-cleaned
  - boolq
  - camel-ai/biology
  - camel-ai/chemistry
  - camel-ai/math
  - camel-ai/physics
  - jondurbin/contextual-dpo-v0.1
  - jondurbin/gutenberg-dpo-v0.1
  - jondurbin/py-dpo-v0.1
  - jondurbin/truthy-dpo-v0.1
  - LDJnr/Capybara
  - jondurbin/cinematika-v0.1
  - WizardLM/WizardLM_evol_instruct_70k
  - glaiveai/glaive-function-calling-v2
  - jondurbin/gutenberg-dpo-v0.1
  - grimulkan/LimaRP-augmented
  - lmsys/lmsys-chat-1m
  - ParisNeo/lollms_aware_dataset
  - TIGER-Lab/MathInstruct
  - Muennighoff/natural-instructions
  - openbookqa
  - kingbri/PIPPA-shareGPT
  - piqa
  - Vezora/Tested-22k-Python-Alpaca
  - ropes
  - cakiki/rosetta-code
  - Open-Orca/SlimOrca
  - b-mc2/sql-create-context
  - squad_v2
  - mattpscott/airoboros-summarization
  - migtissera/Synthia-v1.3
  - unalignment/toxic-dpo-v0.2
  - WhiteRabbitNeo/WRN-Chapter-1
  - WhiteRabbitNeo/WRN-Chapter-2
  - winogrande

huskylm-2.5-8b

HuskyLM 2 Logo

Darkcloud AI: Making AI for the people, not money for investors.

Built with Meta Llama 3 - full model name: llama-3-huskylm-2.5-8b - model slug: huskylm-2.5-8b

Description

HuskyLM 2.5 represents the third iteration in the HuskyLM-n series, developed through a unique combination of Direct Preference Optimization (DPO) and proprietary Reinforcement Learning from Artificial Intelligence Feedback (RLAIF) methodology made by Darkcloud AI. Of course, the model also underwent traditional SFT (Supervised Fine-Tuning) before any of the fancy stuff.

Like its predecessors in the HuskyLM-n series, HuskyLM 2.5 is a high-performance, general-purpose chat model designed to provide robust adherence to system prompts. This versatility enables its utilization across a broad range of applications, including assistance with tasks such as writing and coding, live chat agent functions, in-game non-player character (NPC) interactions, and more. In addition to this, HuskyLM 2.5 has been designed to have a friendly and trustworthy tone by default (which comes from the private darkcloudai-smallmodel-frontieredition dataset) meaning that users will always have a pleasant experience when engaging with the model.

At Darkcloud AI, we prioritized meticulous attention to detail when developing our model. We've taken great strides to ensure its impartiality, guaranteeing that it provides truthful information in everyday interactions as best as we can. As AI systems become increasingly integral to daily life, it's crucial that users can rely on them as trusted sources of information.

A quick overview of the model's strengths include:

  • State-of-the-art coding abilities in many programming/markup languages such as Python, JavaScript, Markdown, HTML, C#, Rust, Java, and more
  • Exceptional conversational abilities
  • Strong system prompt adherence
  • Best-in-class multilingual capabilities compared to competing models of its size (English, Chinese, Spanish, French, Polish, and more!)
  • Being unbiased and truthful (although, you should note that all forms of intelligence can and WILL make mistakes, whether organic or artificial)
  • Having no unnecessary censorship (some unfortunately bleeds through since Meta-Llama-3-8B-Instruct was used as a base and HuskyLM 3 should fix that - we're training from the ground up from the base Llama 3 next time)
  • Simply being fun to talk to

Model Specs

  • Size: 8 billion parameters
  • Knowledge cutoff: December 2023
  • Context window: 8192 tokens (we're working on expanding that massively to 128k! wish us luck?)
  • Base model: meta-llama/Meta-Llama-3-8B-Instruct
  • Model license: Llama 3 License
  • Architecture: Llama 3

Trained on Benchmarks?

Well, yes, but actually no. You may see the names of benchmarks in the datasets used, however only train splits were used. If you don't know the difference, please learn.

Quants and Other Formats

Huge Thank You to the Following People/Companies

  • Meta AI: This model would never have been possible if Meta AI did not release Llama 3 with an open license. We thank them deeply for making frontier LLMs available for all.
  • Jon Durbin: We've used many of his datasets to train this model, specifically airoboros-3.2, contextual-dpo-v0.1, gutenberg-dpo-v0.1, py-dpo-v0.1, truthy-dpo-v0.1, cinematika-v0.1, gutenberg-dpo-v0.1. His work is amazing and we thank him a lot. We've used a lot of datasets for our model that he used for his bagel series of models too. If you couldn't already guess, this model is essentially a bagel-type model but with our custom datasets and RLAIF methodology added in.
  • Hugging Face: Throughout Darkcloud AI's life, we've extensively used and relied on libraries made by HuggingFace and we thank them and everyone who has contributed.
  • Axolotl: We've used Axolotl to streamline the (SFT) fine-tuning of our LLMs. Huge thank you to them and every contributor.
  • You: That's right! You, the user. We value every single bit of feedback we receive from our users as it helps us to make our models better for everyone. If you have any issues, please give feedback. Every little bit of information helps, no matter how minor the issue or question you have is!
  • All other Dataset Creators/Contributors: If you have created or contributed to any dataset shown in the model card, we thank you deeply for providing high-quality data for all.