---
library_name: transformers
license: mit
datasets:
  - MultivexAI/Everyday-Language-Corpus
language:
  - en
base_model: MultivexAI/Everyday-Language-3B
tags:
  - llama-cpp
  - gguf-my-repo
---

# Triangle104/Everyday-Language-3B-Q4_K_S-GGUF

This model was converted to GGUF format from [MultivexAI/Everyday-Language-3B](https://huggingface.co/MultivexAI/Everyday-Language-3B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space. Refer to the [original model card](https://huggingface.co/MultivexAI/Everyday-Language-3B) for more details on the model.
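If you'd rather fetch the quantized file manually (for example, to pass llama.cpp a local path later), the Hugging Face CLI can download it directly. This is a minimal sketch assuming `huggingface_hub` is installed; the file name matches the one used in the commands below:

```bash
# Download the quantized GGUF into the current directory
# (requires: pip install -U "huggingface_hub[cli]")
huggingface-cli download Triangle104/Everyday-Language-3B-Q4_K_S-GGUF \
  everyday-language-3b-q4_k_s.gguf --local-dir .
```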


## Model details

Everyday-Language-3B is a language model fine-tuned for generating natural, everyday English text. It builds upon a pre-trained 3-billion-parameter base model (Llama-3.2-3B) and has been further trained on the Everyday-Language-Corpus dataset, a collection of over 8,700 examples of common phrases, questions, and statements encountered in daily interactions.

This fine-tuning process significantly improves the model's ability to produce coherent, contextually appropriate, and less repetitive text compared to its base version. It aims to better capture the nuances and patterns of typical conversational language.

## Intended Uses & Limitations

### Intended Uses

- Generating natural language responses in conversational AI applications.
- Creating more human-like text for creative writing or content generation.
- Exploring the capabilities of language models in understanding and producing everyday language.
- Serving as a foundation for further fine-tuning on specific downstream tasks.

### Limitations

- **Contextual Understanding:** While improved, the model's contextual understanding is still limited by the size of its context window and the inherent complexities of language.
- **Potential Biases:** Like all language models, Everyday-Language-3B may inherit biases from its pre-training data and the fine-tuning dataset. These biases can manifest in the generated text, potentially reflecting societal stereotypes or unfair assumptions.
- **Factuality:** The model may generate text that is not factually accurate, especially on complex or nuanced topics. Verify any information the model generates before relying on it.
- **Repetition:** Although significantly reduced by fine-tuning, some repetition may still appear in longer generations.

## Training Data

Everyday-Language-3B was fine-tuned on the Everyday-Language-Corpus dataset, which is publicly available on Hugging Face:

- **Dataset:** [MultivexAI/Everyday-Language-Corpus](https://huggingface.co/datasets/MultivexAI/Everyday-Language-Corpus)
- **Dataset Description:** A collection of 8,787 synthetically generated examples of everyday English, each structured as `[S] {Sentence or Sentences} [E]` (see the illustrative example below).
- **Dataset Focus:** Common phrases, questions, and statements used in typical daily interactions.
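For illustration only, a record in that format would look like the following; this is a made-up example, not an actual row from the dataset:

```
[S] Hey, are you free this weekend? I was thinking we could grab lunch. [E]
```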

Final loss: 1.143400


## Use with llama.cpp

Install llama.cpp through brew (works on Mac and Linux):

```bash
brew install llama.cpp
```
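A quick way to confirm the install succeeded is to print the version (the exact output format depends on the build):

```bash
# Should print llama.cpp build/version information
llama-cli --version
```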

Invoke the llama.cpp server or the CLI.

### CLI:

```bash
llama-cli --hf-repo Triangle104/Everyday-Language-3B-Q4_K_S-GGUF --hf-file everyday-language-3b-q4_k_s.gguf -p "The meaning to life and the universe is"
```
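A commonly useful addition, shown here as a sketch: `-n` caps the number of generated tokens, and `-cnv` would start an interactive chat session instead of a one-shot completion.

```bash
# Cap the response at 128 tokens
llama-cli --hf-repo Triangle104/Everyday-Language-3B-Q4_K_S-GGUF \
  --hf-file everyday-language-3b-q4_k_s.gguf \
  -p "The meaning to life and the universe is" -n 128
```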

### Server:

```bash
llama-server --hf-repo Triangle104/Everyday-Language-3B-Q4_K_S-GGUF --hf-file everyday-language-3b-q4_k_s.gguf -c 2048
```
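Once the server is running (it listens on localhost:8080 by default), you can query its OpenAI-compatible chat endpoint; the example below assumes the default host and port:

```bash
# Send a chat request to the local llama-server instance
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [
          {"role": "user", "content": "How do I politely decline an invitation?"}
        ]
      }'
```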

Note: You can also use this checkpoint directly through the usage steps listed in the llama.cpp repo.

Step 1: Clone llama.cpp from GitHub.

```bash
git clone https://github.com/ggerganov/llama.cpp
```

Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag along with any other hardware-specific flags (e.g. `LLAMA_CUDA=1` for Nvidia GPUs on Linux).

```bash
cd llama.cpp && LLAMA_CURL=1 make
```
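Recent llama.cpp versions have deprecated the Makefile in favor of CMake, so if `make` fails on a current checkout, a CMake build along these lines should work (treat it as a sketch; option names can vary between releases):

```bash
# CMake-based build; the resulting binaries land in build/bin/
cmake -B build -DLLAMA_CURL=ON
cmake --build build --config Release
```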

Step 3: Run inference through the main binary.

```bash
./llama-cli --hf-repo Triangle104/Everyday-Language-3B-Q4_K_S-GGUF --hf-file everyday-language-3b-q4_k_s.gguf -p "The meaning to life and the universe is"
```

or

```bash
./llama-server --hf-repo Triangle104/Everyday-Language-3B-Q4_K_S-GGUF --hf-file everyday-language-3b-q4_k_s.gguf -c 2048
```
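If you downloaded the GGUF file earlier (see the `huggingface-cli` sketch above), either binary can also load it from a local path via `-m` instead of the `--hf-repo`/`--hf-file` pair:

```bash
# Run inference against a locally stored GGUF file
./llama-cli -m ./everyday-language-3b-q4_k_s.gguf -p "The meaning to life and the universe is"
```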