Triangle104/Llama3.1-8B-SEP-Chat-Q4_K_S-GGUF

@@ -15,6 +15,105 @@ tags:
 This model was converted to GGUF format from [`ruggsea/Llama3.1-8B-SEP-Chat`](https://huggingface.co/ruggsea/Llama3.1-8B-SEP-Chat) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
 Refer to the [original model card](https://huggingface.co/ruggsea/Llama3.1-8B-SEP-Chat) for more details on the model.
+---
+Model details:
+-
+This model is a LoRA fine-tune of meta-llama/Meta-Llama-3.1-8B trained on multi-turn philosophical conversations. It is designed to discuss philosophy in a conversational yet rigorous manner, maintaining academic standards while remaining accessible.
+Model description
+The model was trained with the chat template from the TRL (Transformer Reinforcement Learning) library, which lets it handle multi-turn conversations naturally. It builds on its predecessor, Llama3-stanford-encyclopedia-philosophy-QA, and extends it to more interactive, back-and-forth philosophical discussions.
+Chat Format
+The model uses the following chat format, with one tag per role:
+<|system|>
+{{system_prompt}}
+<|user|>
+{{user_message}}
+<|assistant|>
+{{assistant_response}}
+Training Details
+The model was trained with the following system prompt:
+You are an expert and informative yet accessible Philosophy university professor. Students will engage with you in philosophical discussions. Respond to their questions and comments in a correct and rigorous but accessible way, maintaining academic standards while fostering understanding.
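
As a rough illustration of how the chat format and the training system prompt fit together, the sketch below renders a list of role-tagged messages into a single prompt string. The helper name `render_chat` and the exact newline layout between tags are assumptions made for illustration; the tokenizer's own chat template remains the authoritative source for the real formatting.

```python
# Hypothetical helper: renders messages in the <|role|> layout shown in the card.
# The exact whitespace used during training is an assumption, not a confirmed detail.
ROLE_TAGS = {
    "system": "<|system|>",
    "user": "<|user|>",
    "assistant": "<|assistant|>",
}

# System prompt quoted from the "Training Details" section of the card.
SYSTEM_PROMPT = (
    "You are an expert and informative yet accessible Philosophy university "
    "professor. Students will engage with you in philosophical discussions. "
    "Respond to their questions and comments in a correct and rigorous but "
    "accessible way, maintaining academic standards while fostering understanding."
)


def render_chat(messages, add_generation_prompt=True):
    """Join messages as '<|role|>' followed by the content on the next line."""
    parts = []
    for message in messages:
        tag = ROLE_TAGS[message["role"]]  # raises KeyError for an unknown role
        parts.append(f"{tag}\n{message['content']}")
    if add_generation_prompt:
        # A trailing assistant tag asks the model to produce the next reply.
        parts.append(ROLE_TAGS["assistant"])
    return "\n".join(parts)


if __name__ == "__main__":
    conversation = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "What is the difference between ethics and morality?"},
    ]
    print(render_chat(conversation))
```

Including the same system prompt at inference time that was used in training is a reasonable default, though the card does not state that it is required.
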
+Training hyperparameters
+The following hyperparameters were used during training:
+    Learning rate: 2e-5
+    Train batch size: 1
+    Gradient accumulation steps: 4
+    Effective batch size: 4
+    Optimizer: paged_adamw_8bit
+    LR scheduler: cosine with warmup
+    Warmup ratio: 0.03
+    Training epochs: 5
+    LoRA config:
+        r: 256
+        alpha: 128
+        Target modules: all-linear
+        Dropout: 0.05
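
To make the arithmetic behind these hyperparameters concrete, here is a small sketch in plain Python. It assumes a single device, the standard LoRA scaling factor of alpha / r (not a rank-stabilized variant), and a linear warmup followed by cosine decay to zero. The function names and the `total_steps` value are hypothetical, and the scheduler actually used in training may differ in its details.

```python
import math

# Values taken from the hyperparameter list in the card.
LEARNING_RATE = 2e-5
TRAIN_BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 4
WARMUP_RATIO = 0.03
LORA_R = 256
LORA_ALPHA = 128


def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Samples contributing to one optimizer step (num_devices=1 is an assumption)."""
    return per_device_batch * grad_accum_steps * num_devices


def lora_scaling(alpha, r):
    """Standard LoRA multiplies the low-rank update by alpha / r."""
    return alpha / r


def lr_at_step(step, total_steps, peak_lr=LEARNING_RATE, warmup_ratio=WARMUP_RATIO):
    """Linear warmup to peak_lr, then cosine decay to zero (assumed schedule shape)."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total_steps = 1000  # hypothetical; the card does not report the step count
    print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    print("LoRA scaling:", lora_scaling(LORA_ALPHA, LORA_R))
    for step in (0, 15, 30, 500, 1000):
        print(step, lr_at_step(step, total_steps))
```

With r = 256 and alpha = 128, the scaling factor under this assumption is 0.5, so the adapter update is applied at half strength relative to a configuration where alpha equals r.
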
+Framework versions
+    PEFT 0.10.0
+    Transformers 4.40.1
+    PyTorch 2.2.2+cu121
+    TRL latest
+    Datasets 2.19.0
+    Tokenizers 0.19.1
+Intended Use
+This model is designed for:
+    Multi-turn philosophical discussions
+    Academic philosophical inquiry
+    Teaching and learning philosophy
+    Exploring philosophical concepts through dialogue
+Limitations
+    The model should not be used as a substitute for professional philosophical advice or formal philosophical education
+    While the model aims to be accurate, its responses should be verified against authoritative sources
+    The model may occasionally generate plausible-sounding but incorrect philosophical arguments
+    As with all language models, it may exhibit biases present in its training data
+License
+This model is derived from meta-llama/Meta-Llama-3.1-8B and is therefore subject to Meta's Llama 3.1 Community License Agreement. Please refer to Meta's licensing terms for usage requirements and restrictions.
+How to use
+Here's an example of how to use the original (unquantized) model with the transformers library:
+from transformers import AutoModelForCausalLM, AutoTokenizer
+# Load model and tokenizer
+model = AutoModelForCausalLM.from_pretrained("ruggsea/Llama3.1-8B-SEP-Chat")
+tokenizer = AutoTokenizer.from_pretrained("ruggsea/Llama3.1-8B-SEP-Chat")
+# Example conversation
+messages = [
+    {"role": "user", "content": "What is the difference between ethics and morality?"}
+]
+# Format prompt using chat template
+prompt = tokenizer.apply_chat_template(
+    messages,
+    add_generation_prompt=True,
+    tokenize=False
+)
+# Generate response
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+outputs = model.generate(**inputs, max_new_tokens=512)
+# Decode the full sequence (prompt plus generated reply) and print it
+response = tokenizer.decode(outputs[0], skip_special_tokens=True)
+print(response)
+---
 ## Use with llama.cpp
 Install llama.cpp through brew (works on Mac and Linux)