---
base_model:
- meta-llama/Llama-3.1-8B
datasets:
- ruggsea/stanford-encyclopedia-of-philosophy_chat_multi_turn
language:
- en
- it
license: other
---
# Llama3.1-SEP-Chat
This model is a LoRA fine-tune of meta-llama/Llama-3.1-8B trained on multi-turn philosophical conversations. It is designed to discuss philosophy in a conversational yet rigorous manner, maintaining academic standards while remaining accessible.
## Model description
The model was trained using the TRL (Transformer Reinforcement Learning) library's chat template, enabling it to handle multi-turn conversations in a natural way. It builds upon the capabilities of its predecessor [Llama3-stanford-encyclopedia-philosophy-QA](https://huggingface.co/ruggsea/Llama3-stanford-encyclopedia-philosophy-QA) but extends it to handle more interactive, back-and-forth philosophical discussions.
### Chat Format
The model uses the standard chat format with roles:
```
<|system|>
{{system_prompt}}
<|user|>
{{user_message}}
<|assistant|>
{{assistant_response}}
```
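As a rough illustration of the layout above, the sketch below renders a message list into that role-tagged text. The helper name `render_chat` and the shortened system prompt are hypothetical and used only for illustration; the template shipped with the tokenizer is authoritative and may differ in whitespace or special tokens, so real code should rely on `tokenizer.apply_chat_template`.

```python
def render_chat(messages, add_generation_prompt=False):
    """Render messages as role-tagged text (illustrative only, not the real template)."""
    parts = []
    for message in messages:
        # Each turn is a role tag on its own line, followed by the content
        parts.append(f"<|{message['role']}|>\n{message['content']}")
    if add_generation_prompt:
        # A trailing assistant tag cues the model to write the next reply
        parts.append("<|assistant|>")
    return "\n".join(parts)


messages = [
    {"role": "system", "content": "You are a philosophy professor."},
    {"role": "user", "content": "What is virtue ethics?"},
]
rendered = render_chat(messages, add_generation_prompt=True)
print(rendered)
```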
### Training Details
The model was trained with the following system prompt:
```
You are an expert and informative yet accessible Philosophy university professor. Students will engage with you in philosophical discussions. Respond to their questions and comments in a correct and rigorous but accessible way, maintaining academic standards while fostering understanding.
```
### Training hyperparameters
The following hyperparameters were used during training:
- Learning rate: 2e-5
- Train batch size: 1
- Gradient accumulation steps: 4
- Effective batch size: 4
- Optimizer: paged_adamw_8bit
- LR scheduler: cosine with warmup
- Warmup ratio: 0.03
- Training epochs: 5
- LoRA config:
  - r: 256
  - alpha: 128
  - Target modules: all-linear
  - Dropout: 0.05
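Two of the values above are derived rather than set independently. The following sketch works through the arithmetic, assuming a single training device (the card does not state the device count) and the conventional LoRA scaling factor of alpha / r (rank-stabilized scaling is not assumed).

```python
# Values taken from the hyperparameter list
train_batch_size = 1
gradient_accumulation_steps = 4
num_devices = 1  # assumption: not stated in the card

# Samples contributing to each optimizer step
effective_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

lora_r = 256
lora_alpha = 128

# Conventional LoRA scaling applied to the low-rank update
lora_scaling = lora_alpha / lora_r

print(effective_batch_size)
print(lora_scaling)
```

With alpha at half the rank, the low-rank update is scaled by 0.5 before being added to the frozen weights.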
### Framework versions
- PEFT 0.10.0
- Transformers 4.40.1
- PyTorch 2.2.2+cu121
- TRL latest
- Datasets 2.19.0
- Tokenizers 0.19.1
## Intended Use
This model is designed for:
- Multi-turn philosophical discussions
- Academic philosophical inquiry
- Teaching and learning philosophy
- Exploring philosophical concepts through dialogue
## Limitations
- The model should not be used as a substitute for professional philosophical advice or formal philosophical education
- While the model aims to be accurate, its responses should be verified against authoritative sources
- The model may occasionally generate plausible-sounding but incorrect philosophical arguments
- As with all language models, it may exhibit biases present in its training data
## License
This model is subject to the Llama 3.1 Community License Agreement, inherited from its base model. Please refer to Meta's licensing terms for usage requirements and restrictions.
## How to use
Here's an example of how to use the model:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("ruggsea/Llama3.1-SEP-Chat")
tokenizer = AutoTokenizer.from_pretrained("ruggsea/Llama3.1-SEP-Chat")

# Example conversation
messages = [
    {"role": "user", "content": "What is the difference between ethics and morality?"}
]

# Format prompt using chat template
prompt = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=False,
)

# Generate response
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, skipping the echoed prompt
prompt_length = inputs["input_ids"].shape[-1]
response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(response)
```
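Since the model targets multi-turn discussion, a conversation can be carried forward by appending each reply to the message list before the next turn. The sketch below is one possible way to do that, not the author's method: the helper names `start_conversation` and `chat_turn` are hypothetical, and it assumes that prepending the training system prompt at inference time is desirable and that `model` and `tokenizer` were loaded as in the example above.

```python
# System prompt quoted from the Training Details section
SYSTEM_PROMPT = (
    "You are an expert and informative yet accessible Philosophy university professor. "
    "Students will engage with you in philosophical discussions. "
    "Respond to their questions and comments in a correct and rigorous but accessible way, "
    "maintaining academic standards while fostering understanding."
)


def start_conversation(system_prompt=SYSTEM_PROMPT):
    """Return a fresh message list seeded with the system prompt."""
    return [{"role": "system", "content": system_prompt}]


def chat_turn(model, tokenizer, messages, user_message, max_new_tokens=512):
    """Append a user message, generate a reply, and record it in the history."""
    messages.append({"role": "user", "content": user_message})
    prompt = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=False,
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep only the tokens generated after the prompt
    prompt_length = inputs["input_ids"].shape[-1]
    reply = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
    messages.append({"role": "assistant", "content": reply})
    return reply


conversation = start_conversation()
```

Calling `chat_turn(model, tokenizer, conversation, "...")` repeatedly with the same `conversation` list would then keep the full history in context for each new reply.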