prithivMLmods committed (verified) · Commit e164953 · Parent(s): 5d0e8c7

Update README.md

Files changed (1): README.md (+96 -1)
tags:
  - qwen2.5
  - cot
  - lcot
---

# **Taurus-Opus-7B-Elite**

Taurus-Opus-7B-Elite is based on a 7B-parameter architecture inspired by Qwen 2.5 and optimized to deliver strong reasoning, contextual understanding, and problem-solving capabilities. It has been fine-tuned with a focus on chain-of-thought (CoT) reasoning, using a specialized dataset for tasks that require logical deduction and multi-step problem-solving. Despite its reduced parameter count, Taurus-Opus-7B-Elite remains efficient and versatile, tailored for applications such as instruction following, structured data processing, and multilingual tasks.

# **Key Improvements**

1. **Compact Yet Powerful**:
Despite being a 7B-parameter model, Taurus-Opus demonstrates reasoning and understanding capabilities comparable to larger models, thanks to advanced optimization techniques.

2. **Enhanced Efficiency**:
Optimized for faster inference and lower computational cost, making it suitable for deployment on devices with limited resources.

3. **Instruction Following**:
Improved ability to understand and execute complex instructions while generating long texts (up to 4K tokens); a streaming sketch follows the Quickstart below.

4. **Structured Data Processing**:
Excels at analyzing tables, JSON, and other structured data formats, producing accurate, well-structured outputs; a short structured-output sketch follows this list.

5. **Multilingual Proficiency**:
Supports 20+ languages, maintaining accuracy and fluency in widely used languages such as English, Chinese, Spanish, and French.

6. **Streamlined Long-Context Support**:
Supports contexts of up to 64K tokens, providing robust contextual understanding for long-chain reasoning tasks.
41
+ # **Quickstart with transformers**
42
+
43
+ ```python
44
+ from transformers import AutoModelForCausalLM, AutoTokenizer
45
+
46
+ model_name = "prithivMLmods/Taurus-Opus-7B-Elite"
47
+
48
+ model = AutoModelForCausalLM.from_pretrained(
49
+ model_name,
50
+ torch_dtype="auto",
51
+ device_map="auto"
52
+ )
53
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
54
+
55
+ prompt = "Explain why reasoning is critical in solving complex problems."
56
+ messages = [
57
+ {"role": "system", "content": "You are Taurus, an advanced AI assistant optimized for reasoning and problem-solving."},
58
+ {"role": "user", "content": prompt}
59
+ ]
60
+ text = tokenizer.apply_chat_template(
61
+ messages,
62
+ tokenize=False,
63
+ add_generation_prompt=True
64
+ )
65
+ model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
66
+
67
+ generated_ids = model.generate(
68
+ **model_inputs,
69
+ max_new_tokens=256
70
+ )
71
+ generated_ids = [
72
+ output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
73
+ ]
74
+
75
+ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
76
+ ```
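
For the longer outputs mentioned under Key Improvements, the Quickstart can be adapted to stream tokens as they are produced instead of decoding everything at the end. A minimal sketch, reusing `model` and `tokenizer` from the snippet above; the prompt and token budget are illustrative.

```python
# Streaming sketch: reuses model and tokenizer from the Quickstart; prompt and limits are illustrative.
from transformers import TextStreamer

long_messages = [
    {"role": "system", "content": "You are Taurus, an advanced AI assistant optimized for reasoning and problem-solving."},
    {"role": "user", "content": "Write a detailed, step-by-step plan for migrating a small web service to a new database."}
]
long_text = tokenizer.apply_chat_template(long_messages, tokenize=False, add_generation_prompt=True)
long_inputs = tokenizer([long_text], return_tensors="pt").to(model.device)

# TextStreamer prints tokens to stdout as they are generated, skipping the echoed prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(**long_inputs, streamer=streamer, max_new_tokens=2048)
```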

# **Intended Use**

1. **Reasoning and Contextual Understanding**:
Tailored for tasks that require logical deduction and contextual analysis, making it suitable for educational and professional use cases.

2. **Mathematical Reasoning**:
Adept at solving mathematical problems and step-by-step calculations, making it well suited to STEM applications; a prompt sketch follows this list.

3. **Code Assistance**:
Supports generating, debugging, and optimizing code in a variety of programming languages.

4. **Multilingual Tasks**:
Enables global applications, including multilingual content generation, translation, and conversational AI.

5. **Content Generation**:
Generates high-quality long-form text for reports, articles, and other professional documents.
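
As a concrete example of the mathematical-reasoning use case, the Quickstart pattern can be reused with a math problem and a system message that asks for step-by-step working. A minimal sketch, reusing `model` and `tokenizer` from the Quickstart; the problem itself is illustrative.

```python
# Math-reasoning sketch: reuses model and tokenizer from the Quickstart; the problem is illustrative.
math_messages = [
    {"role": "system", "content": "You are Taurus. Reason step by step, then state the final answer on its own line."},
    {"role": "user", "content": "A train travels 180 km in 2.5 hours. At the same speed, how long does a 288 km trip take?"}
]
math_text = tokenizer.apply_chat_template(math_messages, tokenize=False, add_generation_prompt=True)
math_inputs = tokenizer([math_text], return_tensors="pt").to(model.device)

math_ids = model.generate(**math_inputs, max_new_tokens=512)
# Keep only the newly generated tokens, as in the Quickstart.
math_ids = [out[len(inp):] for inp, out in zip(math_inputs.input_ids, math_ids)]
print(tokenizer.batch_decode(math_ids, skip_special_tokens=True)[0])  # 180 / 2.5 = 72 km/h, so 288 / 72 = 4 hours
```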

# **Limitations**

1. **Reduced Parameter Count**:
While efficient, it may not match the depth of understanding of larger counterparts (such as 14B-parameter models) on some complex tasks.

2. **Hardware Requirements**:
Although lighter than larger models, it still requires a GPU or a high-performance CPU for responsive inference; a 4-bit loading sketch follows this list.

3. **Multilingual Accuracy**:
Performance may vary for lower-resource languages, with occasional inaccuracies in nuanced translations.

4. **Error Propagation in Long Outputs**:
As with larger models, errors early in a long generation can reduce the coherence of the final text.

5. **Prompt Sensitivity**:
Performs best with well-structured prompts, so some familiarity with prompt design helps.
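
To soften the hardware requirement noted above, the model can be loaded in 4-bit precision. A minimal sketch, assuming the optional `bitsandbytes` package is installed and a CUDA-capable GPU is available; the quantization settings shown are common defaults and can be adjusted.

```python
# 4-bit loading sketch; assumes bitsandbytes is installed and a CUDA-capable GPU is available.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "prithivMLmods/Taurus-Opus-7B-Elite"

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model_4bit = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=quant_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
# From here on, generation works exactly as in the Quickstart example.
```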