---
base_model: unsloth/mistral-nemo-instruct-2407-bnb-4bit
license: apache-2.0
tags:
- Mistral_Star
- Mistral_Quiet
- Mistral
- Mixtral
- Question-Answer
- Token-Classification
- Sequence-Classification
- SpydazWeb-AI
- chemistry
- biology
- legal
- code
- climate
- medical
- text-generation-inference
language:
- en
- sw
- ig
- zu
- ca
- es
- pt
- ha
---
# Spydaz WEB AI
## Model Architecture
Mistral Nemo is a transformer model, with the following architecture choices:
- **Layers:** 40
- **Dim:** 5,120
- **Head dim:** 128
- **Hidden dim:** 14,336
- **Activation Function:** SwiGLU
- **Number of heads:** 32
- **Number of kv-heads:** 8 (GQA)
- **Vocabulary size:** 2**17 ~= 128k
- **Rotary embeddings (theta = 1M)**
- **Developed by:** LeroyDyer
- **License:** apache-2.0
- **Finetuned from model :** unsloth/mistral-nemo-instruct-2407-bnb-4bit
<img src="https://cdn-avatars.huggingface.co/v1/production/uploads/65d883893a52cd9bcd8ab7cf/tRsCJlHNZo1D02kBTmfy9.jpeg" width="300"/>
https://github.com/spydaz
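
For reference, the architecture listed above maps roughly onto a Hugging Face `MistralConfig` as sketched below; this is an illustrative reconstruction from the bullet list, not the shipped configuration file.

```python
from transformers import MistralConfig

# Illustrative reconstruction of the architecture listed above (not the shipped config).
config = MistralConfig(
    num_hidden_layers=40,       # Layers
    hidden_size=5120,           # Dim
    head_dim=128,               # Head dim (requires a recent transformers release)
    intermediate_size=14336,    # Hidden dim of the MLP
    hidden_act="silu",          # SwiGLU activation
    num_attention_heads=32,     # Number of heads
    num_key_value_heads=8,      # Number of kv-heads (GQA)
    vocab_size=131072,          # 2**17 ~= 128k
    rope_theta=1_000_000.0,     # Rotary embeddings (theta = 1M)
)
print(config)
```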
# Introduction:
## STAR REASONERS!
This provides a platform for the model to communicate pre-response, so an internal objective can be set, i.e. adding an extra planning stage to the model, improving its focus and output.
The thought head can be charged with a thought or methodology, such as stating to take a step-by-step approach to the problem, or to build an object-oriented model first and consider the use cases before creating an output.
So each thought head can be dedicated to a specific purpose, such as planning, artifact generation or use-case design, or even deciding which methodology should be applied before planning the potential solve route for the response.
Another head could also be dedicated to retrieving content from the self based on the query, which can also be used in the pre-generation stages.
All pre-reasoners can be seen to be Self Guiding! Essentially removing the requirement to give the model a system prompt, instead aligning the heads to thought pathways!
These chains produce data which can be considered thoughts, and can further be displayed by framing these thoughts with thought tokens, even allowing for editor's comments giving key guidance to the model during training.
These thoughts will be used in future generations, assisting the model as well as displaying explanatory information in the output.
These tokens can be displayed or withheld, also a setting in the model!
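
As a rough illustration of framing and withholding thoughts, here is a minimal sketch. The token strings `<|startthought|>` / `<|endthought|>` and the `show_thoughts` flag are assumptions for illustration and may not match the exact tokens or settings used by this model.

```python
import re

# Hypothetical thought-framing tokens (assumed names, may differ in the actual model).
START_THOUGHT = "<|startthought|>"
END_THOUGHT = "<|endthought|>"

def frame_thought(thought: str) -> str:
    """Wrap an internal thought in thought tokens so it can be logged or trained on."""
    return f"{START_THOUGHT}{thought}{END_THOUGHT}"

def render_output(raw: str, show_thoughts: bool = False) -> str:
    """Either keep the framed thoughts in the visible output or strip them."""
    if show_thoughts:
        return raw
    # Remove every <|startthought|>...<|endthought|> span before presenting the answer.
    pattern = re.escape(START_THOUGHT) + r".*?" + re.escape(END_THOUGHT)
    return re.sub(pattern, "", raw, flags=re.DOTALL).strip()

raw_response = frame_thought("Plan: outline the steps, then answer.") + " The answer is 42."
print(render_output(raw_response))                      # thoughts withheld
print(render_output(raw_response, show_thoughts=True))  # thoughts displayed
```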
### Can this be applied in other areas?
Yes! We can use this type of method to allow the model to generate code in another channel or head, potentially creating a head to produce artifacts for every output, or to produce entity lists for every output, framing the outputs in their relative code tags or function-call tags.
These can also be displayed or hidden in the response, but they can also be used in problem-solving tasks internally, which again enables the model to simulate the inputs and outputs of an interpreter!
It may even be prudent to include function execution internal to the model (allowing the model to execute functions in the background before responding)! This would also have to be specified in the config, as auto-execute or not.
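
Purely to illustrate the auto-execute idea, below is a minimal sketch of a config-gated internal function call; the `auto_execute` flag, the tool registry and the JSON call format are all assumptions, not part of the released model.

```python
import json

# Hypothetical tool registry (assumed for illustration only).
TOOLS = {
    "add": lambda a, b: a + b,
}

def maybe_execute(call_text: str, auto_execute: bool = False) -> str:
    """If auto_execute is enabled, run an internal function call framed as JSON
    (e.g. '{"name": "add", "args": [2, 3]}') and return its result as text;
    otherwise return the raw call so the caller can decide what to do with it."""
    if not auto_execute:
        return call_text
    call = json.loads(call_text)
    result = TOOLS[call["name"]](*call["args"])
    return str(result)

print(maybe_execute('{"name": "add", "args": [2, 3]}', auto_execute=True))   # -> 5
print(maybe_execute('{"name": "add", "args": [2, 3]}', auto_execute=False))  # -> raw call text
```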
#### AI AGI?
So yes, we can see we are not far from an AI which can evolve: an advanced general intelligent system (still non-sentient, by the way).
### Conclusion
The reasoner methodology might be seen as the way forwards. Adding internal functionality to the models instead of external connectivity enables faster and seamless model usage, as well as enriched and informed responses, as even outputs could essentially be cleansed and formatted inside the model before being presented to the calling interface.
The takeaway is: are we seeing the decoder/encoder model as simply a function of the intelligence, which in truth needs to be autonomous?
I.e. internal functions and tools as well as disk interaction: an agent must have awareness and control over its environment, with sensors and actuators. As a function-calling model it has actuators, and as it can read directories it has sensors ... it's a start: we can get media in and out, but the model needs to gain its own control over input and output too!
Fine-tuning: again, the issue of fine-tuning: the discussion above explains the requirement to control the environment from within the model (with constraints). Does this eliminate the need to fine-tune a model?
In fact it should, as this gives transparency to the growth of the model, and if the model fine-tuned itself we would be in danger of a model evolving!
Hence an AGI!
# LOAD MODEL
```
! git clone https://github.com/huggingface/transformers.git
## copy modeling_mistral.py and configuration.py into the cloned repo at src/transformers/models/mistral/ and overwrite the existing files first:
## THEN :
!pip install ./transformers
```
Then restart the environment: the model can then load without trust_remote_code and WILL work FINE!
It can even be trained: hence the 4-bit optimised version.
```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("LeroyDyer/_Spydaz_Web_AI_MistralStar_V2", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("LeroyDyer/_Spydaz_Web_AI_MistralStar_V2", trust_remote_code=True)
model.tokenizer = tokenizer
```
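
Continuing from the block above, a minimal generation sketch using the standard `transformers` generate API (the prompt and sampling settings are illustrative only):

```python
# Minimal generation sketch, continuing from the loading code above.
prompt = "Explain, step by step, why the sky appears blue."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```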