Kanonenbombe committed
Commit f717efc · verified · 1 Parent(s): 1876617

Update README.md

Files changed (1)
  1. README.md +36 -40
README.md CHANGED
@@ -1,67 +1,63 @@
- ---
- library_name: transformers
- tags:
- - generated_from_trainer
- model-index:
- - name: llama3.2-1B-Function-calling
-   results: []
- datasets:
- - Salesforce/xlam-function-calling-60k
- language:
- - en
- base_model:
- - meta-llama/Llama-3.2-1B
- ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

  # llama3.2-1B-Function-calling

- This model was trained from scratch on an unknown dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.1491

  ## Model description

- More information needed

  ## Intended uses & limitations

- More information needed

  ## Training and evaluation data

- More information needed

  ## Training procedure

  ### Training hyperparameters

- The following hyperparameters were used during training:
- - learning_rate: 2e-05
- - train_batch_size: 1
- - eval_batch_size: 1
- - seed: 42
- - gradient_accumulation_steps: 32
- - total_train_batch_size: 32
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
- - num_epochs: 3
  - mixed_precision_training: Native AMP

  ### Training results

- | Training Loss | Epoch | Step | Validation Loss |
- |:-------------:|:------:|:----:|:---------------:|
- | 0.3083 | 0.9997 | 1687 | 0.3622 |
- | 0.202 | 2.0 | 3375 | 0.2844 |
  | 0.1655 | 2.9997 | 5061 | 0.1491 |

- ### Framework versions

- - Transformers 4.45.2
- - Pytorch 2.4.1+cu121
- - Datasets 3.0.1
  - Tokenizers 0.20.0
 
+ library_name: transformers
+ tags:
+ - generated_from_trainer
+ model-index:
+ - name: llama3.2-1B-Function-calling
+   results: []
+ datasets:
+ - Salesforce/xlam-function-calling-60k
+ language:
+ - en
+ base_model:
+ - meta-llama/Llama-3.2-1B

+ ---

  # llama3.2-1B-Function-calling

+ **⚠️ Important: This model is still under development and has not been fully fine-tuned. It is not yet suitable for production use and should be treated as a work in progress. The results and performance metrics shared here are preliminary and subject to change.**

  ## Model description

+ This model was fine-tuned from meta-llama/Llama-3.2-1B on the Salesforce/xlam-function-calling-60k dataset and is intended for function-calling tasks. As it is still at an early stage, further training is required to optimize its performance.

  ## Intended uses & limitations

+ This model is not yet fully trained or optimized for any specific task. It is intended to handle function-calling tasks but should not be used in production until more comprehensive fine-tuning and evaluation are completed.

  ## Training and evaluation data

+ The model was trained on the Salesforce/xlam-function-calling-60k dataset; more detail on preprocessing and the evaluation split is still needed, and additional testing is required to confirm its capabilities.

  ## Training procedure

  ### Training hyperparameters

+ The following hyperparameters were used during training:
+ - learning_rate: 2e-05
+ - train_batch_size: 1
+ - eval_batch_size: 1
+ - seed: 42
+ - gradient_accumulation_steps: 32
+ - total_train_batch_size: 32
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 3
  - mixed_precision_training: Native AMP

  ### Training results

+ | Training Loss | Epoch | Step | Validation Loss |
+ |:-------------:|:------:|:----:|:---------------:|
+ | 0.3083 | 0.9997 | 1687 | 0.3622 |
+ | 0.202 | 2.0 | 3375 | 0.2844 |
  | 0.1655 | 2.9997 | 5061 | 0.1491 |

+ These results are preliminary, and further training will be necessary to refine the model's performance.

+ ## Framework versions

+ - Transformers 4.45.2
+ - Pytorch 2.4.1+cu121
+ - Datasets 3.0.1
  - Tokenizers 0.20.0
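
For readers who want to reproduce a comparable setup, the hyperparameters listed in the card map onto `transformers.TrainingArguments` roughly as in the sketch below. This is a minimal sketch, not the author's actual training script: the `output_dir` value and the choice of `fp16` are assumptions (the card only says "Native AMP"), and the Adam betas/epsilon shown in the card are the library's default optimizer settings, so they need no explicit arguments.

```python
# Minimal sketch mapping the card's hyperparameters onto TrainingArguments.
# output_dir and fp16 are assumptions; the card only states "Native AMP".
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama3.2-1B-Function-calling",  # assumed output path
    learning_rate=2e-5,
    per_device_train_batch_size=1,   # train_batch_size: 1
    per_device_eval_batch_size=1,    # eval_batch_size: 1
    seed=42,
    gradient_accumulation_steps=32,  # total_train_batch_size = 1 * 32 = 32
    lr_scheduler_type="linear",
    num_train_epochs=3,
    fp16=True,                       # "Native AMP"; fp16 vs. bf16 is assumed
)
# Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the default optimizer
# configuration in Transformers 4.45, so it is not set explicitly here.
```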
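Since the card warns the checkpoint is a work in progress, a local evaluation-only load is the most that seems appropriate. The sketch below is hypothetical: the repository id is inferred from the commit author and model name, and the prompt is purely illustrative, since the card does not document the expected function-calling prompt format.

```python
# Hypothetical evaluation-only loading sketch. The repo id is an assumption
# inferred from the commit author and model name; the prompt format is
# illustrative only and not documented by the card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Kanonenbombe/llama3.2-1B-Function-calling"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "You have access to a weather API. Get the forecast for Paris."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```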