---
license:
- apache-2.0
- cc-by-nc-4.0
datasets: pszemraj/fleece2instructions-codealpaca
tags:
- generated_from_trainer
- instruct
- instructions
- code
metrics:
- rouge
language:
- en
widget:
- text: |
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification
    checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(checkpoint)
    sequences = ["I've been waiting for a HuggingFace course my whole life.",
                 "So have I!"]
    tokens = tokenizer(sequences, padding=True, truncation=True,
                       return_tensors="pt")
    output = model(**tokens)
  example_title: Example One
- text: |
    import torch
    from tqdm.auto import tqdm
    device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
    model.to(device)
    progress_bar = tqdm(range(num_training_steps))
    model.train()
    for epoch in range(num_epochs):
        for batch in train_dataloader:
            batch = {k: v.to(device) for k, v in batch.items()}
            outputs = model(**batch)
            loss = outputs.loss
            loss.backward()
            optimizer.step()
            lr_scheduler.step()
            optimizer.zero_grad()
            progress_bar.update(1)
  example_title: Example Two
- text: |
    import evaluate
    metric = evaluate.load("glue", "mrpc")
    model.eval()
    for batch in eval_dataloader:
        batch = {k: v.to(device) for k, v in batch.items()}
        with torch.no_grad():
            outputs = model(**batch)
        logits = outputs.logits
        predictions = torch.argmax(logits, dim=-1)
        metric.add_batch(predictions=predictions, references=batch["labels"])
    metric.compute()
  example_title: Example Three
- text: |
    git lfs install
    huggingface-cli lfs-enable-largefiles .
    git lfs track "*.bin"
    git add .
    git commit -a -m "add fp32 chkpt"
    git push
  example_title: Example Four
- text: |
    export interface DocumentParams {
      pageContent: string;
      // eslint-disable-next-line @typescript-eslint/no-explicit-any
      metadata: Record<string, any>;
    }
    /**
     * Interface for interacting with a document.
     */
    export class Document implements DocumentParams {
      pageContent: string;
      // eslint-disable-next-line @typescript-eslint/no-explicit-any
      metadata: Record<string, any>;
      constructor(fields?: Partial<DocumentParams>) {
        this.pageContent = fields?.pageContent ?? this.pageContent;
        this.metadata = fields?.metadata ?? {};
      }
    }
  example_title: Example Five
inference:
  parameters:
    max_length: 96
    num_beams: 4
---
# bart-base-code-instructiongen
Use this text2text model to find out what LLM instructions might be able to generate an arbitrary piece of code!
This model is a fine-tuned version of [facebook/bart-base](https://huggingface.co/facebook/bart-base) on the `pszemraj/fleece2instructions-codealpaca` dataset.
It achieves the following results on the evaluation set:
- Loss: 1.0136
- Rouge1: 59.9513
- Rouge2: 33.9118
- Rougel: 55.7815
- Rougelsum: 56.9064
- Gen Len: 29.7146
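
To try the model locally, something like the following sketch should work. It assumes the checkpoint is published under the repo id `pszemraj/bart-base-code-instructiongen` (inferred from the model name and the linked bart-large variant, not stated on this card) and that `transformers` and `torch` are installed. The generation settings mirror the `inference` parameters in the card metadata; `generate_instruction` is a hypothetical helper name.

```python
# Assumption: the repo id below is inferred from the card title, not confirmed by it.
MODEL_ID = "pszemraj/bart-base-code-instructiongen"

# Same values as the `inference.parameters` block in the card metadata.
GEN_KWARGS = {"max_length": 96, "num_beams": 4}


def generate_instruction(code: str, model_id: str = MODEL_ID) -> str:
    """Return a guessed instruction for `code` (hypothetical helper)."""
    # Imported lazily so the module can be inspected without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Truncate long snippets to the tokenizer's maximum input length.
    inputs = tokenizer(code, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, **GEN_KWARGS)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the checkpoint on first use):
# print(generate_instruction('git lfs install\ngit lfs track "*.bin"\ngit add .'))
```

The model and tokenizer are reloaded on every call to keep the sketch short; for batch use, load them once and reuse them.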
## Intended uses & limitations
🚨 **note:** as the authors elected to release the [original dataset](https://github.com/sahil280114/codealpaca) under `cc-by-nc`, the license carries over to this model, which **cannot be used for commercial activity**.
> This is just a `base` size model, which does a decent job for its size, but is not perfect. For better quality instructions, check out [bart-large](https://huggingface.co/pszemraj/bart-large-code-instructiongen) or fine-tune your own larger model on the dataset :)
Intended use: Research on domain adaptation and/or other improvements to LLMs by extending instruction:text data pairs.
## Training and evaluation data
Refer to the linked dataset card for `pszemraj/fleece2instructions-codealpaca` or the [original dataset](https://github.com/sahil280114/codealpaca) repo.
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 8e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 16
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.02
- num_epochs: 3.0
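
As a rough illustration of how these settings interact, the sketch below computes the effective batch size, the warmup length, and the learning rate at a given optimizer step under a linear-warmup, cosine-decay schedule. It makes several assumptions: the total step count of 843 is taken from the last row of the results table, the device count defaults to 1 because 4 × 16 already equals the reported total of 64, and the schedule formula follows the usual half-cosine shape rather than anything confirmed by this card. All function names are hypothetical.

```python
import math

LEARNING_RATE = 8e-5
PER_DEVICE_BATCH = 4
GRAD_ACCUM_STEPS = 16
WARMUP_RATIO = 0.02
TOTAL_STEPS = 843  # assumption: final step reported in the training results


def effective_batch_size(num_devices: int = 1) -> int:
    """Sequences contributing to each optimizer update."""
    return PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * num_devices


def warmup_steps(total_steps: int = TOTAL_STEPS) -> int:
    """Number of warmup steps implied by the warmup ratio (rounded up)."""
    return math.ceil(total_steps * WARMUP_RATIO)


def lr_at(step: int, total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at `step`: linear warmup, then half-cosine decay to zero."""
    warmup = warmup_steps(total_steps)
    if step < warmup:
        return LEARNING_RATE * step / max(1, warmup)
    progress = (step - warmup) / max(1, total_steps - warmup)
    return LEARNING_RATE * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))
```

Under these assumptions, warmup covers the first 17 steps, the rate peaks at 8e-05, and it decays to zero by step 843.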
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 1.1165 | 1.0 | 281 | 1.1090 | 57.9239 | 31.9259 | 53.8737 | 54.9811 | 28.2924 |
| 1.0763 | 2.0 | 563 | 1.0267 | 59.9605 | 34.0298 | 55.7523 | 56.8021 | 29.6966 |
| 0.9595 | 2.99 | 843 | 1.0136 | 59.9513 | 33.9118 | 55.7815 | 56.9064 | 29.7146 |
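
For intuition about the Rouge1 column, here is a deliberately simplified sketch of a unigram-overlap F1 score. It assumes the reported numbers are F-measures scaled by 100, which is common for this kind of evaluation but not stated on this card. It splits on whitespace and lowercases, so it will not reproduce the values above, which come from a full ROUGE implementation with its own tokenization. `rouge1_f1` is a hypothetical helper name.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1: clipped unigram overlap on whitespace tokens."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Counter intersection keeps the minimum count per token (clipped matches).
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


score = rouge1_f1(
    "write a function to add two numbers",
    "create a function to add two numbers",
)
print(round(100 * score, 2))
```

Six of the seven tokens match in both directions here, so precision and recall are both 6/7.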