A Finetuned Bloom 1b1 Model for Sequence Classification

The model was developed as a personal learning exercise in fine-tuning a pretrained language model for text classification and applying it to real-world data from the internet for sentiment analysis.

It has been generated using this raw template.

Model Details

The model achieved the following scores on the evaluation set during fine-tuning:

[Image: Screenshot 2024-01-03 at 16.08.46.png (evaluation scores)]

Here is the train/eval/test split:

DatasetDict({
    train: Dataset({
        features: ['review', 'sentiment'],
        num_rows: 36000
    })
    test: Dataset({
        features: ['review', 'sentiment'],
        num_rows: 5000
    })
    eval: Dataset({
        features: ['review', 'sentiment'],
        num_rows: 9000
    })
})
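
The split sizes above add up to 50,000 reviews (72% train, 18% eval, 10% test). The card does not say how the split was produced, so the following is only a sketch of one way to get the same sizes by shuffling row indices with NumPy. The `split_indices` helper and the seed are illustrative assumptions, not the original procedure.

```python
import numpy as np

def split_indices(n_rows, n_test, n_eval, seed=42):
    """Shuffle row indices and cut them into train/test/eval index arrays."""
    rng = np.random.default_rng(seed)  # the seed is an arbitrary choice
    order = rng.permutation(n_rows)
    test = order[:n_test]
    eval_ = order[n_test:n_test + n_eval]
    train = order[n_test + n_eval:]    # everything that is left
    return {"train": train, "test": test, "eval": eval_}

splits = split_indices(n_rows=50_000, n_test=5_000, n_eval=9_000)
sizes = {name: len(idx) for name, idx in splits.items()}
print(sizes)
```

Each index array can then be used to select rows from the full table, so no review ends up in more than one split.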

Model Description

  • Developed by: Snoop088
  • Model type: Text Classification / Sequence Classification
  • Language(s) (NLP): English
  • License: Apache 2.0
  • Finetuned from model: bigscience/bloom-1b1

Model Sources [optional]

Uses

The model is intended to be used for Text Classification.

Direct Use

Example script for using the model. Please note that this is a PEFT adapter on top of the BLOOM 1b1 model, so the peft package must be installed:

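import numpy as np
import pandas as pd
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

DEVICE = "cuda:0" if torch.cuda.is_available() else "cpu"
model_name = 'snoop088/imdb_tuned-bloom1b1-sentiment-classifier'
loaded_model = AutoModelForSequenceClassification.from_pretrained(model_name, 
                                                                  trust_remote_code=True, 
                                                                  num_labels=2,
                                                                  device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token

# CSV file with a "review" column holding the texts to classify
my_set = pd.read_csv("./data/df_manual.csv")

inputs = tokenizer(list(my_set["review"]), truncation=True, padding="max_length", max_length=256,  return_tensors="pt").to(DEVICE)
# Inference only: disable gradient tracking so the logits can be converted to NumPy
with torch.no_grad():
    outputs = loaded_model(**inputs)
# Predicted class per review: 1 = positive, 0 = negative
outcome = np.argmax(outputs.logits.cpu().numpy(), axis=-1)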
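
The script above yields only class ids. As a small sketch of a possible next step, the logits can be turned into readable labels with a confidence score via a softmax. The `logits_to_sentiment` helper is hypothetical, the logits below are made up, and the mapping 1 = positive, 0 = negative is taken from the preprocessing code in this card.

```python
import numpy as np

def logits_to_sentiment(logits):
    """Return (labels, confidence) for logits of shape (n_reviews, 2)."""
    logits = np.asarray(logits, dtype=np.float64)
    # Numerically stable softmax over the class dimension
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    probs = exp / exp.sum(axis=-1, keepdims=True)
    class_ids = probs.argmax(axis=-1)
    # Assumed mapping from preprocessing: 1 = positive, 0 = negative
    labels = ["positive" if i == 1 else "negative" for i in class_ids]
    return labels, probs.max(axis=-1)

# Made-up logits standing in for outputs.logits.cpu().numpy()
labels, confidence = logits_to_sentiment([[2.0, -1.0], [-0.5, 1.5]])
print(labels)  # ['negative', 'positive']
```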

Downstream Use [optional]

The model is intended for sentiment analysis on datasets similar to the IMDB reviews dataset. In my opinion, it should also work well on product reviews.

Out-of-Scope Use

[More Information Needed]

Bias, Risks, and Limitations

[More Information Needed]

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]

Training Details

Training Data

The model was trained on the IMDB dataset available on the Hub:

imdb

Training Procedure

from transformers import TrainingArguments

training_arguments = TrainingArguments(
    output_dir="your_tuned_model_name",
    save_strategy="epoch",
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=4,
    optim="adamw_torch",
    evaluation_strategy="steps",
    logging_steps=5,
    learning_rate=1e-5,
    max_grad_norm=0.3,
    eval_steps=0.2,  # a float below 1 is a fraction of the total training steps
    num_train_epochs=2,
    warmup_ratio=0.1,
    # group_by_length=True,
    fp16=False,
    weight_decay=0.001,
    lr_scheduler_type="constant",
)
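
To make these arguments more concrete, the sketch below works out the effective batch size and the number of optimizer steps they imply for the 36,000-row train split. It assumes a single GPU, which the card does not state explicitly. To my understanding, the "constant" scheduler keeps the learning rate fixed for the whole run, so `warmup_ratio` has no effect here; warmup would need "constant_with_warmup".

```python
import math

train_rows = 36_000        # train split size from this card
per_device_batch = 4
grad_accum = 4
epochs = 2
eval_fraction = 0.2        # eval_steps given as a fraction of total steps
n_devices = 1              # assumption: training on a single GPU

effective_batch = per_device_batch * grad_accum * n_devices
steps_per_epoch = math.ceil(train_rows / effective_batch)
total_steps = steps_per_epoch * epochs
eval_every = math.ceil(total_steps * eval_fraction)

print(effective_batch, steps_per_epoch, total_steps, eval_every)
# 16 2250 4500 900
```

Under these assumptions the model is evaluated five times over the two epochs.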

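from peft import LoraConfig, get_peft_model

# `model` is the bigscience/bloom-1b1 sequence classification model loaded beforehand
peft_model = get_peft_model(model, LoraConfig(
                            task_type="SEQ_CLS",
                            r=16,
                            lora_alpha=16,
                            target_modules=[
                                'query_key_value',
                                'dense'
                            ],
                            bias="none",
                            lora_dropout=0.05, # Conventional
                        ))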

LoRA results in: trainable params: 3,542,016 || all params: 1,068,859,392 || trainable%: 0.3313827830405592
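
The trainable parameter count can be reproduced by hand. The sketch below rests on several assumptions that are not stated in this card: bloom-1b1 has a hidden size of 1536 and 24 transformer layers, the 'dense' target matches only the attention output projection in each layer, and the classification head is trained in full because of the SEQ_CLS task type.

```python
# Assumed bloom-1b1 dimensions (check the model config to confirm)
hidden_size = 1536
n_layers = 24
r = 16            # LoRA rank from the config above
num_labels = 2

def lora_params(in_features, out_features, rank):
    """LoRA adds A (rank x in) and B (out x rank) next to a frozen linear layer."""
    return rank * (in_features + out_features)

qkv = lora_params(hidden_size, 3 * hidden_size, r)  # 'query_key_value'
dense = lora_params(hidden_size, hidden_size, r)    # attention output 'dense'
score_head = hidden_size * num_labels               # classification head, no bias

trainable = n_layers * (qkv + dense) + score_head
share = 100 * trainable / 1_068_859_392             # total taken from the line above
print(f"{trainable:,}")  # 3,542,016
```

The result matches the reported figure, which suggests the assumptions hold for this setup.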

Preprocessing [optional]

Simple preprocessing: tokenization, label encoding, and dynamic padding with DataCollatorWithPadding:

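from transformers import DataCollatorWithPadding

def process_data(example):
    # Called with batched=True, so example["sentiment"] is a list of strings
    item = tokenizer(example["review"], truncation=True, max_length=320) # see if this is OK for dyn padding
    item["labels"] = [1 if sent == 'positive' else 0 for sent in example["sentiment"]]
    return item

# `dataset` is an assumed name for the DatasetDict shown above; the original snippet omits this step
tokenised_data = dataset.map(process_data, batched=True)
tokenised_data = tokenised_data.remove_columns(["review", "sentiment"])
# Pads each batch to its longest sequence at collation time
data_collator = DataCollatorWithPadding(tokenizer=tokenizer)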
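
Because the tokenizer call above truncates but does not pad, padding happens per batch in the collator. The following pure-Python sketch illustrates the idea of dynamic padding only; it is not the library's implementation, and `pad_batch`, the token ids, and the pad id are made up for illustration.

```python
def pad_batch(batch, pad_id, side="right"):
    """Pad every sequence in `batch` to the length of the longest one."""
    longest = max(len(ids) for ids in batch)
    input_ids, attention_mask = [], []
    for ids in batch:
        padding = [pad_id] * (longest - len(ids))
        mask_pad = [0] * len(padding)       # padded positions are masked out
        if side == "left":
            input_ids.append(padding + ids)
            attention_mask.append(mask_pad + [1] * len(ids))
        else:
            input_ids.append(ids + padding)
            attention_mask.append([1] * len(ids) + mask_pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}

# Made-up token ids; 0 stands in for the pad token id
batch = pad_batch([[5, 6, 7], [8]], pad_id=0)
print(batch["input_ids"])       # [[5, 6, 7], [8, 0, 0]]
print(batch["attention_mask"])  # [[1, 1, 1], [1, 0, 0]]
```

Padding to the longest sequence in each batch, rather than to a fixed max_length, avoids spending compute on padding tokens for short reviews.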

Training Hyperparameters

  • Training regime: [More Information Needed]

Speeds, Sizes, Times [optional]

[More Information Needed]

Evaluation

Evaluation function:

import evaluate
import numpy as np

# All metrics are already predefined in the HF `evaluate` package; load them once
precision_metric = evaluate.load("precision")
recall_metric = evaluate.load("recall")
f1_metric = evaluate.load("f1")
accuracy_metric = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred # eval_pred is the tuple of predictions and labels passed in by the Trainer
    predictions = np.argmax(logits, axis=-1)
    precision = precision_metric.compute(predictions=predictions, references=labels)["precision"]
    recall = recall_metric.compute(predictions=predictions, references=labels)["recall"]
    f1 = f1_metric.compute(predictions=predictions, references=labels)["f1"]
    accuracy = accuracy_metric.compute(predictions=predictions, references=labels)["accuracy"]
    # The trainer is expecting a dictionary where the keys are the metrics names and the values are the scores. 
    return {"precision": precision, "recall": recall, "f1-score": f1, 'accuracy': accuracy}
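
For readers who want to see what these four metrics measure, here is a minimal NumPy sketch that computes them by hand for binary labels, treating class 1 (positive) as the positive class. The `binary_metrics` helper and the toy predictions are illustrative and are not results from this model.

```python
import numpy as np

def binary_metrics(predictions, labels):
    """Precision, recall, F1 and accuracy with class 1 as the positive class."""
    predictions = np.asarray(predictions)
    labels = np.asarray(labels)
    tp = int(np.sum((predictions == 1) & (labels == 1)))  # true positives
    fp = int(np.sum((predictions == 1) & (labels == 0)))  # false positives
    fn = int(np.sum((predictions == 0) & (labels == 1)))  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = float(np.mean(predictions == labels))
    return {"precision": precision, "recall": recall, "f1-score": f1, "accuracy": accuracy}

# Toy data: 2 true positives, 1 false positive, 1 false negative, 1 true negative
metrics = binary_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
print(metrics)
```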

Testing Data, Factors & Metrics

Testing Data

[More Information Needed]

Factors

[More Information Needed]

Metrics

[More Information Needed]

Results

[More Information Needed]

Summary

Model Examination [optional]

[More Information Needed]

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: [More Information Needed]
  • Hours used: [More Information Needed]
  • Cloud Provider: [More Information Needed]
  • Compute Region: [More Information Needed]
  • Carbon Emitted: [More Information Needed]

Technical Specifications [optional]

Model Architecture and Objective

[More Information Needed]

Compute Infrastructure

[More Information Needed]

Hardware

  • CPU: 13th Gen Intel(R) Core(TM) i9-13900K
  • GPU: NVIDIA RTX 4090, 24 GB
  • Memory: 64 GB

Software

  • python 3.11.6
  • transformers 4.36.2
  • torch 2.1.2
  • peft 0.7.1
  • numpy 1.26.2
  • datasets 2.16.0

Citation [optional]

BibTeX:

[More Information Needed]

APA:

[More Information Needed]

Glossary [optional]

[More Information Needed]

More Information [optional]

[More Information Needed]

Model Card Authors [optional]

[More Information Needed]

Model Card Contact

[More Information Needed]
