Bitext-Jamba-1.5-Mini-Banking-Customer-Support
Model Description
This model is version of ai21labs/AI21-Jamba-1.5-Mini fine-tuned on the Bitext Banking Customer Support Dataset dataset, which is specifically tailored for the Banking domain. It is optimized to answer questions and assist users with various banking transactions. It has been trained using hybrid synthetic data generated using our NLP/NLG technology and our automated Data Labeling (DAL) tools.
The goal of this model is to show that a generic verticalized model makes customization for a final use case much easier. For example, if you are "ACME Bank", you can create your own customized model by using this fine-tuned model and doing an additional fine-tuning using a small amount of your own data. An overview of this approach can be found at: From General-Purpose LLMs to Verticalized Enterprise Models
Intended Use
- Recommended applications: This model is designed to be used as the first step in Bitext’s two-step approach to LLM fine-tuning for the creation of chatbots, virtual assistants and copilots for the Banking domain, providing customers with fast and accurate answers about their banking needs.
- Out-of-scope: This model is not suited for non-banking related questions and should not be used for providing health, legal, or critical safety advice.
Training Data
The model was fine-tuned on a dataset comprising various banking-related intents, including transactions like balance checks, money transfers, loan applications, and more, totaling 89 intents each represented by approximately 1000 examples. This comprehensive training helps the model address a broad spectrum of banking-related questions effectively. The dataset follows the same structured approach as our dataset published on Hugging Face as bitext/Bitext-customer-support-llm-chatbot-training-dataset, but with a focus on banking.
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 3
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
0.7424 | 0.9983 | 299 | 0.7416 |
0.7224 | 2.0 | 599 | 0.7293 |
0.7153 | 2.9950 | 897 | 0.7288 |
Framework versions
- PEFT 0.13.2
- Transformers 4.45.0.dev0
- Pytorch 2.1.0+cu118
- Datasets 3.1.0
- Tokenizers 0.19.1
Limitations and Bias
- The model is trained for banking-specific contexts but may underperform in unrelated areas.
- Potential biases in the training data could affect the neutrality of the responses; users are encouraged to evaluate responses critically.
Ethical Considerations
It is important to use this technology thoughtfully, ensuring it does not substitute for human judgment where necessary, especially in sensitive financial situations.
Acknowledgments
This model was developed and trained by Bitext using proprietary data and technology.
License
This model, "Bitext-Jamba-1.5-Mini-Banking-Customer-Support", is licensed under the Jamba Open Model License, a permissive license allowing full research use and commercial use under the license terms. If you need to license the model for your needs, talk to us.
- Downloads last month
- 7
Model tree for bitext/Bitext-Jamba-1.5-Mini-Banking-Customer-Support
Base model
ai21labs/AI21-Jamba-1.5-Mini