metadata
license: apache-2.0
tags:
- generated_from_trainer
- financial
- stocks
- sentiment
- sentiment-analysis
- financial-news
widget:
- text: >-
    The company's quarterly earnings surpassed all estimates, indicating
    strong growth.
datasets:
- financial_phrasebank
metrics:
- accuracy
model-index:
- name: AnkitAI/distilbert-base-uncased-financial-news-sentiment-analysis
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: financial_phrasebank
      type: financial_phrasebank
      args: sentences_allagree
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.96688
language:
- en
base_model:
- distilbert/distilbert-base-uncased-finetuned-sst-2-english
pipeline_tag: text-classification
library_name: transformers
DistilBERT Fine-Tuned for Financial Sentiment Analysis
Model Description
This model is a fine-tuned version of distilbert-base-uncased specifically tailored for sentiment analysis in the financial domain. It has been trained on the Financial PhraseBank dataset to classify financial texts into three sentiment categories:
- Negative (label 0)
- Neutral (label 1)
- Positive (label 2)
Model Performance
The model was trained for 5 epochs and evaluated on a held-out test set comprising 20% of the dataset.
Evaluation Metrics
Epoch | Eval Loss | Eval Accuracy |
---|---|---|
1 | 0.2210 | 94.26% |
2 | 0.1997 | 95.81% |
3 | 0.1719 | 96.69% |
4 | 0.2073 | 96.03% |
5 | 0.1941 | 96.69% |
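As a rough sanity check, the percentages in the table can be converted back into whole numbers of correct predictions. The sketch below assumes every accuracy was computed over all 453 test samples listed in the Data section; the helper name `implied_correct` is illustrative and not part of the original training code.

```python
# Reported per-epoch evaluation accuracy (percent), copied from the table above.
reported_accuracy = {1: 94.26, 2: 95.81, 3: 96.69, 4: 96.03, 5: 96.69}

TEST_SIZE = 453  # assumption: the full 20% test split was used for evaluation


def implied_correct(accuracy_percent: float, n: int = TEST_SIZE) -> int:
    """Return the nearest whole number of correct predictions for an accuracy."""
    return round(accuracy_percent / 100 * n)


for epoch, acc in reported_accuracy.items():
    correct = implied_correct(acc)
    errors = TEST_SIZE - correct
    print(f"epoch {epoch}: about {correct}/{TEST_SIZE} correct, {errors} misclassified")
```

Under that assumption, the 96.69% figure corresponds to 438 of 453 sentences, which agrees with the 0.96688 value in the card metadata.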
Training Metrics
- Final Training Loss: 0.0797
- Total Training Time: Approximately 3869 seconds (~1.07 hours)
- Training Samples per Second: 2.34
- Training Steps per Second: 0.147
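The throughput figures above can be reproduced from the other numbers in this card. This is a minimal sketch that assumes a single device, no gradient accumulation, and that the final partial batch of each epoch is kept; none of these are stated in the card.

```python
import math

# Values taken from this card.
train_samples = 1811     # 80% training split
epochs = 5
train_batch_size = 16
train_runtime_s = 3869   # reported total training time in seconds

# Assumption: one optimizer step per batch, last partial batch included.
steps_per_epoch = math.ceil(train_samples / train_batch_size)
total_steps = steps_per_epoch * epochs

samples_per_second = train_samples * epochs / train_runtime_s
steps_per_second = total_steps / train_runtime_s

print(f"steps per epoch: {steps_per_epoch}, total steps: {total_steps}")
print(f"samples/s: {samples_per_second:.2f}, steps/s: {steps_per_second:.3f}")
print(f"runtime: {train_runtime_s / 3600:.2f} hours")
```

With these assumptions, the computed rates round to the reported 2.34 samples per second and 0.147 steps per second.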
Training Procedure
Data
- Dataset: Financial PhraseBank
- Configuration: sentences_allagree (sentences where all annotators agreed on the sentiment)
- Dataset Size: 2264 sentences
- Data Split: 80% training (1811 samples), 20% testing (453 samples)
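The card does not say how the split was produced (which library, or whether it was stratified by label). The following sketch only illustrates one way to obtain an 80/20 split of 2264 items with a fixed seed, using the standard library. The function `split_indices` is hypothetical, and the resulting indices will not match the author's actual split.

```python
import random


def split_indices(n_items: int, test_fraction: float = 0.2, seed: int = 42):
    """Shuffle item indices with a fixed seed and split them into train/test."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)        # deterministic shuffle
    n_test = round(n_items * test_fraction)     # 2264 * 0.2 = 452.8, rounds to 453
    return indices[n_test:], indices[:n_test]   # (train, test)


train_idx, test_idx = split_indices(2264)
print(len(train_idx), len(test_idx))
```

The split sizes match the 1811 and 453 samples reported above. A stratified split would additionally preserve the class proportions in both parts.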
Model Configuration
- Base Model: distilbert-base-uncased
- Number of Labels: 3 (negative, neutral, positive)
- Tokenizer: Same as the base model's tokenizer
Hyperparameters
- Number of Epochs: 5
- Batch Size: 16 (training), 64 (evaluation)
- Learning Rate: 5e-5
- Optimizer: AdamW
- Evaluation Metric: Accuracy
- Seed: 42 (for reproducibility)
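For readers who want to set up a similar run, the hyperparameters above can be collected in a dictionary keyed by the matching `transformers.TrainingArguments` parameter names. This is a sketch, not the author's training script: the `generated_from_trainer` tag suggests the `Trainer` API was used, but the exact arguments are not given in the card, and the `output_dir` value in the comment is a placeholder.

```python
# Hyperparameters from the list above, keyed by TrainingArguments parameter names.
hparams = {
    "num_train_epochs": 5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 64,
    "learning_rate": 5e-5,
    "seed": 42,
}

# Hypothetical usage (requires transformers and torch to be installed):
#   from transformers import TrainingArguments
#   args = TrainingArguments(output_dir="./results", **hparams)
# No optimizer is set here because Trainer uses an AdamW variant by default.

for name, value in hparams.items():
    print(f"{name} = {value}")
```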
Usage
You can load and use the model with the Hugging Face transformers library as follows:
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("AnkitAI/distilbert-base-uncased-financial-news-sentiment-analysis")
model = AutoModelForSequenceClassification.from_pretrained("AnkitAI/distilbert-base-uncased-financial-news-sentiment-analysis")
text = "The company's revenue declined significantly due to market competition."
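# Tokenize the input and run a forward pass without tracking gradients
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
# Pick the class with the highest logit
logits = outputs.logits
predicted_class_id = logits.argmax(dim=-1).item()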
label_mapping = {0: "Negative", 1: "Neutral", 2: "Positive"}
predicted_label = label_mapping[predicted_class_id]
print(f"Text: {text}")
print(f"Predicted Sentiment: {predicted_label}")
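The snippet above reports only the top label. If a confidence score per class is also useful, the logits can be passed through a softmax. The sketch below shows the computation with the standard library on hypothetical logit values (not real model output); with PyTorch tensors, `torch.softmax(logits, dim=-1)` performs the same operation.

```python
import math

ID2LABEL = {0: "Negative", 1: "Neutral", 2: "Positive"}


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    m = max(logits)                              # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for one sentence, in label order 0, 1, 2.
example_logits = [3.1, -0.4, -2.2]
probs = softmax(example_logits)

# Index of the most probable class.
best = max(range(len(probs)), key=probs.__getitem__)

for idx, p in enumerate(probs):
    print(f"{ID2LABEL[idx]}: {p:.3f}")
print(f"Predicted Sentiment: {ID2LABEL[best]}")
```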
License
This model is licensed under the Apache 2.0 License. You are free to use, modify, and distribute this model in your applications.
Citation
If you use this model in your research or applications, please cite it as:
@misc{AnkitAI_2024_financial_sentiment_model,
title={DistilBERT Fine-Tuned for Financial Sentiment Analysis},
author={Ankit Aglawe},
year={2024},
howpublished={\url{https://huggingface.co/AnkitAI/distilbert-base-uncased-financial-news-sentiment-analysis}},
}
Acknowledgments
- Hugging Face: For providing the Transformers library and model hosting.
- Data Providers: Thanks to the creators of the Financial PhraseBank dataset.
- Community: Appreciation to the open-source community for continual support and contributions.
Contact Information
For questions, feedback, or collaboration opportunities, please contact:
- Name: Ankit Aglawe
- Email: [[email protected]]
- GitHub: GitHub Profile
- LinkedIn: LinkedIn Profile