AnkitAI committed
Commit 0bb179a · verified · 1 Parent(s): b5ec275

Update README.md

Files changed (1)
  1. README.md +127 -3
README.md CHANGED
@@ -1,3 +1,127 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: apache-2.0
3
+ tags:
4
+ - generated_from_trainer
5
+ - financial
6
+ - stocks
7
+ - sentiment
8
+ - sentiment-analysis
9
+ - financial-news
10
+ widget:
11
+ - text: The company's quarterly earnings surpassed all estimates, indicating strong growth.
12
+ datasets:
13
+ - financial_phrasebank
14
+ metrics:
15
+ - accuracy
16
+ model-index:
17
+ - name: AnkitAI/distilbert-base-uncased-financial-news-sentiment-analysis
18
+ results:
19
+ - task:
20
+ name: Text Classification
21
+ type: text-classification
22
+ dataset:
23
+ name: financial_phrasebank
24
+ type: financial_phrasebank
25
+ args: sentences_allagree
26
+ metrics:
27
+ - name: Accuracy
28
+ type: accuracy
29
+ value: 0.96688
30
+ language:
31
+ - en
32
+ base_model:
33
+ - distilbert/distilbert-base-uncased-finetuned-sst-2-english
34
+ pipeline_tag: text-classification
35
+ library_name: transformers
36
+ ---
37
+ # DistilBERT Fine-Tuned for Financial Sentiment Analysis
38
+ ## Model Description
39
+
40
+ This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) specifically tailored for sentiment analysis in the financial domain. It has been trained on the [Financial PhraseBank](https://huggingface.co/datasets/financial_phrasebank) dataset to classify financial texts into three sentiment categories:
41
+
42
+ - Negative (label `0`)
43
+ - Neutral (label `1`)
44
+ - Positive (label `2`)
45
+
46
+ ## Model Performance
47
+ The model was trained for 5 epochs and evaluated on a held-out test set comprising 20% of the dataset.
48
+
49
+ ### Evaluation Metrics
50
+ | Epoch | Eval Loss | Eval Accuracy |
51
+ |-----------|---------------|-------------------|
52
+ | 1 | 0.2210 | 94.26% |
53
+ | 2 | 0.1997 | 95.81% |
54
+ | 3 | 0.1719 | 96.69% |
55
+ | 4 | 0.2073 | 96.03% |
56
+ | 5 | 0.1941 | **96.69%** |
57
+
58
+ **Final Evaluation Accuracy**: **96.69%**
59
+
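As a sanity check on the table above, the reported accuracies can be converted back into whole numbers of correct predictions. This is only a sketch: it assumes each accuracy is `correct / 453`, where 453 is the test-split size stated under Data, and the helper name `correct_count` is introduced here for illustration.

```python
# Recover correct-prediction counts from the reported accuracies,
# assuming accuracy = correct / TEST_SIZE on the held-out split.
TEST_SIZE = 453  # 20% test split stated in the Data section

# Reported evaluation accuracy per epoch, in percent.
reported_accuracy = {1: 94.26, 2: 95.81, 3: 96.69, 4: 96.03, 5: 96.69}


def correct_count(accuracy_percent: float, n_samples: int = TEST_SIZE) -> int:
    """Nearest whole number of correct predictions for a given accuracy."""
    return round(accuracy_percent / 100 * n_samples)


counts = {epoch: correct_count(acc) for epoch, acc in reported_accuracy.items()}

for epoch, correct in counts.items():
    recomputed = 100 * correct / TEST_SIZE
    print(f"epoch {epoch}: {correct}/{TEST_SIZE} correct ({recomputed:.2f}%)")
```

Under that assumption, the final accuracy corresponds to 438 of 453 test sentences classified correctly, which also matches the `0.96688` value in the model-index metadata.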
60
+ ### Training Metrics
61
+ - **Final Training Loss**: 0.0797
62
+ - **Total Training Time**: Approximately 3869 seconds (~1.07 hours)
63
+ - **Training Samples per Second**: 2.34
64
+ - **Training Steps per Second**: 0.147
65
+
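The throughput figures above can be cross-checked against the sample count and batch size given under Training Procedure. The sketch below assumes a single device, no gradient accumulation, and that the last partial batch of each epoch is kept; none of these is stated in the card.

```python
import math

TRAIN_SAMPLES = 1811    # training split size from the Data section
EPOCHS = 5
TRAIN_BATCH_SIZE = 16
TRAIN_SECONDS = 3869    # reported total training time

# Assumption: one optimizer step per batch, final partial batch included.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / TRAIN_BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS

samples_per_second = TRAIN_SAMPLES * EPOCHS / TRAIN_SECONDS
steps_per_second = total_steps / TRAIN_SECONDS
hours = TRAIN_SECONDS / 3600

print(f"steps: {total_steps}, samples/s: {samples_per_second:.2f}, "
      f"steps/s: {steps_per_second:.3f}, hours: {hours:.2f}")
```

With these assumptions the recomputed rates agree with the reported 2.34 samples per second and 0.147 steps per second.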
66
+ ## Training Procedure
67
+ ### Data
68
+ - **Dataset**: [Financial PhraseBank](https://huggingface.co/datasets/financial_phrasebank)
69
+ - **Configuration**: `sentences_allagree` (sentences where all annotators agreed on the sentiment)
70
+ - **Dataset Size**: 2264 sentences
71
+ - **Data Split**: 80% training (1811 samples), 20% testing (453 samples)
72
+
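The card does not say how the 80/20 split was produced. As a minimal sketch, a seeded shuffle over the 2264 sentence indices reproduces the stated split sizes; the function name `split_indices` and the use of seed 42 for the split are assumptions for illustration.

```python
import random

DATASET_SIZE = 2264    # sentences in the sentences_allagree configuration
TEST_FRACTION = 0.2
SEED = 42              # seed listed in the card; its use for the split is assumed


def split_indices(n: int, test_fraction: float, seed: int):
    """Shuffle indices 0..n-1 deterministically and split off a test portion."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]  # (train, test)


train_idx, test_idx = split_indices(DATASET_SIZE, TEST_FRACTION, SEED)
print(len(train_idx), len(test_idx))
```

The resulting index lists can be used to select rows from any in-memory copy of the dataset; the exact sentences in each split will differ from the author's unless the same procedure was used.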
73
+ ### Model Configuration
74
+ - **Base Model**: [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased)
75
+ - **Number of Labels**: 3 (negative, neutral, positive)
76
+ - **Tokenizer**: Same as the base model's tokenizer
77
+
78
+ ### Hyperparameters
79
+ - **Number of Epochs**: 5
80
+ - **Batch Size**: 16 (training), 64 (evaluation)
81
+ - **Learning Rate**: 5e-5
82
+ - **Optimizer**: AdamW
83
+ - **Evaluation Metric**: Accuracy
84
+ - **Seed**: 42 (for reproducibility)
85
+
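One way to express these hyperparameters is as keyword arguments for `transformers.TrainingArguments`. This is a sketch rather than the author's training script: the output path is hypothetical, the per-device batch sizes assume a single device, and AdamW is not set explicitly because it is the default optimizer family in `Trainer`.

```python
# Keyword arguments named after transformers.TrainingArguments parameters.
training_kwargs = {
    "output_dir": "./financial-sentiment-checkpoints",  # hypothetical path
    "num_train_epochs": 5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 64,
    "learning_rate": 5e-5,
    "seed": 42,
}


def build_training_arguments(kwargs: dict):
    """Create TrainingArguments from the dict above.

    transformers is imported lazily so the dict itself can be inspected
    without the library installed.
    """
    from transformers import TrainingArguments

    return TrainingArguments(**kwargs)
```

Calling `build_training_arguments(training_kwargs)` yields an object that can be passed to `Trainer` together with the model, datasets, and an accuracy `compute_metrics` function.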
86
+ ## Usage
87
+ You can load and use the model with the Hugging Face `transformers` library as follows:
88
+ ```python
89
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
90
+
91
+ tokenizer = AutoTokenizer.from_pretrained('AnkitAI/distilbert-base-uncased-financial-news-sentiment-analysis')
92
+ model = AutoModelForSequenceClassification.from_pretrained('AnkitAI/distilbert-base-uncased-financial-news-sentiment-analysis')
93
+
94
+ text = "The company's quarterly earnings surpassed all estimates, indicating strong growth."
95
+ inputs = tokenizer(text, return_tensors="pt")
96
+
97
+ outputs = model(**inputs)
98
+ predictions = outputs.logits.argmax(dim=-1)
99
+
100
+ label_mapping = {0: 'Negative', 1: 'Neutral', 2: 'Positive'}
101
+ print(f"Sentiment: {label_mapping[predictions.item()]}")
102
+ ```
103
+
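The snippet above reports only the top label. To see how confident the model is, the logits can be turned into probabilities with a softmax. The sketch below uses plain Python and made-up logit values purely for illustration; they are not outputs of this model.

```python
import math

LABELS = {0: "Negative", 1: "Neutral", 2: "Positive"}


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


example_logits = [-2.1, -0.4, 3.3]  # hypothetical values, not real model output
probs = softmax(example_logits)
best = max(range(len(probs)), key=probs.__getitem__)

for idx, p in enumerate(probs):
    print(f"{LABELS[idx]}: {p:.3f}")
print(f"Sentiment: {LABELS[best]}")
```

With PyTorch tensors, `outputs.logits.softmax(dim=-1)` performs the same computation directly on the model output.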
104
+ ## License
105
+ This model is licensed under the **Apache 2.0 License**. You may use, modify, and distribute it in your applications under the terms of that license.
106
+
107
+ ## Citation
108
+ If you use this model in your research or applications, please cite it as:
109
+ ```
110
+ @misc{AnkitAI_2024_financial_sentiment_model,
111
+ title={DistilBERT Fine-Tuned for Financial Sentiment Analysis},
112
+ author={Ankit Aglawe},
113
+ year={2024},
114
+ howpublished={\url{https://huggingface.co/AnkitAI/distilbert-base-uncased-financial-news-sentiment-analysis}},
115
+ }
116
+ ```
117
+ ## Acknowledgments
118
+ - **Hugging Face**: For providing the Transformers library and model hosting.
119
+ - **Data Providers**: Thanks to the creators of the Financial PhraseBank dataset.
120
+ - **Community**: Appreciation to the open-source community for continual support and contributions.
121
+
122
+ ## Contact Information
123
+ For questions, feedback, or collaboration opportunities, please contact:
124
+ - **Name**: Ankit Aglawe
125
+ - **Email**: [[email protected]]
126
+ - **GitHub**: [GitHub Profile](https://github.com/ankit-aglawe)
127
+ - **LinkedIn**: [LinkedIn Profile](https://www.linkedin.com/in/ankit-aglawe)