DistilBERT Fine-Tuned for Named Entity Recognition (NER)
This repository contains a DistilBERT model fine-tuned for Named Entity Recognition (NER). The model has been trained to identify and classify named entities such as people, places, organizations, and dates in text.
Model Details
- Model: DistilBERT
- Task: Named Entity Recognition (NER)
- Training Dataset: Custom dataset
- Evaluation Metrics: Precision, Recall, F1-Score, Accuracy
Usage
You can use this model with the Hugging Face transformers library to perform NER on your text data. Below are examples of how to use the model and tokenizer.
Installation
First, make sure you have the transformers library installed:
pip install transformers
Load the Model
from transformers import pipeline

# Load the model and tokenizer
token_classifier = pipeline(
    "token-classification",
    model="cxx5208/NER_finetuned",
    tokenizer="cxx5208/NER_finetuned",
    aggregation_strategy="simple",
)

# Example text
text = "My name is Yeshvanth Raju Kurapati. I study at San Jose State University"

# Perform NER
entities = token_classifier(text)
print(entities)
Example Output
[
    {'entity_group': 'PER',
     'score': 0.99808735,
     'word': 'Yeshvanth Raju Kurapati',
     'start': 11,
     'end': 34},
    {'entity_group': 'ORG',
     'score': 0.9923826,
     'word': 'San Jose State University',
     'start': 47,
     'end': 72}
]
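If you need more control than the pipeline offers, the same checkpoint can also be loaded with the Auto classes and run manually. This is a minimal sketch, not part of the original card; the token-by-token printing is only illustrative and assumes the checkpoint's id2label mapping is populated.

import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("cxx5208/NER_finetuned")
model = AutoModelForTokenClassification.from_pretrained("cxx5208/NER_finetuned")

text = "My name is Yeshvanth Raju Kurapati. I study at San Jose State University"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Map each sub-word token to its highest-scoring label
predictions = logits.argmax(dim=-1)[0]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, pred in zip(tokens, predictions):
    print(token, model.config.id2label[pred.item()])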
Training Details
The model was fine-tuned using the following hyperparameters:
- Batch Size: 16
- Learning Rate: 5e-5
- Epochs: 3
- Optimizer: AdamW
The training process used a standard NER dataset (such as CoNLL-2003) and included tokenization, label preprocessing, and evaluation steps.
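A fine-tuning run with these hyperparameters would look roughly like the sketch below. This is not the exact training script: the base checkpoint (distilbert-base-uncased), the dataset name (conll2003), and the label-alignment helper are assumptions for illustration. AdamW is the Trainer's default optimizer.

from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForTokenClassification,
                          DataCollatorForTokenClassification,
                          TrainingArguments, Trainer)

# Assumed dataset; the card's actual training data is a custom NER dataset
raw = load_dataset("conll2003")
labels = raw["train"].features["ner_tags"].feature.names

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForTokenClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=len(labels)
)

def tokenize_and_align_labels(batch):
    # Re-align word-level NER tags to sub-word tokens; special tokens get -100
    tokenized = tokenizer(batch["tokens"], truncation=True, is_split_into_words=True)
    aligned = []
    for i, tags in enumerate(batch["ner_tags"]):
        word_ids = tokenized.word_ids(batch_index=i)
        aligned.append([-100 if w is None else tags[w] for w in word_ids])
    tokenized["labels"] = aligned
    return tokenized

tokenized = raw.map(tokenize_and_align_labels, batched=True)

args = TrainingArguments(
    output_dir="ner_finetuned",
    learning_rate=5e-5,              # hyperparameters from the card
    per_device_train_batch_size=16,
    num_train_epochs=3,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForTokenClassification(tokenizer),
    tokenizer=tokenizer,
)
trainer.train()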
Evaluation
The model was evaluated on precision, recall, F1-score, and accuracy. The results are as follows:
- Precision: 0.952
- Recall: 0.948
- F1-Score: 0.950
- Accuracy: 0.975
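Entity-level precision, recall, and F1 for NER are typically computed with the seqeval library over predicted and gold tag sequences. The snippet below is a minimal sketch with toy label sequences, not the card's actual evaluation code.

from seqeval.metrics import accuracy_score, f1_score, precision_score, recall_score

# Toy gold and predicted tag sequences (one list of tags per sentence)
y_true = [["B-PER", "I-PER", "O", "O", "B-ORG", "I-ORG", "I-ORG"]]
y_pred = [["B-PER", "I-PER", "O", "O", "B-ORG", "I-ORG", "O"]]

print("Precision:", precision_score(y_true, y_pred))
print("Recall:", recall_score(y_true, y_pred))
print("F1:", f1_score(y_true, y_pred))
print("Accuracy:", accuracy_score(y_true, y_pred))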
About DistilBERT
DistilBERT is a smaller, faster, cheaper version of BERT developed by Hugging Face. It retains 97% of BERT’s language understanding capabilities while being 60% faster and 40% smaller.
License
This model is released under the MIT License.
Acknowledgements
- Hugging Face for the transformers library and DistilBERT model.
- The authors of the original dataset used for training.