cxfajar197 committed on
Commit 250cd40 · verified · 1 Parent(s): 631f287

Update README.md

  1. README.md +1 -27
README.md CHANGED
@@ -19,21 +19,9 @@ language:
 
  <!-- Provide a quick summary of what the model is/does. -->
 
- This is an Urdu OCR model designed for handwriting recognition tasks. It utilizes a VisionEncoderDecoderModel with a ViT-based encoder and a BERT-based decoder, fine-tuned on a custom dataset for robust and accurate text extraction from images.
 
- ## Model Details
 
- ### Model Description
-
- <!-- Provide a longer summary of what this model is. -->
-
- This model leverages the combination of a Vision Transformer (ViT) encoder (`facebook/deit-base-distilled-patch16-384`) and a multilingual BERT decoder (`bert-base-multilingual-cased`) to perform OCR tasks in Urdu. The model is fine-tuned on a dataset of 46,742 image-text pairs, using advanced data augmentation techniques to improve robustness.
-
- - **Developed by:** Fajar Pervaiz
- - **Funded by:** [More Information Needed]
- - **Shared by:** [More Information Needed]
- - **Model type:** VisionEncoderDecoderModel
- - **Language(s) (NLP):** Urdu (`ur`)
 
 
 
@@ -86,10 +74,6 @@ print("Generated Text:", generated_text)
 
  The model was tested on handwritten text images with varying font styles and complexities.
 
- #### Metrics
-
- - **Character Error Rate (CER):** [Value Needed]
- - **Word Error Rate (WER):** [Value Needed]
 
 
 
@@ -102,11 +86,8 @@ The model achieves competitive accuracy on Urdu handwritten text recognition tas
  - **Hardware Type:** NVIDIA GPU
 
 
- ## Technical Specifications
 
- ### Model Architecture and Objective
 
- The model uses a VisionEncoderDecoder architecture combining a ViT encoder and a BERT decoder.
 
  ### Compute Infrastructure
 
@@ -133,15 +114,8 @@ Python, PyTorch, Hugging Face Transformers
 
 
 
- ## Glossary
-
- - **CER:** Character Error Rate
- - **WER:** Word Error Rate
- - **OCR:** Optical Character Recognition
 
- ## More Information
 
- [More Information Needed]
 
  ## Model Card Authors
 
 
  <!-- Provide a quick summary of what the model is/does. -->
+ This model, cxfajar197/urdu-ocr, is trained on Urdu data specifically designed for OCR tasks. It works best with single-line Urdu text images, primarily focusing on printed text. The model is optimized for extracting accurate Urdu text from such images and can be easily utilized using the Hugging Face pipeline API.
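
The added summary mentions the Hugging Face pipeline API but does not show a call. The sketch below is one way it might look, not the author's documented usage. It assumes the model loads under the `image-to-text` pipeline task (the README does not name the task) and that the pipeline returns a list of dicts with a `generated_text` key. The helper names `extract_text` and `recognize_urdu_line` and the file name `urdu_line.png` are hypothetical.

```python
from typing import Any


def extract_text(pipeline_output: list[dict[str, Any]]) -> str:
    """Join the `generated_text` fields returned by an image-to-text pipeline."""
    return " ".join(item["generated_text"].strip() for item in pipeline_output)


def recognize_urdu_line(image_path: str, model_id: str = "cxfajar197/urdu-ocr") -> str:
    """Run OCR on a single-line Urdu text image and return the recognized text.

    Assumption: the checkpoint is compatible with the `image-to-text` task.
    """
    # Imported lazily so extract_text can be used without transformers installed.
    from transformers import pipeline

    ocr = pipeline("image-to-text", model=model_id)
    return extract_text(ocr(image_path))


# Example call (downloads the model on first use; the path is a placeholder):
# print(recognize_urdu_line("urdu_line.png"))
```

Since the summary says the model works best on single-line printed text, multi-line pages would likely need to be split into line images before being passed to a helper like this.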