cxfajar197
committed on
Update README.md
README.md
CHANGED
@@ -19,21 +19,9 @@ language:
 
 <!-- Provide a quick summary of what the model is/does. -->
 
-This is
+This model, cxfajar197/urdu-ocr, is trained on Urdu data specifically designed for OCR tasks. It works best with single-line Urdu text images, primarily focusing on printed text. The model is optimized for extracting accurate Urdu text from such images and can be easily utilized using the Hugging Face pipeline API.
 
-## Model Details
 
-### Model Description
-
-<!-- Provide a longer summary of what this model is. -->
-
-This model leverages the combination of a Vision Transformer (ViT) encoder (`facebook/deit-base-distilled-patch16-384`) and a multilingual BERT decoder (`bert-base-multilingual-cased`) to perform OCR tasks in Urdu. The model is fine-tuned on a dataset of 46,742 image-text pairs, using advanced data augmentation techniques to improve robustness.
-
-- **Developed by:** Fajar Pervaiz
-- **Funded by:** [More Information Needed]
-- **Shared by:** [More Information Needed]
-- **Model type:** VisionEncoderDecoderModel
-- **Language(s) (NLP):** Urdu (`ur`)
 
 
 
@@ -86,10 +74,6 @@ print("Generated Text:", generated_text)
 
 The model was tested on handwritten text images with varying font styles and complexities.
 
-#### Metrics
-
-- **Character Error Rate (CER):** [Value Needed]
-- **Word Error Rate (WER):** [Value Needed]
 
 
 
@@ -102,11 +86,8 @@ The model achieves competitive accuracy on Urdu handwritten text recognition tas
 - **Hardware Type:** NVIDIA GPU
 
 
-## Technical Specifications
 
-### Model Architecture and Objective
 
-The model uses a VisionEncoderDecoder architecture combining a ViT encoder and a BERT decoder.
 
 ### Compute Infrastructure
 
@@ -133,15 +114,8 @@ Python, PyTorch, Hugging Face Transformers
 
 
 
-## Glossary
-
-- **CER:** Character Error Rate
-- **WER:** Word Error Rate
-- **OCR:** Optical Character Recognition
 
-## More Information
 
-[More Information Needed]
 
 ## Model Card Authors
 
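Note on the removed Metrics and Glossary entries: the README named Character Error Rate (CER) and Word Error Rate (WER) but never filled in values. For anyone who wants to compute them for this model, here is a minimal sketch, assuming the usual definitions (Levenshtein edit distance divided by the reference length, in characters for CER and in whitespace-separated words for WER). The function names `edit_distance`, `cer`, and `wer` are illustrative and not taken from the repository. The sketch counts Unicode code points, so Urdu diacritics count as separate characters; whether that matches the author's intended evaluation is an assumption.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: insertions, deletions, substitutions each cost 1."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1       # drop r from the reference
            insertion = curr[j - 1] + 1  # extra item h in the hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate over Unicode code points (assumed definition)."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate over whitespace-separated tokens (assumed definition)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

Both functions take the ground-truth transcription first and the model output second; averaging over a test set (per sample or weighted by reference length) is a separate choice that the README does not specify.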