kaixkhazaki committed: Update README.md

README.md CHANGED

This model is a fine-tuned Vision Transformer (ViT) for document layout classification.

Trained on images of the document categories from the DocLayNet dataset, where the categories (with their indexes) are:
```python
{'financial_reports': 0,
 'government_tenders': 1,
 'laws_and_regulations': 2,
 'manuals': 3,
 'patents': 4,
 'scientific_articles': 5}
```
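
As a reference, here is a minimal sketch of how such a label mapping is typically wired into a `transformers` classification head at fine-tuning time. The mapping itself comes from above; the rest is an illustrative assumption, not the exact training code used for this model:

```python
from transformers import ViTForImageClassification

# Label mapping from this README; id2label/label2id are the standard
# config fields transformers uses to name classification outputs.
label2id = {
    'financial_reports': 0,
    'government_tenders': 1,
    'laws_and_regulations': 2,
    'manuals': 3,
    'patents': 4,
    'scientific_articles': 5,
}
id2label = {v: k for k, v in label2id.items()}

# Start from the pre-trained backbone and attach a 6-way classification head.
model = ViTForImageClassification.from_pretrained(
    "google/vit-base-patch16-224-in21k",
    num_labels=len(label2id),
    id2label=id2label,
    label2id=label2id,
)
```

After loading, the same mapping is exposed at inference time via `model.config.id2label`.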
## Model description
This model is built upon the `google/vit-base-patch16-224-in21k` Vision Transformer architecture and fine-tuned specifically for document layout classification. The base ViT model uses a patch size of 16x16 pixels and was pre-trained on ImageNet-21k. The model has been optimized to recognize and classify different types of document layouts from the DocLayNet dataset.
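
For a usage illustration, a minimal inference sketch with the `transformers` API follows. The checkpoint id below is a placeholder assumption; substitute this repository's actual model id:

```python
import torch
from PIL import Image
from transformers import ViTForImageClassification, ViTImageProcessor

# Placeholder checkpoint id; replace with this repository's model id.
checkpoint = "kaixkhazaki/vit-doclaynet"

processor = ViTImageProcessor.from_pretrained(checkpoint)
model = ViTForImageClassification.from_pretrained(checkpoint)

# Any document page rendered as an RGB image.
image = Image.open("page.png").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

pred = logits.argmax(-1).item()
print(model.config.id2label[pred])  # e.g. 'scientific_articles'
```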