kaixkhazaki/deit_doclaynet_base

	---
	datasets:
	- pierreguillou/DocLayNet-base
	metrics:
	- accuracy
	base_model:
	- facebook/deit-base-distilled-patch16-224
	library_name: transformers
	tags:
	- vision
	- document-layout-analysis
	- document-classification
	- deit
	- doclaynet
	---

	# Data-efficient Image Transformer (DeiT) for Document Classification (DocLayNet)

	This model is a Data-efficient Image Transformer (DeiT) fine-tuned for document image classification on the DocLayNet dataset.

	It was trained on DocLayNet images labeled with their document category. The categories and their class indexes are:

	{'financial_reports': 0,
	'government_tenders': 1,
	'laws_and_regulations': 2,
	'manuals': 3,
	'patents': 4,
	'scientific_articles': 5}
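The mapping above goes from category name to class index, while a classifier's raw output is usually an index. A minimal sketch of inverting it follows; the names `label2id`, `id2label` and `index_to_category` are illustrative and are not taken from the model's configuration.

```python
# Category name -> class index, copied from the mapping in this model card.
label2id = {
    "financial_reports": 0,
    "government_tenders": 1,
    "laws_and_regulations": 2,
    "manuals": 3,
    "patents": 4,
    "scientific_articles": 5,
}

# Inverse mapping: class index -> category name.
id2label = {index: name for name, index in label2id.items()}


def index_to_category(class_index: int) -> str:
    """Return the category name for a predicted class index (hypothetical helper)."""
    return id2label[class_index]


print(index_to_category(4))  # patents
```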

	## Model description

	DeiT (facebook/deit-base-distilled-patch16-224) fine-tuned for document classification on the six categories listed above.



	## Training data
	DocLayNet-base: https://huggingface.co/datasets/pierreguillou/DocLayNet-base

	## Training procedure


	Hyperparameters:

	{
	'batch_size': 128,
	'num_epochs': 20,
	'learning_rate': 1e-4,
	'weight_decay': 0.1,
	'warmup_ratio': 0.1,
	'gradient_clip': 0.1,
	'dropout_rate': 0.1,
	'label_smoothing': 0.1,
	'optimizer': 'AdamW'
	}
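To make two of these settings concrete, here is a minimal pure-Python sketch of what `label_smoothing` and `warmup_ratio` typically mean. It is not the training code behind this model. The function names are hypothetical, the label smoothing follows the common formulation that mixes the one-hot target with a uniform distribution over all classes, and the linear decay after warmup is an assumption, since the card does not state which schedule was used.

```python
def smoothed_targets(true_index, num_classes=6, smoothing=0.1):
    """Label-smoothed target: (1 - smoothing) * one_hot + smoothing / num_classes."""
    uniform = smoothing / num_classes
    return [
        (1.0 - smoothing) + uniform if i == true_index else uniform
        for i in range(num_classes)
    ]


def lr_at_step(step, total_steps, base_lr=1e-4, warmup_ratio=0.1):
    """Linear warmup to base_lr over the first warmup_ratio of training.

    The linear decay to zero afterwards is an assumption, not from the card.
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Target distribution for a 'patents' example (class index 4).
print(smoothed_targets(4))

# Learning rate at a few points of a hypothetical 1000-step run.
for step in (0, 50, 100, 550, 1000):
    print(step, lr_at_step(step, total_steps=1000))
```

With `smoothing=0.1` and six classes, the true class receives a target slightly above 0.9 and the remainder is spread evenly over the other classes, so the training loss cannot reach zero even for perfect predictions.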

	## Evaluation results

	Test loss: 0.8134, test accuracy: 81.56%


	## Usage
	```python
	from transformers import pipeline

	# Load the model using the image-classification pipeline
	pipe = pipeline("image-classification", model="kaixkhazaki/deit_doclaynet_base")

	# Test it with an image
	result = pipe("path_to_image.jpg")
	print(result)