Update README.md
README.md
CHANGED
@@ -6,14 +6,13 @@ tags:
 - vision
 - ocr
 - segmentation
-- coco
 ---
 
 # TF-ID: Table/Figure IDentifier for academic papers
 
 ## Model Summary
 
-TF-ID (Table/Figure IDentifier) is a family of object detection models finetuned to extract tables and figures in academic papers. They come in four versions:
+TF-ID (Table/Figure IDentifier) is a family of object detection models finetuned to extract tables and figures in academic papers created by [Yifei Hu](https://x.com/hu_yifei). They come in four versions:
 | Model | Model size | Model Description |
 | ------- | ------------- | ------------- |
 | TF-ID-base[[HF]](https://huggingface.co/yifeihu/TF-ID-base) | 0.23B | Extract tables/figures and their caption text
@@ -22,8 +21,12 @@ TF-ID (Table/Figure IDentifier) is a family of object detection models finetuned
 | TF-ID-large-no-caption[[HF]](https://huggingface.co/yifeihu/TF-ID-large-no-caption) | 0.77B | Extract tables/figures without caption text
 All TF-ID models are finetuned from [microsoft/Florence-2](https://huggingface.co/microsoft/Florence-2-large-ft) checkpoints.
 
+The models were finetuned with papers from Hugging Face Daily Papers. All bounding boxes are manually annotated and checked by humans.
+
 TF-ID models take an image of a single paper page as the input, and return bounding boxes for all tables and figures in the given page.
+
 TF-ID-base and TF-ID-large draw bounding boxes around tables/figures and their caption text.
+
 TF-ID-base-no-caption and TF-ID-large-no-caption draw bounding boxes around tables/figures without their caption text.
 
 ![image/png](https://huggingface.co/yifeihu/TF-ID-base/resolve/main/td-id-caption.png)
@@ -56,17 +59,15 @@ Use the code below to get started with the model.
 
 ```python
 import requests
-
 from PIL import Image
 from transformers import AutoProcessor, AutoModelForCausalLM
 
-
-
-processor = AutoProcessor.from_pretrained("microsoft/Florence-2-base-ft", trust_remote_code=True)
+model = AutoModelForCausalLM.from_pretrained("yifeihu/TF-ID-base", trust_remote_code=True)
+processor = AutoProcessor.from_pretrained("yifeihu/TF-ID-base", trust_remote_code=True)
 
 prompt = "<OD>"
 
-url = "https://huggingface.co/
+url = "https://huggingface.co/yifeihu/TF-ID-base/resolve/main/arxiv_2305_10853_5.png?download=true"
 image = Image.open(requests.get(url, stream=True).raw)
 
 inputs = processor(text=prompt, images=image, return_tensors="pt")
@@ -86,13 +87,18 @@ print(parsed_answer)
 
 ```
 
+To visualize the results, see [this tutorial notebook](https://colab.research.google.com/github/roboflow-ai/notebooks/blob/main/notebooks/how-to-finetune-florence-2-on-detection-dataset.ipynb) for more details.
+
+## Finetuning Code and Dataset
+
+Coming soon!
+
 ## BibTex and citation info
 
 ```
-@
-
-
-
-year={2023}
+@misc{TF-ID,
+url={https://huggingface.co/yifeihu/TF-ID-base},
+title={TF-ID: Table/Figure IDentifier for academic papers},
+author={"Yifei Hu"}
 }
 ```
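The quick-start snippet touched by this change ends with `print(parsed_answer)`. As a rough sketch of what a caller might do next, the helper below groups the returned boxes by label and clamps them to the page bounds. It assumes `parsed_answer` follows the usual Florence-2 `<OD>` post-processing layout (a dict keyed by the task prompt, holding parallel `bboxes` and `labels` lists). The function name `collect_boxes`, the sample values, and the `table`/`figure` label strings are hypothetical and not taken from the README.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def collect_boxes(
    parsed_answer: dict,
    image_size: Tuple[int, int],
    task: str = "<OD>",
) -> Dict[str, List[Box]]:
    """Group detected boxes by label, rounded and clamped to the page bounds.

    Assumes parsed_answer looks like {task: {"bboxes": [...], "labels": [...]}}.
    """
    width, height = image_size
    result = parsed_answer.get(task, {})
    grouped: Dict[str, List[Box]] = {}
    for box, label in zip(result.get("bboxes", []), result.get("labels", [])):
        x1, y1, x2, y2 = box
        left = max(0, min(width, round(x1)))
        top = max(0, min(height, round(y1)))
        right = max(0, min(width, round(x2)))
        bottom = max(0, min(height, round(y2)))
        # Skip boxes that collapse to zero area after clamping.
        if right <= left or bottom <= top:
            continue
        grouped.setdefault(label, []).append((left, top, right, bottom))
    return grouped


# Hypothetical model output, for illustration only.
sample_answer = {
    "<OD>": {
        "bboxes": [[120.4, 80.2, 980.0, 410.7], [100.0, 500.0, 1300.0, 900.0]],
        "labels": ["table", "figure"],
    }
}
page_size = (1224, 1584)  # (width, height) of the page image; assumed values

boxes_by_label = collect_boxes(sample_answer, page_size)
print(boxes_by_label)
```

Each resulting tuple uses the same (left, upper, right, lower) order that Pillow's `Image.crop` expects, so cropping a detected table or figure out of the page image should need no further conversion.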