yifeihu committed · Commit defc0fe (verified) · 1 Parent(s): d8261b5

Update README.md

  1. README.md +18 -12
README.md CHANGED
@@ -6,14 +6,13 @@ tags:
6
  - vision
7
  - ocr
8
  - segmentation
9
- - coco
10
  ---
11
 
12
  # TF-ID: Table/Figure IDentifier for academic papers
13
 
14
  ## Model Summary
15
 
16
- TF-ID (Table/Figure IDentifier) is a family of object detection models finetuned to extract tables and figures in academic papers. They come in four versions:
17
  | Model | Model size | Model Description |
18
  | ------- | ------------- | ------------- |
19
  | TF-ID-base[[HF]](https://huggingface.co/yifeihu/TF-ID-base) | 0.23B | Extract tables/figures and their caption text
@@ -22,8 +21,12 @@ TF-ID (Table/Figure IDentifier) is a family of object detection models finetuned
22
  | TF-ID-large-no-caption[[HF]](https://huggingface.co/yifeihu/TF-ID-large-no-caption) | 0.77B | Extract tables/figures without caption text
23
  All TF-ID models are finetuned from [microsoft/Florence-2](https://huggingface.co/microsoft/Florence-2-large-ft) checkpoints.
24
 
 
 
25
  TF-ID models take an image of a single paper page as the input, and return bounding boxes for all tables and figures in the given page.
 
26
  TF-ID-base and TF-ID-large draw bounding boxes around tables/figures and their caption text.
 
27
  TF-ID-base-no-caption and TF-ID-large-no-caption draw bounding boxes around tables/figures without their caption text.
28
 
29
  ![image/png](https://huggingface.co/yifeihu/TF-ID-base/resolve/main/td-id-caption.png)
@@ -56,17 +59,15 @@ Use the code below to get started with the model.
56
 
57
  ```python
58
  import requests
59
-
60
  from PIL import Image
61
  from transformers import AutoProcessor, AutoModelForCausalLM
62
 
63
-
64
- model = AutoModelForCausalLM.from_pretrained("microsoft/Florence-2-base-ft", trust_remote_code=True)
65
- processor = AutoProcessor.from_pretrained("microsoft/Florence-2-base-ft", trust_remote_code=True)
66
 
67
  prompt = "<OD>"
68
 
69
- url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/car.jpg?download=true"
70
  image = Image.open(requests.get(url, stream=True).raw)
71
 
72
  inputs = processor(text=prompt, images=image, return_tensors="pt")
@@ -86,13 +87,18 @@ print(parsed_answer)
86
 
87
  ```
88
 
 
 
 
 
 
 
89
  ## BibTex and citation info
90
 
91
  ```
92
- @article{xiao2023florence,
93
- title={Florence-2: Advancing a unified representation for a variety of vision tasks},
94
- author={Xiao, Bin and Wu, Haiping and Xu, Weijian and Dai, Xiyang and Hu, Houdong and Lu, Yumao and Zeng, Michael and Liu, Ce and Yuan, Lu},
95
- journal={arXiv preprint arXiv:2311.06242},
96
- year={2023}
97
  }
98
  ```
 
6
  - vision
7
  - ocr
8
  - segmentation
 
9
  ---
10
 
11
  # TF-ID: Table/Figure IDentifier for academic papers
12
 
13
  ## Model Summary
14
 
15
+ TF-ID (Table/Figure IDentifier) is a family of object detection models, created by [Yifei Hu](https://x.com/hu_yifei), that are finetuned to extract tables and figures from academic papers. They come in four versions:
16
  | Model | Model size | Model Description |
17
  | ------- | ------------- | ------------- |
18
  | TF-ID-base[[HF]](https://huggingface.co/yifeihu/TF-ID-base) | 0.23B | Extract tables/figures and their caption text
 
21
  | TF-ID-large-no-caption[[HF]](https://huggingface.co/yifeihu/TF-ID-large-no-caption) | 0.77B | Extract tables/figures without caption text
22
  All TF-ID models are finetuned from [microsoft/Florence-2](https://huggingface.co/microsoft/Florence-2-large-ft) checkpoints.
23
 
24
+ The models were finetuned with papers from Hugging Face Daily Papers. All bounding boxes are manually annotated and checked by humans.
25
+
26
  TF-ID models take an image of a single paper page as the input, and return bounding boxes for all tables and figures in the given page.
27
+
28
  TF-ID-base and TF-ID-large draw bounding boxes around tables/figures and their caption text.
29
+
30
  TF-ID-base-no-caption and TF-ID-large-no-caption draw bounding boxes around tables/figures without their caption text.
31
 
32
  ![image/png](https://huggingface.co/yifeihu/TF-ID-base/resolve/main/td-id-caption.png)
 
59
 
60
  ```python
61
  import requests
 
62
  from PIL import Image
63
  from transformers import AutoProcessor, AutoModelForCausalLM
64
 
65
+ model = AutoModelForCausalLM.from_pretrained("yifeihu/TF-ID-base", trust_remote_code=True)
66
+ processor = AutoProcessor.from_pretrained("yifeihu/TF-ID-base", trust_remote_code=True)
 
67
 
68
  prompt = "<OD>"
69
 
70
+ url = "https://huggingface.co/yifeihu/TF-ID-base/resolve/main/arxiv_2305_10853_5.png?download=true"
71
  image = Image.open(requests.get(url, stream=True).raw)
72
 
73
  inputs = processor(text=prompt, images=image, return_tensors="pt")
 
87
 
88
  ```
89
 
90
+ To visualize the results, see [this tutorial notebook](https://colab.research.google.com/github/roboflow-ai/notebooks/blob/main/notebooks/how-to-finetune-florence-2-on-detection-dataset.ipynb).
91
+
92
+ ## Finetuning Code and Dataset
93
+
94
+ Coming soon!
95
+
96
  ## BibTex and citation info
97
 
98
  ```
99
+ @misc{TF-ID,
100
+ url={https://huggingface.co/yifeihu/TF-ID-base},
101
+ title={TF-ID: Table/Figure IDentifier for academic papers},
102
+ author={Yifei Hu}
 
103
  }
104
  ```
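
The quick-start snippet in the README ends by printing `parsed_answer`, and the new text points to an external notebook for visualization. As a rough local alternative, the sketch below draws and crops the detected regions with Pillow. It assumes that `parsed_answer["<OD>"]` follows the Florence-2 object detection layout, a dict with `bboxes` (lists of `[x1, y1, x2, y2]` pixel coordinates) and `labels`. The helper names `draw_detections` and `crop_detections`, the label strings, and the box values are hypothetical and not part of the TF-ID repository.

```python
from PIL import Image, ImageDraw


def draw_detections(image, detections, color="red", width=3):
    """Return a copy of `image` with one labelled rectangle per detection."""
    annotated = image.copy()
    draw = ImageDraw.Draw(annotated)
    for box, label in zip(detections["bboxes"], detections["labels"]):
        x1, y1, x2, y2 = (int(round(v)) for v in box)
        draw.rectangle([x1, y1, x2, y2], outline=color, width=width)
        # Put the label just inside the top-left corner of the box.
        draw.text((x1 + width + 2, y1 + width + 2), label, fill=color)
    return annotated


def crop_detections(image, detections):
    """Return a list of (label, cropped image) pairs, one per bounding box."""
    return [
        (label, image.crop(tuple(int(round(v)) for v in box)))
        for box, label in zip(detections["bboxes"], detections["labels"])
    ]


# Hypothetical stand-in for parsed_answer["<OD>"]; the values are made up.
example_detections = {
    "bboxes": [[40.0, 30.0, 360.0, 220.0], [60.0, 260.0, 340.0, 380.0]],
    "labels": ["figure", "table"],
}

page = Image.new("RGB", (400, 420), "white")  # blank placeholder for a paper page
annotated_page = draw_detections(page, example_detections)
crops = crop_detections(page, example_detections)
```

With a real page, `page` would be the `image` loaded in the quick-start snippet and `example_detections` would be replaced by `parsed_answer["<OD>"]`; `annotated_page.save("annotated.png")` then writes the overlay to disk.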