mwmathis committed
Commit 9850fa6 · Parent(s): d69a449

Update README.md

Files changed (1): README.md +99 −24
README.md CHANGED
@@ -1,29 +1,60 @@
  ---
- license: apache-2.0
  tags:
  - computer_vision
  - pose_estimation
  ---

- Copyright 2021-2023 by Mackenzie Mathis, Alexander Mathis, Shaokai Ye and contributors. All rights reserved.

- - Please cite **Ye et al. 2023** if you use this model in your work: https://arxiv.org/abs/2203.07436v1
- - If this license is not suitable for your business or project,
-   please contact EPFL-TTO (https://tto.epfl.ch/) for a full commercial license.

- This software may not be used to harm any animal deliberately!

- **MODEL CARD:**

- This model was trained on a dataset called "Quadruped-40K." It was trained in PyTorch within a modified [mmpose framework](https://github.com/open-mmlab/mmpose), available within the [DeepLabCut framework](https://www.deeplabcut.org) [here](https://github.com/DeepLabCut/DeepLabCut/pull/2352).
- Full training details can be found in Ye et al. 2023, but in brief, this was trained with **HRNet-w32**. We have another version available directly within the TensorFlow version of DeepLabCut: https://huggingface.co/mwmathis/DeepLabCutModelZoo-SuperAnimal-Quadruped.

- **Training Data:**

- It was trained jointly on the following datasets, which were otherwise not modified by us:

  - **AwA-Pose** Quadruped dataset, see full details at (1).
  - **AnimalPose** See full details at (2).
@@ -31,30 +62,74 @@ It consists of being trained together on the following datasets, which were othe
  - **Horse-30** Horse-30 dataset, benchmark task is called Horse-10; see full details at (4).
  - **StanfordDogs** See full details at (5, 6).
  - **AP-10K** See full details at (7).
- - **iRodent** (https://zenodo.org/record/8250392) We utilized the iNaturalist API functions for scraping observations
  with the taxon ID of Suborder Myomorpha (8). The functions allowed us to filter the large number of observations down to the
  ones with photos under the CC BY-NC creative license. The most common types of rodents from the collected observations are
  Muskrat (Ondatra zibethicus), Brown Rat (Rattus norvegicus), House Mouse (Mus musculus), Black Rat (Rattus rattus), Hispid
  Cotton Rat (Sigmodon hispidus), Meadow Vole (Microtus pennsylvanicus), Bank Vole (Clethrionomys glareolus), Deer Mouse
  (Peromyscus maniculatus), White-footed Mouse (Peromyscus leucopus), Striped Field Mouse (Apodemus agrarius). We then
  generated segmentation masks over target animals in the data by processing the media through an algorithm we designed that
- uses a Mask Region-Based Convolutional Neural Network (Mask R-CNN) (9) model with a ResNet-50-FPN backbone (10),
- pretrained on the COCO dataset (11). The processed 443 images were then manually labeled with both pose annotations and
- segmentation masks.

- Here is an image with the keypoint guide, the distribution of images per dataset, and examples from the datasets inferenced with a model trained with less data for benchmarking, as in Ye et al. 2023.
- Note that the model we are releasing here has comparable or higher performance.

- Please note that each dataset was labeled by separate labs & separate individuals; therefore, while we map names
- to a unified pose vocabulary, there will be annotator bias in keypoint placement (see Ye et al. 2023 for our Supplementary Note on annotator bias).
  You will also note the dataset is highly diverse across species, but collectively has more representation of domesticated animals like dogs, cats, horses, and cattle.
  If performance is not as good as you need it to be, we recommend first trying video adaptation (see Ye et al. 2023),
- or fine-tuning these weights with your own labeling.

- <p align="center">
- <img src="https://images.squarespace-cdn.com/content/v1/57f6d51c9f74566f55ecf271/1690988780004-AG00N6OU1R21MZ0AU9RE/modelcard-SAQ.png?format=1500w" width="95%">
- </p>

  1. Prianka Banik, Lin Li, and Xishuang Dong. A novel dataset for keypoint detection of quadruped animals from images. ArXiv, abs/2108.13958, 2021.
  2. Jinkun Cao, Hongyang Tang, Haoshu Fang, Xiaoyong Shen, Cewu Lu, and Yu-Wing Tai. Cross-domain adaptation for animal pose estimation.
@@ -76,4 +151,4 @@ Conference on Neural Information Processing Systems Datasets and Benchmarks Trac
  vision, pages 2961–2969, 2017.
  10. Tsung-Yi Lin, Piotr Dollár, Ross Girshick, Kaiming He, Bharath Hariharan, and Serge Belongie. Feature pyramid networks for object detection, 2016.
  11. Tsung-Yi Lin, Michael Maire, Serge J. Belongie, Lubomir D. Bourdev, Ross B. Girshick, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár,
- and C. Lawrence Zitnick. Microsoft COCO: common objects in context. CoRR, abs/1405.0312, 2014.
 
  ---
  tags:
  - computer_vision
  - pose_estimation
+ - animal_pose_estimation
+ - deeplabcut
  ---

+ # MODEL CARD:

+ ## Model Details

+ • SuperAnimal-Quadruped model developed by the [M.W. Mathis Lab](http://www.mackenziemathislab.org/) in 2023, trained to predict quadruped pose from images.
+ Please see [Shaokai Ye et al. 2023](https://arxiv.org/abs/2203.07436) for details.

+ • The model is an HRNet-w32 trained on our Quadruped-80K dataset.

+ • It was trained within the DeepLabCut framework. Full training details can be found in Ye et al. 2023.
+ You can use this model simply with our lightweight loading package, [DLCLibrary](https://github.com/DeepLabCut/DLClibrary).
+ Here is an example usage:

+ ```python
+ from pathlib import Path
+ from dlclibrary import download_huggingface_model
+
+ # Create a folder and download the model to it
+ model_dir = Path("./superanimal_quadruped_model")
+ model_dir.mkdir(exist_ok=True)
+ download_huggingface_model("superanimal_quadruped", model_dir)
+ ```
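
+ Once downloaded, the weights can be used from DeepLabCut itself. A minimal inference sketch (assuming a recent DeepLabCut release; the exact signature of `video_inference_superanimal` varies across versions, and the video path is a placeholder):

+ ```python
+ import deeplabcut
+
+ # Run SuperAnimal-Quadruped on a video; keypoint predictions are
+ # saved alongside the video.
+ deeplabcut.video_inference_superanimal(
+     ["/path/to/video.mp4"],  # placeholder path
+     "superanimal_quadruped",
+     videotype=".mp4",
+ )
+ ```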

+ ## Intended Use

+ • Intended to be used for pose estimation of quadruped images taken from a side view. The model serves as a better starting
+ point than ImageNet weights on downstream datasets such as AP-10K.

+ • Intended for academic and research professionals working in fields related to animal behavior, such as neuroscience
+ and ecology.

+ • Not suitable as a zero-shot model for applications that require high keypoint precision, but it can be fine-tuned with
+ minimal data to reach human-level accuracy. Also not suitable for videos that look dramatically different from those
+ we show in the paper.

+ ## Factors

+ • Based on the known robustness issues of neural networks, the relevant factors include the lighting, contrast, and
+ resolution of the video frames. The presence of objects might also cause false detections and erroneous keypoints.
+ When two or more animals are extremely close, the top-down detector may detect only one animal
+ if used without further fine-tuning or a method such as BUCTD (36).

+ ## Metrics

+ • Mean Average Precision (mAP)
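
+ For reference, keypoint mAP is computed COCO-style over Object Keypoint Similarity (OKS) thresholds. Here is a minimal sketch with pycocotools; the file names are placeholders, and the per-keypoint OKS sigmas for a quadruped skeleton are an assumption you must supply (pycocotools defaults to the 17 human keypoints):

+ ```python
+ import numpy as np
+ from pycocotools.coco import COCO
+ from pycocotools.cocoeval import COCOeval
+
+ # Placeholder files: COCO-format ground truth and model predictions.
+ coco_gt = COCO("quadruped_val_annotations.json")
+ coco_dt = coco_gt.loadRes("model_predictions.json")
+
+ evaluator = COCOeval(coco_gt, coco_dt, iouType="keypoints")
+ # One OKS sigma per keypoint is required; a uniform value is assumed here.
+ evaluator.params.kpt_oks_sigmas = np.full(39, 0.1)  # 39 = keypoint count
+ evaluator.evaluate()
+ evaluator.accumulate()
+ evaluator.summarize()  # prints AP/AR over the standard OKS thresholds
+ ```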

+ ## Evaluation Data

+ • In the paper we benchmark on AP-10K, AnimalPose, Horse-10, and iRodent using a leave-one-out strategy. Here,
+ we provide the model that has been trained on all datasets (see below); it should therefore be considered "fine-tuned"
+ on all the animal training data listed below. This model is meant for production and evaluation in downstream scientific
+ applications.

+ ## Training Data

+ The model was trained jointly on the following datasets:
  - **AwA-Pose** Quadruped dataset, see full details at (1).
  - **AnimalPose** See full details at (2).
 
  - **Horse-30** Horse-30 dataset, benchmark task is called Horse-10; see full details at (4).
  - **StanfordDogs** See full details at (5, 6).
  - **AP-10K** See full details at (7).
+ - **iRodent** We utilized the iNaturalist API functions for scraping observations
  with the taxon ID of Suborder Myomorpha (8). The functions allowed us to filter the large number of observations down to the
  ones with photos under the CC BY-NC creative license. The most common types of rodents from the collected observations are
  Muskrat (Ondatra zibethicus), Brown Rat (Rattus norvegicus), House Mouse (Mus musculus), Black Rat (Rattus rattus), Hispid
  Cotton Rat (Sigmodon hispidus), Meadow Vole (Microtus pennsylvanicus), Bank Vole (Clethrionomys glareolus), Deer Mouse
  (Peromyscus maniculatus), White-footed Mouse (Peromyscus leucopus), Striped Field Mouse (Apodemus agrarius). We then
  generated segmentation masks over target animals in the data by processing the media through an algorithm we designed that
+ uses a Mask Region-Based Convolutional Neural Network (Mask R-CNN) (9) model with a ResNet-50-FPN backbone (10),
+ pretrained on the COCO dataset (11). The processed 443 images were then manually labeled with both pose annotations and
+ segmentation masks. iRodent data is banked at https://zenodo.org/record/8250392. A sketch of this masking step is shown just below.
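
+ As a rough illustration of that mask-generation step (not the exact pipeline from the paper; the score threshold and image path are assumptions), torchvision ships a COCO-pretrained Mask R-CNN with a ResNet-50-FPN backbone:

+ ```python
+ import torch
+ from PIL import Image
+ from torchvision.models.detection import maskrcnn_resnet50_fpn
+ from torchvision.transforms.functional import to_tensor
+
+ # COCO-pretrained Mask R-CNN with a ResNet-50-FPN backbone
+ # (use weights="DEFAULT" on newer torchvision releases).
+ model = maskrcnn_resnet50_fpn(pretrained=True).eval()
+
+ image = to_tensor(Image.open("observation.jpg").convert("RGB"))
+ with torch.no_grad():
+     (output,) = model([image])
+
+ # Keep confident detections; each mask is a soft [1, H, W] map.
+ keep = output["scores"] > 0.5                      # illustrative threshold
+ masks = (output["masks"][keep] > 0.5).squeeze(1)   # boolean masks
+ ```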

+ Here is an image with the keypoint guide:
+ <p align="center">
+ <img src="https://images.squarespace-cdn.com/content/v1/57f6d51c9f74566f55ecf271/1690988780004-AG00N6OU1R21MZ0AU9RE/modelcard-SAQ.png?format=1500w" width="95%">
+ </p>

+ Please note that each dataset was labeled by separate labs & separate individuals; therefore, while we map names
+ to a unified pose vocabulary (found here: https://github.com/AdaptiveMotorControlLab/modelzoo-figures), there will be annotator bias in keypoint placement (see the Supplementary Note on annotator bias in Ye et al. 2023).
  You will also note the dataset is highly diverse across species, but collectively has more representation of domesticated animals like dogs, cats, horses, and cattle.
  If performance is not as good as you need it to be, we recommend first trying video adaptation (see Ye et al. 2023),
+ or fine-tuning these weights with your own labeling; a sketch of video adaptation follows below.
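
+ Video adaptation is exposed through the same `video_inference_superanimal` call sketched earlier, via a `video_adapt` flag in recent DeepLabCut releases (the flag name and defaults may differ across versions):

+ ```python
+ import deeplabcut
+
+ # Self-supervised video adaptation: run inference, then briefly
+ # fine-tune the model on its own confident predictions for this video.
+ deeplabcut.video_inference_superanimal(
+     ["/path/to/your_video.mp4"],  # placeholder path
+     "superanimal_quadruped",
+     video_adapt=True,
+ )
+ ```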

+ ## Ethical Considerations

+ • No experimental data was collected for this model; all datasets used are cited.

+ ## Caveats and Recommendations

+ • The model may have reduced accuracy in scenarios with extremely varied lighting conditions or atypical animal
+ characteristics not well represented in the training data.

+ • Please note that each dataset was labeled by separate labs & separate individuals; therefore, while we map names to a
+ unified pose vocabulary, there will be annotator bias in keypoint placement (see Ye et al. 2023 for our Supplementary
+ Note on annotator bias). You will also note the dataset is highly diverse across species, but collectively has more
+ representation of domesticated animals like dogs, cats, horses, and cattle. If performance is not as good as you need
+ it to be, we recommend first trying video adaptation (see Ye et al. 2023) or fine-tuning these weights with your own
+ labeling.

+ ## License

+ Modified MIT.

+ Copyright 2023 by Mackenzie Mathis, Shaokai Ye, and contributors.

+ Permission is hereby granted to you (hereafter "LICENSEE") a fully-paid, non-exclusive,
+ and non-transferable license for academic, non-commercial purposes only (hereafter "LICENSE")
+ to use the "MODEL" weights (hereafter "MODEL"), subject to the following conditions:

+ The above copyright notice and this permission notice shall be included in all copies or substantial
+ portions of the Software.

+ This software may not be used to harm any animal deliberately.

+ LICENSEE acknowledges that the MODEL is a research tool.
+ THE MODEL IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
+ BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
+ WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE MODEL
+ OR THE USE OR OTHER DEALINGS IN THE MODEL.

+ If this license is not appropriate for your application, please contact Prof. Mackenzie W. Mathis
+ ([email protected]) and/or the TTO office at EPFL ([email protected]) for a commercial use license.

+ Please cite **Ye et al. 2023** if you use this model in your work: https://arxiv.org/abs/2203.07436v2.

+ ## References

  1. Prianka Banik, Lin Li, and Xishuang Dong. A novel dataset for keypoint detection of quadruped animals from images. ArXiv, abs/2108.13958, 2021.
  2. Jinkun Cao, Hongyang Tang, Haoshu Fang, Xiaoyong Shen, Cewu Lu, and Yu-Wing Tai. Cross-domain adaptation for animal pose estimation.
 
  vision, pages 2961–2969, 2017.
  10. Tsung-Yi Lin, Piotr Dollár, Ross Girshick, Kaiming He, Bharath Hariharan, and Serge Belongie. Feature pyramid networks for object detection, 2016.
  11. Tsung-Yi Lin, Michael Maire, Serge J. Belongie, Lubomir D. Bourdev, Ross B. Girshick, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár,
+ and C. Lawrence Zitnick. Microsoft COCO: common objects in context. CoRR, abs/1405.0312, 2014.