nishantkaushik20 committed
Commit · bc3ec38
Parent(s): c732d93

Upload 15 files

- .gitignore +164 -0
- README.md +5 -5
- Readme copy.md +122 -0
- app.py +95 -0
- images/1.jpeg +0 -0
- images/2.jpeg +0 -0
- images/3.jpeg +0 -0
- images/4.jpeg +0 -0
- images/5.jpeg +0 -0
- models/pose_classification.pth +3 -0
- requirements.txt +6 -0
- src/__init__.py +0 -0
- src/classification_keypoint.py +54 -0
- src/detection_keypoint.py +86 -0
.gitignore
ADDED
@@ -0,0 +1,164 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

*.jpg
*.pt
huggingface
README.md
CHANGED
@@ -1,10 +1,10 @@
 ---
-title:
-emoji:
-colorFrom:
-colorTo:
+title: YoloV8 Pose Keypoint Classification
+emoji: 👀
+colorFrom: purple
+colorTo: purple
 sdk: streamlit
-sdk_version: 1.
+sdk_version: 1.21.0
 app_file: app.py
 pinned: false
 ---
Readme copy.md
ADDED
@@ -0,0 +1,122 @@
### Introduction

Pose estimation is a task that involves identifying the location of specific points in an image, usually referred to as keypoints. The keypoints can represent various parts of the object such as joints, landmarks, or other distinctive features. The locations of the keypoints are usually represented as a set of 2D `[x, y]` or 3D `[x, y, visible]` coordinates.

The output of a pose estimation model is a set of points that represent the keypoints on an object in the image, usually along with the confidence scores for each point. Pose estimation is a good choice when you need to identify specific parts of an object in a scene, and their location in relation to each other.

### YOLOv8 Pose

How to use YOLOv8 pretrained Pose models?

```python
from ultralytics import YOLO

# Load a model
model = YOLO('yolov8n-pose.pt')

# Predict with the model
results = model('https://ultralytics.com/images/bus.jpg')

# Extract normalized keypoints of the first detected person
result_keypoint = results[0].keypoints.xyn.cpu().numpy()[0]
```

### Exploring Output Keypoint

![](https://cdn-images-1.medium.com/max/800/1*PM5Q-58eNOWdoLogVKCGnQ.png)

source: https://learnopencv.com/wp-content/uploads/2021/05/fix-overlay-issue.jpg

In the output of YOLOv8 pose estimation, there are no keypoint names. Here's a sample output:

![](https://cdn-images-1.medium.com/max/800/1*Om_wkVg8tv0ou1BN1tQl_Q.png)

To obtain the x, y coordinates by keypoint name, you can create a Pydantic class whose attributes are the keypoint names and whose values are the indices of those keypoints in the YOLOv8 output.

```python
from pydantic import BaseModel

class GetKeypoint(BaseModel):
    NOSE: int = 0
    LEFT_EYE: int = 1
    RIGHT_EYE: int = 2
    LEFT_EAR: int = 3
    RIGHT_EAR: int = 4
    LEFT_SHOULDER: int = 5
    RIGHT_SHOULDER: int = 6
    LEFT_ELBOW: int = 7
    RIGHT_ELBOW: int = 8
    LEFT_WRIST: int = 9
    RIGHT_WRIST: int = 10
    LEFT_HIP: int = 11
    RIGHT_HIP: int = 12
    LEFT_KNEE: int = 13
    RIGHT_KNEE: int = 14
    LEFT_ANKLE: int = 15
    RIGHT_ANKLE: int = 16

# example
get_keypoint = GetKeypoint()
nose_x, nose_y = result_keypoint[get_keypoint.NOSE]
left_eye_x, left_eye_y = result_keypoint[get_keypoint.LEFT_EYE]
```

### Generate Dataset Keypoint

To classify keypoints, you need to create a keypoint dataset. I used images from the public Kaggle dataset [yoga-pose-classification](https://www.kaggle.com/datasets/ujjwalchowdhury/yoga-pose-classification), which has 5 classes: Downdog, Goddess, Plank, Tree, and Warrior2. I ran YOLOv8 pose estimation on each image, extracted the x, y coordinates of each body keypoint, and saved them in CSV format (a sketch of this step follows the sample images below).

![](https://cdn-images-1.medium.com/max/800/1*SBXgggGPWHPnoVCz9pLV6Q.png)
column of dataset

![](https://cdn-images-1.medium.com/max/800/1*KMwo_Htgmmi0DAJFLRZY3g.png)
sample dataset
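As a rough sketch of this generation step (the folder layout, the use of pandas, and the output file name are assumptions; only `DetectKeypoint` and `get_xy_keypoint` come from this repo), the CSV could be built like this:

```python
# Illustrative only: walk class-named image folders, run pose detection,
# and store the 34 x,y keypoint values plus the class label per row.
import glob
import os

import cv2
import pandas as pd

from src.detection_keypoint import DetectKeypoint

detector = DetectKeypoint()
rows = []
for path in glob.glob('./yoga-pose-classification/*/*.jpg'):   # assumed layout
    label = os.path.basename(os.path.dirname(path))            # folder name = class
    image = cv2.imread(path)
    results = detector(image)
    if len(results.keypoints.xyn) == 0:                         # no person detected
        continue
    keypoints = detector.get_xy_keypoint(results)               # 17 keypoints -> 34 values
    rows.append(keypoints + [label])

pd.DataFrame(rows).to_csv('keypoint_dataset.csv', index=False)  # assumed file name
```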
### Train Classification

Let's proceed with training a multi-class classification model for keypoints using the PyTorch library for neural networks.

```python
class NeuralNet(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super(NeuralNet, self).__init__()
        self.l1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.l2 = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        out = self.l1(x)
        out = self.relu(out)
        out = self.l2(out)
        return out

hidden_size = 256
model = NeuralNet(X_train.shape[1], hidden_size, len(class_weights))
```

The neural network architecture consists of two linear layers and a ReLU activation function:

* `self.l1 = nn.Linear(input_size, hidden_size)`: The first linear layer, which takes the input features and maps them to the hidden layer.
* `self.relu = nn.ReLU()`: The activation function, which applies element-wise rectified linear unit (ReLU) activation to introduce non-linearity.
* `self.l2 = nn.Linear(hidden_size, num_classes)`: The second linear layer, which maps the hidden layer to the output classes.

`forward(self, x)`: this method defines the forward pass of the neural network. It takes an input tensor `x` and returns the output tensor. The forward pass involves passing the input through the defined layers in sequence and returning the final output.
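To make the shapes concrete, here is a small illustrative check of that forward pass, assuming 24 input features per sample (this matches the `input_size = 24` default in `src/classification_keypoint.py`) and the 5 yoga classes:

```python
import torch

# Assumed sizes: 24 keypoint coordinates in, 5 class scores out.
net = NeuralNet(input_size=24, hidden_size=256, num_classes=5)
dummy_batch = torch.randn(8, 24)   # batch of 8 keypoint vectors
logits = net(dummy_batch)
print(logits.shape)                # torch.Size([8, 5]): one raw score per class
```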
```python
learning_rate = 0.01
criterion = nn.CrossEntropyLoss(weight=torch.from_numpy(class_weights.astype(np.float32)))
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
```

In this code, `learning_rate` is set to 0.01, which controls the step size during optimization. The `CrossEntropyLoss` criterion is used for multi-class classification, and the `weight` parameter is set to the class weights converted to a PyTorch tensor. This allows for handling class imbalance if present in the dataset.

The optimizer is defined as the Adam optimizer, which is a popular optimization algorithm for neural networks. It takes `model.parameters()` as input, which specifies the parameters of the model to be optimized. The `lr` parameter sets the learning rate for the optimizer.
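The training loop itself is not shown in this commit; a minimal sketch of what it could look like with `model`, `criterion`, and `optimizer` defined above (the epoch count and the `X_train`/`y_train` tensors are assumptions):

```python
# Illustrative full-batch training loop; num_epochs, X_train and y_train are assumed.
num_epochs = 100
for epoch in range(num_epochs):
    outputs = model(X_train)            # forward pass over the keypoint features
    loss = criterion(outputs, y_train)  # weighted cross-entropy

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if (epoch + 1) % 10 == 0:
        print(f'epoch {epoch + 1}/{num_epochs}, loss {loss.item():.4f}')
```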
### Training Keypoint Result

The results are quite good for a simple Neural Network and the given dataset size, with an accuracy above 90%.

![](https://cdn-images-1.medium.com/max/800/1*2cgx6lQExwpRBL8FuWknoQ.png)
app.py
ADDED
@@ -0,0 +1,95 @@
# Import library
import cv2
import glob
import numpy as np
from PIL import Image
import streamlit as st

from src.detection_keypoint import DetectKeypoint
from src.classification_keypoint import KeypointClassification

detection_keypoint = DetectKeypoint()
classification_keypoint = KeypointClassification(
    './models/pose_classification.pth'
)

def pose_classification(img, col=None):
    image = Image.open(img)
    image_cv = cv2.cvtColor(np.array(image), cv2.COLOR_RGB2BGR)

    image_rgb = cv2.cvtColor(image_cv, cv2.COLOR_BGR2RGB)
    # show image col 1
    col1.write("Original Image :")
    col1.image(image_rgb)

    # detection keypoint
    results = detection_keypoint(image_cv)
    results_keypoint = detection_keypoint.get_xy_keypoint(results)

    # classification keypoint
    input_classification = results_keypoint[10:]
    results_classification = classification_keypoint(input_classification)

    # visualize result
    image_draw = results.plot(boxes=False)
    x_min, y_min, x_max, y_max = results.boxes.xyxy[0].numpy()
    image_draw = cv2.rectangle(
        image_draw,
        (int(x_min), int(y_min)), (int(x_max), int(y_max)),
        (0, 0, 255), 2
    )
    (w, h), _ = cv2.getTextSize(
        results_classification.upper(),
        cv2.FONT_HERSHEY_SIMPLEX, 0.5, 2
    )
    image_draw = cv2.rectangle(
        image_draw,
        (int(x_min), int(y_min)-20), (int(x_min)+w, int(y_min)),
        (0, 0, 255), -1
    )
    image_draw = cv2.putText(image_draw,
        f'{results_classification.upper()}',
        (int(x_min), int(y_min)-4),
        cv2.FONT_HERSHEY_SIMPLEX,
        0.5, (255, 255, 255),
        thickness=2
    )
    image_draw = cv2.cvtColor(image_draw, cv2.COLOR_BGR2RGB)
    col2.write("Keypoint Result :wrench:")
    col2.image(image_draw)
    col2.text(f'Pose Classification : {results_classification}')
    return image_draw, results_classification

st.set_page_config(
    layout="wide",
    page_title="YoloV8 Keypoint Classification"
)
st.write(
    "## YoloV8 Keypoint Yoga Pose Classification"
)
st.write(
    ":dog: Try uploading an image to Classification Yoga Basic Pose like a Downdog, Goddess, Plank, Tree, Warrior2 :grin:"
)
st.sidebar.write(
    "## Upload Image :gear:"
)

col1, col2 = st.columns(2)
img_upload = st.sidebar.file_uploader("Upload an image", type=["png", "jpg", "jpeg"])

if img_upload is not None:
    pose_classification(img=img_upload)

# show sample image
st.write('## Sample Image')
images = glob.glob('./images/*.jpeg')
row_size = len(images)
grid = st.columns(row_size)
col = 0
for image in images:
    with grid[col]:
        st.image(f'{image}')
        st.button(label='RUN', key=f'run_{image}',
                  on_click=pose_classification, args=(image, 'run'))
    col = (col + 1) % row_size
images/1.jpeg
ADDED
images/2.jpeg
ADDED
images/3.jpeg
ADDED
images/4.jpeg
ADDED
images/5.jpeg
ADDED
models/pose_classification.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c8c94aa11d6fb05e0b351fa8cb3f705ee7dc4c8ba6cc2f0c87df5a97ea55e2ea
size 32159
requirements.txt
ADDED
@@ -0,0 +1,6 @@
opencv-python
torch
streamlit
pydantic
ultralytics
numpy
src/__init__.py
ADDED
File without changes
src/classification_keypoint.py
ADDED
@@ -0,0 +1,54 @@
import torch
import torch.nn as nn

class NeuralNet(nn.Module):
    def __init__(
        self,
        input_size = 24,
        hidden_size = 256,
        num_classes = 5
    ):
        super(NeuralNet, self).__init__()
        self.l1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.l2 = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        out = self.l1(x)
        out = self.relu(out)
        out = self.l2(out)
        return out

class KeypointClassification:
    def __init__(self, path_model):
        self.path_model = path_model
        self.classes = ['Downdog', 'Goddess', 'Plank', 'Tree', 'Warrior2']
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self.load_model()

    def load_model(self):
        self.model = NeuralNet()
        self.model.load_state_dict(
            torch.load(self.path_model, map_location=self.device)
        )

    def __call__(self, input_keypoint):
        if not type(input_keypoint) == torch.Tensor:
            input_keypoint = torch.tensor(
                input_keypoint, dtype=torch.float32
            )
        out = self.model(input_keypoint)
        _, predict = torch.max(out, -1)
        label_predict = self.classes[predict]
        return label_predict

if __name__ == '__main__':
    keypoint_classification = KeypointClassification(
        path_model='/Users/alimustofa/Me/source-code/AI/YoloV8_Pose_Classification/models/pose_classification.pt'
    )
    dummy_input = torch.randn(23)
    classification = keypoint_classification(dummy_input)
    print(classification)
src/detection_keypoint.py
ADDED
@@ -0,0 +1,86 @@
import sys
import cv2
import numpy as np
from pydantic import BaseModel

import ultralytics
from ultralytics.yolo.engine.results import Results

# Define keypoint
class GetKeypoint(BaseModel):
    NOSE: int = 0
    LEFT_EYE: int = 1
    RIGHT_EYE: int = 2
    LEFT_EAR: int = 3
    RIGHT_EAR: int = 4
    LEFT_SHOULDER: int = 5
    RIGHT_SHOULDER: int = 6
    LEFT_ELBOW: int = 7
    RIGHT_ELBOW: int = 8
    LEFT_WRIST: int = 9
    RIGHT_WRIST: int = 10
    LEFT_HIP: int = 11
    RIGHT_HIP: int = 12
    LEFT_KNEE: int = 13
    RIGHT_KNEE: int = 14
    LEFT_ANKLE: int = 15
    RIGHT_ANKLE: int = 16

class DetectKeypoint:
    def __init__(self, yolov8_model='yolov8m-pose'):
        self.yolov8_model = yolov8_model
        self.get_keypoint = GetKeypoint()
        self.__load_model()

    def __load_model(self):
        if not self.yolov8_model.split('-')[-1] == 'pose':
            sys.exit('Model not yolov8 pose')
        self.model = ultralytics.YOLO(model=self.yolov8_model)

    # extract function keypoint
    def extract_keypoint(self, keypoint: np.ndarray) -> list:
        # nose
        nose_x, nose_y = keypoint[self.get_keypoint.NOSE]
        # eye
        left_eye_x, left_eye_y = keypoint[self.get_keypoint.LEFT_EYE]
        right_eye_x, right_eye_y = keypoint[self.get_keypoint.RIGHT_EYE]
        # ear
        left_ear_x, left_ear_y = keypoint[self.get_keypoint.LEFT_EAR]
        right_ear_x, right_ear_y = keypoint[self.get_keypoint.RIGHT_EAR]
        # shoulder
        left_shoulder_x, left_shoulder_y = keypoint[self.get_keypoint.LEFT_SHOULDER]
        right_shoulder_x, right_shoulder_y = keypoint[self.get_keypoint.RIGHT_SHOULDER]
        # elbow
        left_elbow_x, left_elbow_y = keypoint[self.get_keypoint.LEFT_ELBOW]
        right_elbow_x, right_elbow_y = keypoint[self.get_keypoint.RIGHT_ELBOW]
        # wrist
        left_wrist_x, left_wrist_y = keypoint[self.get_keypoint.LEFT_WRIST]
        right_wrist_x, right_wrist_y = keypoint[self.get_keypoint.RIGHT_WRIST]
        # hip
        left_hip_x, left_hip_y = keypoint[self.get_keypoint.LEFT_HIP]
        right_hip_x, right_hip_y = keypoint[self.get_keypoint.RIGHT_HIP]
        # knee
        left_knee_x, left_knee_y = keypoint[self.get_keypoint.LEFT_KNEE]
        right_knee_x, right_knee_y = keypoint[self.get_keypoint.RIGHT_KNEE]
        # ankle
        left_ankle_x, left_ankle_y = keypoint[self.get_keypoint.LEFT_ANKLE]
        right_ankle_x, right_ankle_y = keypoint[self.get_keypoint.RIGHT_ANKLE]

        return [
            nose_x, nose_y, left_eye_x, left_eye_y, right_eye_x, right_eye_y,
            left_ear_x, left_ear_y, right_ear_x, right_ear_y, left_shoulder_x, left_shoulder_y,
            right_shoulder_x, right_shoulder_y, left_elbow_x, left_elbow_y, right_elbow_x, right_elbow_y,
            left_wrist_x, left_wrist_y, right_wrist_x, right_wrist_y, left_hip_x, left_hip_y,
            right_hip_x, right_hip_y, left_knee_x, left_knee_y, right_knee_x, right_knee_y,
            left_ankle_x, left_ankle_y, right_ankle_x, right_ankle_y
        ]

    def get_xy_keypoint(self, results: Results) -> list:
        result_keypoint = results.keypoints.xyn.cpu().numpy()[0]
        keypoint_data = self.extract_keypoint(result_keypoint)
        return keypoint_data

    def __call__(self, image: np.array) -> Results:
        results = self.model.predict(image, save=False)[0]
        return results