ImageDataExtractor3

WebashalarForML committed on Oct 4, 2024

Commit

06f01a0

1 Parent(s): 49b2f52

Create README2.md

README2.md +129 -0

	@@ -0,0 +1,129 @@

+_\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\_
+_\\-------- **Image Data Extractor** -------\\_
+_\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\_
+# Overview:
+The **Image Data Extractor** is a Python tool that uses **PaddleOCR** to extract text from images of visiting cards and structures it into well-defined fields: name, designation, contact number, address, and company name. The **Mistral 7B model** performs the text analysis; if it is unavailable, the system falls back to the **Gliner urchade/gliner_mediumv2.1** model.
+# Installation Guide:
+1. **Create and Activate a Virtual Environment**
+    ```bash
+    python -m venv venv
+    source venv/bin/activate  # For Linux/Mac
+    # or
+    venv\Scripts\activate  # For Windows
+    ```
+2. **Install Required Libraries**
+    ```bash
+    pip install -r requirements.txt
+    ```
+3. **Set up Hugging Face Token**
+    - Add your Hugging Face token in the `.env` file:
+    ```bash
+    HF_TOKEN=<your_huggingface_token>
+    ```
+4. **Run the Application**
+    - If Docker is being used:
+    ```bash
+    docker-compose up --build
+    ```
+    - Without Docker:
+    ```bash
+    python app.py
+    ```
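For the Hugging Face token step, the sketch below shows one way the token could be picked up from the `.env` file at startup. It is an illustration only: the helper names `load_env_file` and `get_hf_token` are hypothetical, and the README does not say how `app.py` actually reads the file (a package such as `python-dotenv` may be used instead).

```python
import os
from pathlib import Path
from typing import Dict


def load_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse simple KEY=VALUE lines and export them to os.environ.

    Hypothetical helper: values already set in the environment are not overridden.
    """
    env_path = Path(path)
    if not env_path.is_file():
        return {}
    loaded: Dict[str, str] = {}
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip("'\"")
        loaded[key] = value
        os.environ.setdefault(key, value)
    return loaded


def get_hf_token() -> str:
    """Return HF_TOKEN from the environment, failing early if it is missing."""
    token = os.environ.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is not set; add it to .env or the environment")
    return token
```

Calling `load_env_file()` once before `get_hf_token()` would surface a missing token at startup rather than on the first model request.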
+# File Structure Overview:
+```
+ImageDataExtractor/
+│
+├── app.py                       # Main Flask app
+├── requirements.txt             # Dependencies
+├── Dockerfile                   # Docker container setup
+├── docker-compose.yml           # Docker Compose setup
+│
+├── utility/
+│   └── utils.py                 # PaddleOCR integration, image preprocessing, and Mistral model processing
+│
+├── template/
+│   ├── index.html               # UI for image uploads
+│   └── result.html              # Display extracted results
+│
+├── Backup/
+│   ├── modules/                 # Base classes for data processing models
+│   │   ├── base.py
+│   │   ├── data_proc.py
+│   │   ├── evaluator.py
+│   │   ├── layers.py
+│   │   ├── run_evaluation.py
+│   │   ├── span_rep.py
+│   │   └── token_rep.py
+│   ├── backup.py                # Backup handling
+│   ├── model.py                 # Gliner Model integration and backup logic
+│   ├── save_load.py             # Mistral 7B model integration and backup logic
+│   └── train.py                 # Mistral 7B model integration and backup logic
+│
+└── .env                         # Environment variables (includes Hugging Face token)
+```
+# Program Overview:
+### PaddleOCR Integration (utility/utils.py):
+- **Text Extraction**: The tool uses **PaddleOCR** to extract text from images of visiting cards (PNG, JPG, JPEG).
+- **Preprocessing**: Applies basic image preprocessing before OCR to improve text recognition.
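As a rough sketch of the input and output handling around the OCR step, the snippet below checks the file extension against the formats listed above and flattens an OCR result into plain text lines. The function names and the `0.5` confidence threshold are hypothetical, and the result layout is an assumption based on the nested list returned by PaddleOCR 2.x's `ocr()` method; the actual code in `utility/utils.py` may differ.

```python
from pathlib import Path
from typing import List, Optional, Sequence

# Formats listed in this README; the constant name is illustrative.
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def is_supported_image(filename: str) -> bool:
    """Return True if the file extension is one of the listed image formats."""
    return Path(filename).suffix.lower() in ALLOWED_EXTENSIONS


def lines_from_ocr_result(result: Optional[Sequence], min_confidence: float = 0.5) -> List[str]:
    """Flatten an OCR result into text lines, dropping low-confidence entries.

    Assumes the PaddleOCR 2.x layout: one list per image, each entry being
    [bounding_box, (text, confidence)]. A page with no detected text may be None.
    """
    lines: List[str] = []
    for page in result or []:
        for _box, (text, confidence) in page or []:
            if confidence >= min_confidence and text.strip():
                lines.append(text.strip())
    return lines
```

Keeping the flattening step separate from the OCR call makes it possible to exercise the downstream structuring logic with hand-written results, without loading the OCR model.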
+### Mistral 7B Integration (utility/utils.py):
+- **Data Structuring**: After text extraction, the **Mistral 7B model** processes the extracted data, structuring it into fields such as name, designation, contact number, address, and company name.
+### Fallback Mechanism (Backup/backup.py):
+- **Gliner urchade/gliner_mediumv2.1 Model**: If the Mistral model is unavailable, the system uses the **Gliner urchade/gliner_mediumv2.1 model** to perform the same structuring task, so requests are still served.
+- **Error Handling**: Detects model-availability failures and hands the request over to the fallback model.
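The fallback behaviour described in this section can be sketched as a loop over extractors in priority order. This is a minimal illustration, not the code in `Backup/backup.py`: `extract_with_fallback` is a hypothetical name, and the extractors are plain callables standing in for the Mistral 7B and Gliner calls.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# An extractor takes the OCR text and returns structured fields.
Extractor = Callable[[str], Dict[str, List[str]]]


def extract_with_fallback(
    text: str, extractors: Sequence[Tuple[str, Extractor]]
) -> Tuple[str, Dict[str, List[str]]]:
    """Try each (name, extractor) pair in order and return the first success.

    Returns the name of the extractor that succeeded together with its output.
    Raises RuntimeError listing every failure if none of them succeeds.
    """
    errors: List[str] = []
    for name, extractor in extractors:
        try:
            return name, extractor(text)
        except Exception as exc:  # broad on purpose: any failure triggers the fallback
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all extractors failed: " + "; ".join(errors))
```

Returning the extractor name alongside the result lets the caller log or display which model produced the output, which helps when comparing primary and fallback results.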
+### Web Interface (app.py):
+- **Flask API**: Provides endpoints for image uploads and displays the results in a structured manner.
+- **HTML Interface**: A frontend for users to upload images of visiting cards and view the parsed results.
+# Tree Map of the Program:
+```
+app.py
+├── Handles Flask API and web interface
+├── Manages file upload
+├── Extracts text with PaddleOCR
+├── Processes text with Mistral 7B
+└── Displays structured results
+utility/utils.py
+├── PaddleOCR for text extraction
+└── Mistral 7B for data structuring
+Backup/backup.py
+├── Gliner urchade/gliner_mediumv2.1 as fallback
+└── Backup and error handling
+Backup/model.py
+└── Gliner model integration and processing logic
+```
+# Main Task:
+The main objective is to extract and structure text data from visiting cards. The system identifies and organizes:
+- **Name**
+- **Designation**
+- **Phone Number**
+- **Address**
+- **Company Name**
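One possible shape for the structured output is sketched below. The `CardData` class and its field names are hypothetical, chosen to mirror the list above; each field is a list on the assumption that a card may carry several values (for example, two phone numbers). The format the application actually returns is not specified in this README.

```python
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class CardData:
    """Hypothetical container for the fields extracted from a visiting card."""

    name: List[str] = field(default_factory=list)
    designation: List[str] = field(default_factory=list)
    phone_number: List[str] = field(default_factory=list)
    address: List[str] = field(default_factory=list)
    company_name: List[str] = field(default_factory=list)

    def to_dict(self) -> Dict[str, List[str]]:
        """Return the fields as a plain dict, e.g. for JSON or template rendering."""
        return asdict(self)

    def missing_fields(self) -> List[str]:
        """List the fields for which nothing was extracted."""
        return [key for key, values in asdict(self).items() if not values]
```

For example, `CardData(name=["John Doe"]).missing_fields()` lists every field except `name`, which could be used to flag incomplete extractions in the results page.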
+# References:
+- [PaddleOCR Documentation](https://github.com/PaddlePaddle/PaddleOCR)
+- [Mistral 7B Documentation](https://huggingface.co/)
+- [Gliner urchade/gliner_mediumv2.1 Documentation](https://huggingface.co/)
+- [Flask Documentation](https://flask.palletsprojects.com/)
+- [Docker Documentation](https://docs.docker.com/)
+- [Virtual Environments in Python](https://docs.python.org/3/tutorial/venv.html)
+---