TM committed on
Commit
89a71b2
·
verified ·
1 Parent(s): 16d9c68

Update README.md

  1. README.md +128 -2
README.md CHANGED
@@ -6,8 +6,134 @@ colorTo: blue
6
  sdk: streamlit
7
  sdk_version: 1.38.0
8
  app_file: app.py
9
- pinned: false
10
  short_description: OCR and Document Search Web App
 
 
11
  ---
 
12
 
13
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
6
  sdk: streamlit
7
  sdk_version: 1.38.0
8
  app_file: app.py
9
+ pinned: true
10
  short_description: OCR and Document Search Web App
11
+ thumbnail: >-
12
+ https://cdn-uploads.huggingface.co/production/uploads/66f5a2b7e8b5fab5a730b3dd/fj1fjy4CEmOh8xIzY4Xae.jpeg
13
  ---
14
+ # OCR and Document Search Web Application Prototype Using GOT-OCR 2.0
15
 
16
+ This repository contains a **Streamlit**-based web application that uses the **GOT 2.0** model for optical character recognition (OCR) and for searching the extracted text. The application is deployed on Hugging Face Spaces and runs on **CPU**, so processing is slower than it would be on a GPU.
17
+
18
+ ## Features
19
+
20
+ - **OCR:** Upload an image and extract text using the GOT 2.0 model.
21
+ - **Document Search:** Search for specific words or phrases in the extracted text.
22
+ - **Language Support:** Primarily supports English and Chinese, with limited accuracy for Hindi.
23
+ - **Deployment:** Hosted on Hugging Face Spaces and configured for CPU-based execution.
24
+
25
+ ## Table of Contents
26
+
27
+ - [Installation](#installation)
28
+ - [Running Locally](#running-locally)
29
+ - [Deployment](#deployment)
30
+ - [GOT 2.0 Model Details](#got-20-model-details)
31
+ - [Limitations](#limitations)
32
+ - [Usage](#usage)
33
+
34
+ ## Installation
35
+
36
+ Follow these steps to set up the project environment:
37
+
38
+ ### 1. Clone the Repository
39
+
40
+ ```bash
41
+ git clone https://github.com/Tejasva-Maurya/OCR-and-Document-Search-Web-Application-Prototype-Using-GOT-OCR-2.0.git
42
+ cd OCR-and-Document-Search-Web-Application-Prototype-Using-GOT-OCR-2.0
43
+ ```
44
+
45
+ ### 2. Set up a Virtual Environment
46
+
47
+ Create a virtual environment to keep the project dependencies isolated.
48
+
49
+ For Python 3:
50
+
51
+ ```bash
52
+ python3 -m venv venv
53
+ source venv/bin/activate # On Windows use: venv\Scripts\activate
54
+ ```
55
+
56
+ ### 3. Install Dependencies
57
+
58
+ Install the required Python packages from `requirements.txt`:
59
+
60
+ ```bash
61
+ pip install -r requirements.txt
62
+ ```
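
After installation, it can help to confirm that the key packages are actually present in the active virtual environment. The following is a minimal sketch using only the standard library; the package names in `expected` are assumptions (only Streamlit is named in this README), so adjust them to match the project's `requirements.txt`.

```python
from importlib import metadata


def missing_packages(names):
    """Return the distribution names from `names` that are not installed."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumed package list; adjust it to match the project's requirements.txt.
    expected = ["streamlit", "transformers", "torch"]
    print("Missing packages:", missing_packages(expected) or "none")
```

An empty result suggests the environment is ready; any listed name can be installed with `pip` before running the app.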
63
+
64
+ ## Running Locally
65
+
66
+ To run the application locally, follow these steps:
67
+
68
+ ### 1. Run the Streamlit App
69
+
70
+ ```bash
71
+ streamlit run app.py
72
+ ```
73
+
74
+ This will start a local server, and you can access the application in your browser at `http://localhost:8501`.
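
To check from a script that the local server is up, something like the following sketch may work. It assumes the `/_stcore/health` endpoint that recent Streamlit releases expose; the function name `is_app_running` is hypothetical and not part of this project.

```python
import urllib.error
import urllib.request


def is_app_running(base_url="http://localhost:8501", timeout=2.0):
    """Return True if a server at `base_url` answers the health check with HTTP 200."""
    # Assumed endpoint: recent Streamlit releases expose /_stcore/health.
    url = base_url.rstrip("/") + "/_stcore/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False
```

Calling `is_app_running()` with no arguments probes the default address mentioned above and returns `False` if nothing is listening there.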
75
+
76
+ ### 2. Upload Images and Perform OCR
77
+
78
+ Once the app is running, upload an image (e.g., JPG/JPEG or PNG), and the OCR model will extract the text. You can then search for specific words or phrases within the extracted text.
79
+
80
+ ### 3. Project Structure
81
+
82
+ - `app.py`: Main application file.
83
+ - `images/`: Directory where uploaded images are saved.
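
As an illustration of how uploaded images could end up in `images/`, here is a minimal sketch. It is not taken from `app.py`; the helper name `save_upload` is hypothetical, and it assumes the upload is available as raw bytes plus a filename.

```python
from pathlib import Path


def save_upload(data: bytes, filename: str, dest_dir: str = "images") -> Path:
    """Write uploaded bytes into `dest_dir` and return the saved path."""
    target_dir = Path(dest_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    # Keep only the final path component so a crafted name cannot escape dest_dir.
    safe_name = Path(filename).name
    if safe_name in ("", ".."):
        raise ValueError("filename must name a file")
    target = target_dir / safe_name
    target.write_bytes(data)
    return target
```

In a Streamlit app, the object returned by `st.file_uploader` provides `.name` and `.getvalue()`, so a call might look like `save_upload(uploaded.getvalue(), uploaded.name)`.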
84
+
85
+ ## Deployment
86
+
87
+ The application is deployed on **Hugging Face Spaces** using **Streamlit**. Because of **GPU-related complexities on Hugging Face**, it runs on **CPU** by default. OCR processing is slower as a result, but this avoids potential issues related to GPU execution.
88
+
89
+ ### 1. Set up Hugging Face Spaces
90
+
91
+ - Create a new Space on [Hugging Face Spaces](https://huggingface.co/spaces).
92
+ - Select **Streamlit** as the SDK for deployment.
93
+
94
+ ### 2. Upload Your Code
95
+
96
+ Push your code to the Hugging Face Spaces repository you just created.
97
+
98
+ ```bash
99
+ git add .
100
+ git commit -m "Initial commit"
101
+ git push origin main
102
+ ```
103
+
104
+ ### 3. Configure the `requirements.txt`
105
+
106
+ Make sure `requirements.txt` lists every dependency the app needs, since Hugging Face Spaces installs packages from this file when it builds the app.
107
+
108
+ ### 4. Deploy
109
+
110
+ After pushing the code, Hugging Face Spaces will automatically build and deploy your app. You can access the app through the Hugging Face URL provided.
111
+
112
+ ## GOT 2.0 Model Details
113
+
114
+ The GOT 2.0 model is an advanced OCR model initially developed for **English** and **Chinese** text. The model uses a combination of pre-trained transformers and optimized tokenization techniques to extract text from images efficiently.
115
+
116
+ - **Pre-trained on English and Chinese:** While the model performs well for these languages, it may exhibit reduced accuracy when used for other languages like Hindi.
117
+ - **CPU Performance:** The model is optimized for **GPU**, but this application runs it on **CPU** due to Hugging Face limitations. This results in slower inference times.
118
+ - **Hugging Face Model:** The application loads the model from the Hugging Face repository: `srimanth-d/GOT_CPU`.
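
Loading the model on CPU might look roughly like the sketch below. It assumes that `srimanth-d/GOT_CPU` follows the same `transformers` interface as the upstream GOT-OCR 2.0 model card (custom code loaded with `trust_remote_code=True` and a `chat()` method with `ocr_type="ocr"`); the function names are hypothetical and the actual `app.py` may differ.

```python
def load_got_model(model_id="srimanth-d/GOT_CPU"):
    """Load the tokenizer and model on CPU (downloads weights on first use)."""
    # Imported lazily so that importing this module does not require transformers.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModel.from_pretrained(
        model_id,
        trust_remote_code=True,
        low_cpu_mem_usage=True,
        use_safetensors=True,
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer, model.eval()


def run_ocr(image_path, tokenizer, model):
    """Return the text extracted from the image at `image_path`."""
    # Assumption: the remote model code exposes chat(), as the upstream model card shows.
    return model.chat(tokenizer, image_path, ocr_type="ocr")
```

Loading the model once and reusing it across uploads avoids paying the load cost on every request, which matters more on CPU.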
119
+
120
+ ## Limitations
121
+
122
+ ### 1. **CPU Processing**
123
+ - The application runs on CPU, which may result in longer processing times. GPU deployment is not feasible due to Hugging Face constraints on GPU support.
124
+
125
+ ### 2. **Hindi Language Support**
126
+ - The GOT 2.0 model was not specifically trained for Hindi, so OCR accuracy for Hindi text is **limited**. It works best on images containing **English** or **Chinese** text.
127
+
128
+ ### 3. **Model Size and Performance**
129
+ - The GOT 2.0 model is resource-intensive, and on CPU in particular it may take additional time to process larger images or images with complex text.
130
+
131
+ ## Usage
132
+
133
+ 1. Open the app and upload an image to perform OCR.
134
+ 2. After the text is extracted, use the search box to find specific words or phrases in the document.
135
+ 3. The app will highlight the search terms in the extracted text and indicate whether the term was found.
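
The search-and-highlight step described above can be sketched as follows. This is an illustrative assumption about the behavior, not the code in `app.py`: it performs a case-insensitive literal search and wraps each match in Markdown bold markers, and the name `highlight_matches` is hypothetical.

```python
import re


def highlight_matches(text, term):
    """Return (highlighted_text, match_count) for a case-insensitive literal search."""
    if not term:
        return text, 0
    # re.escape treats the search term as plain text rather than a regex.
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    highlighted, count = pattern.subn(lambda m: f"**{m.group(0)}**", text)
    return highlighted, count


if __name__ == "__main__":
    result, hits = highlight_matches("Invoice total: 42. invoice paid.", "invoice")
    print(result)
    print("Found" if hits else "Not found", f"({hits} match(es))")
```

The match count doubles as the "found / not found" signal: zero means the term does not appear in the extracted text.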
136
+
137
+ ---
138
+
139
+ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference