<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>SpaCy NER Training Guide</title>
<link
rel="stylesheet"
href="https://maxcdn.bootstrapcdn.com/bootstrap/4.5.2/css/bootstrap.min.css"
/>
<style>
body {
background-color: #121212;
font-family: "Poppins", sans-serif;
color: #e0e0e0;
margin: 0;
padding: 0;
}
h1,
h2 {
color: #007bff;
}
.step {
margin-bottom: 30px;
border: 1px solid #007bff;
border-radius: 5px;
padding: 20px;
background-color: #1e1e1e;
}
.btn-primary {
color: #fff;
background-color: #007bff;
border: 1px solid #007bff;
}
.btn-primary:hover {
background-color: transparent;
border: 1px solid #007bff;
}
</style>
</head>
<body>
<div class="container">
<h1>SpaCy NER Model Training Guide</h1>
<div class="step">
<h2>Step 1: Upload Your Resume File</h2>
<p>
Upload a resume or document file for text extraction. Supported
formats include:
</p>
<ul>
<li>PDF</li>
<li>DOCX (Word Document)</li>
<li>RSF (Rich Structured Format)</li>
<li>ODT (Open Document Text)</li>
<li>PNG, JPG, JPEG (Image Formats)</li>
<li>JSON</li>
</ul>
<p>
Ensure that your file is in one of the supported formats before
uploading. The system will extract and process the text from your
document automatically.
</p>
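<p>
As an illustration of the format check described above, here is a minimal
sketch of an extension filter. The extension set is inferred from the list in
this step, and the names <code>SUPPORTED_EXTENSIONS</code> and
<code>is_supported</code> are hypothetical; the application's real allow-list
may differ.
</p>

```python
from pathlib import Path

# Assumption: extensions inferred from the formats listed in this step.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".docx", ".rsf", ".odt",
    ".png", ".jpg", ".jpeg", ".json",
}


def is_supported(filename: str) -> bool:
    """Return True when the file extension is in the supported set."""
    # suffix is the last extension, including the dot; compare case-insensitively.
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

<p>
Checking a file name locally before uploading, for example
<code>is_supported("resume.PDF")</code>, can save a failed upload.
</p>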
<a href="{{ url_for('index') }}" class="btn btn-primary"
>Proceed to Upload</a
>
</div>
<div class="step">
<h2>Step 2: Preview and Edit Extracted Text</h2>
<p>
After uploading your document, you will be shown a preview of the
extracted text. This preview allows you to edit the text if needed to
correct any extraction errors or remove unwanted content. Once you're
satisfied, click "Next" to proceed to Named Entity Recognition (NER)
annotations.
</p>
<a href="{{ url_for('text_preview') }}" class="btn btn-primary"
>Proceed to Text Preview</a
>
</div>
<div class="step">
<h2>Step 3: Annotate Named Entities</h2>
<p>
In this step, you will review the Named Entity Recognition (NER) results
generated from your text. You can add new entity labels, select the text
that belongs to each label, and make manual adjustments. Once you have
annotated the text with the appropriate labels, save your annotations and
export the data in JSON format for model training.
</p>
<p>
Note: the following labels are available: ABOUT, CERTIFICATE, COMPANY,
CONTACT, COURSE, DOB, EMAIL, EXPERIENCE, HOBBIES, INSTITUTE, JOB_TITLE,
LANGUAGE, LAST_QUALIFICATION_YEAR, LINK, LOCATION, PERSON, PROJECTS,
QUALIFICATION, SCHOOL, SKILL, SOFT_SKILL, UNIVERSITY, YEARS_EXPERIENCE.
</p>
<p>Instructions:</p>
<ul>
<li>Click "Begin!" to load the extracted text.</li>
<li>
Highlight sections of the text and assign them to the available
labels.
</li>
<li>Add new labels if necessary.</li>
<li>
Once done, click "Export" to download your annotations as a JSON
file.
</li>
</ul>
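<p>
An annotation is commonly stored as a pair of character offsets plus a label.
The sketch below assumes that representation and shows a sanity check you
could run on an exported span. The function name
<code>validate_annotation</code> and the offset convention (start inclusive,
end exclusive) are assumptions, not the tool's documented behavior.
</p>

```python
# Labels copied from the list in this step.
SUGGESTED_LABELS = {
    "ABOUT", "CERTIFICATE", "COMPANY", "CONTACT", "COURSE", "DOB", "EMAIL",
    "EXPERIENCE", "HOBBIES", "INSTITUTE", "JOB_TITLE", "LANGUAGE",
    "LAST_QUALIFICATION_YEAR", "LINK", "LOCATION", "PERSON", "PROJECTS",
    "QUALIFICATION", "SCHOOL", "SKILL", "SOFT_SKILL", "UNIVERSITY",
    "YEARS_EXPERIENCE",
}


def validate_annotation(text, start, end, label):
    """Return a list of warnings; an empty list means the span looks usable."""
    warnings = []
    if label not in SUGGESTED_LABELS:
        # Custom labels are allowed, so this is a warning rather than an error.
        warnings.append("label not in suggested list: " + label)
    if not (start >= 0 and end > start and len(text) >= end):
        warnings.append("offsets out of range")
    elif text[start:end] != text[start:end].strip():
        warnings.append("span has leading or trailing whitespace")
    return warnings
```

<p>
Spans that include stray whitespace or punctuation are a frequent cause of
entities being dropped later, so it is worth selecting text precisely.
</p>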
<a href="{{ url_for('ner_preview') }}" class="btn btn-primary"
>Proceed to NER Annotation</a
>
</div>
<div class="step">
<h2>Step 4: Save and Format JSON Data</h2>
<p>
Upload your annotated JSON file from the previous step. The system
will process and reformat the JSON file to ensure compatibility with
the SpaCy model training process. After formatting, you can proceed to
the model training step.
</p>
<p>Instructions:</p>
<ul>
<li>
Upload the JSON file you downloaded after the annotation step.
</li>
<li>Click "Process" to reformat the file.</li>
<li>
Once processing is complete, click "Next" to proceed with training.
</li>
</ul>
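<p>
This guide does not show the structure of the exported file, so the following
is only a sketch of what such a reformatting step might look like. It assumes
a hypothetical export in which each record has a <code>text</code> field and
an <code>annotations</code> list with <code>start</code>, <code>end</code>,
and <code>label</code> keys, and it produces the
<code>(text, {"entities": [...]})</code> pairs that are a common convention
for spaCy NER training data.
</p>

```python
import json


def reformat_record(record):
    """Convert one hypothetical export record into a (text, annotations) pair."""
    entities = sorted(
        (int(a["start"]), int(a["end"]), a["label"])
        for a in record["annotations"]
    )
    return (record["text"], {"entities": entities})


def reformat_export(raw_json):
    """Parse the exported JSON string and reformat every record in it."""
    return [reformat_record(record) for record in json.loads(raw_json)]
```

<p>
Sorting the entities by start offset keeps the output deterministic, which
makes two formatted files easier to compare.
</p>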
<a href="{{ url_for('json_file') }}" class="btn btn-primary"
>Proceed to Save JSON</a
>
</div>
<div class="step">
<h2>Step 5: Train the NER Model</h2>
<p>
In this final step, you will convert the formatted JSON data into the
SpaCy format and begin training the NER model. You can customize the
training by selecting the number of epochs (iterations) the model will
go through and setting the version for the trained model.
</p>
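<p>
For readers curious about the conversion, the sketch below shows one way it
could be done, assuming "SpaCy format" means the binary <code>.spacy</code>
files (<code>DocBin</code>) used by spaCy v3 and that the text is English.
The helper names <code>drop_overlaps</code> and <code>build_docbin</code> are
hypothetical; the application's own conversion code may work differently.
</p>

```python
def drop_overlaps(entities):
    """Keep the longest spans first; discard any span overlapping a kept one."""
    kept = []
    # Sort by descending length, then by start offset.
    for start, end, label in sorted(entities, key=lambda e: (e[0] - e[1], e[0])):
        if all(start >= k_end or k_start >= end for k_start, k_end, _ in kept):
            kept.append((start, end, label))
    return sorted(kept)


def build_docbin(examples, output_path):
    """Write (text, {"entities": [...]}) pairs to a .spacy file (spaCy v3)."""
    # Imported here so drop_overlaps stays usable without spaCy installed.
    import spacy
    from spacy.tokens import DocBin

    nlp = spacy.blank("en")  # assumption: English text
    doc_bin = DocBin()
    skipped = 0
    for text, annotations in examples:
        doc = nlp.make_doc(text)
        spans = []
        for start, end, label in drop_overlaps(annotations["entities"]):
            # char_span returns None when no tokens fit inside the offsets.
            span = doc.char_span(start, end, label=label, alignment_mode="contract")
            if span is None:
                skipped += 1
            else:
                spans.append(span)
        doc.ents = spans
        doc_bin.add(doc)
    doc_bin.to_disk(output_path)
    return skipped
```

<p>
Overlapping spans are removed first because spaCy does not accept overlapping
entities on a document. The returned count of skipped spans is a quick signal
that some annotations did not line up with token boundaries.
</p>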
<p>Guidelines:</p>
<ul>
<li>
Number of epochs: Each epoch is one full pass over the training data.
More epochs give the model more opportunities to learn, but too many
can lead to overfitting. Start with 10 epochs as a reasonable default.
</li>
<li>
Model versioning: Provide a version name for this training session,
so you can keep track of different versions of the model.
</li>
</ul>
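<p>
To make the overfitting trade-off concrete, here is a minimal sketch of
patience-based early stopping: it scans per-epoch scores measured on held-out
data and stops once the score has not improved for a set number of epochs.
The function name <code>best_epoch</code> and the <code>patience</code>
parameter are illustrative assumptions, and this guide does not state whether
the application evaluates on held-out data.
</p>

```python
def best_epoch(dev_scores, patience=3):
    """Return (epoch, score) for the best score seen before stopping.

    dev_scores must be a non-empty list with one score per epoch.
    Scanning stops after `patience` epochs without improvement.
    """
    best_index, best_score = 0, dev_scores[0]
    for index, score in enumerate(dev_scores):
        if score > best_score:
            best_index, best_score = index, score
        elif index - best_index >= patience:
            break  # no improvement for `patience` epochs in a row
    return best_index + 1, best_score  # epochs are numbered from 1
```

<p>
If held-out scores peak early and then decline, that is a sign that fewer
epochs would serve you better. If training runs through spaCy v3's config
system, the epoch limit corresponds to the <code>max_epochs</code> setting in
the <code>[training]</code> section.
</p>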
<p>
Once the training is complete, you can download the latest version of
the trained model for use in production.
</p>
<a href="{{ url_for('spacy_file') }}" class="btn btn-primary"
>Proceed to Model Training</a
>
</div>
</div>
<script src="https://code.jquery.com/jquery-3.5.1.slim.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/@popperjs/[email protected]/dist/umd/popper.min.js"></script>
<script src="https://stackpath.bootstrapcdn.com/bootstrap/4.5.2/js/bootstrap.min.js"></script>
</body>
</html>