King-Harry committed
Update README.md

README.md
CHANGED
To use this model, you can load it from the Hugging Face Hub and integrate it into your Python or API-based applications. Below is an example of how to load and use the model, with comments explaining each step:

```python
# Install the necessary packages from the unsloth GitHub repository, along with the
# other libraries required for model handling and optimization.
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --no-deps "xformers<0.0.27" "trl<0.9.0" peft accelerate bitsandbytes

# Import the necessary classes from the transformers and unsloth libraries.
from transformers import AutoModelForCausalLM, AutoTokenizer
from unsloth import FastLanguageModel

# Specify the name of the fine-tuned model hosted on the Hugging Face Hub and load it
# along with its tokenizer. The model is loaded in 4-bit precision to reduce memory
# usage and speed up inference.
model_name = "King-Harry/Ninja-Masker-2-PII-Redaction"
model, tokenizer = FastLanguageModel.from_pretrained(model_name, load_in_4bit=True)

# Prepare the model for inference mode so it is optimized for generating predictions.
FastLanguageModel.for_inference(model)

# Define a prompt template in the Alpaca instruction-following format.
# This template is used to format the input text for the model.
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

# Format the input text using the Alpaca-style prompt with a specific instruction and input.
# The tokenizer encodes the formatted text into tensors the model can process.
inputs = tokenizer(
    [
        alpaca_prompt.format(
            # ... instruction and input strings go here ...
            ""  # output - leave this blank for generation!
        )
    ],
    return_tensors="pt"  # Return the encoded inputs as PyTorch tensors.
).to("cuda")  # Move the tensors to the GPU (CUDA) for faster processing.

# Generate the model's output for the prompt, limited to a maximum of 64 new tokens.
# use_cache=True reuses past key/value states for faster generation.
outputs = model.generate(**inputs, max_new_tokens=64, use_cache=True)

# Decode the generated output, converting the tokenized result back into human-readable text.
# skip_special_tokens=True omits special tokens (such as padding or start tokens).
redacted_text = tokenizer.batch_decode(outputs, skip_special_tokens=True)

# Print the first item in the list of decoded texts, which should be the redacted version of the input text.
print(redacted_text[0])
```
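As a minimal sketch of filling the two blank slots in `alpaca_prompt.format(...)`: the instruction wording and sample input below are placeholders chosen for illustration, not the exact strings used to fine-tune the model.

```python
# Illustrative only: the instruction text and the sample input are placeholders,
# not the exact strings the model was fine-tuned on.
instruction = "Replace all the PII in the following text with the appropriate tags."
input_text = "Hi, I'm Jane Doe and you can reach me at jane.doe@example.com."

inputs = tokenizer(
    [alpaca_prompt.format(instruction, input_text, "")],  # output slot left blank for generation
    return_tensors="pt",
).to("cuda")

outputs = model.generate(**inputs, max_new_tokens=64, use_cache=True)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```

Note that the decoded string contains the full prompt followed by the model's answer, so in practice you may want to keep only the text that appears after the `### Response:` marker.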