geekyrakshit committed on
Commit
a414829
·
1 Parent(s): 65321e4

add: docs for entity recognition guardrails

docs/guardrails/entity_recognition/entity_recognition_guardrails.md ADDED
@@ -0,0 +1,136 @@
+# Entity Recognition Guardrails
+
+A collection of guardrails for detecting and anonymizing various types of entities in text, including PII (Personally Identifiable Information), restricted terms, and custom entities.
+
+## Available Guardrails
+
+### 1. Regex Entity Recognition
+Simple pattern-based entity detection using regular expressions.
+
+```python
+from guardrails_genie.guardrails.entity_recognition import RegexEntityRecognitionGuardrail
+
+# Initialize with default PII patterns
+guardrail = RegexEntityRecognitionGuardrail(should_anonymize=True)
+
+# Or with custom patterns
+custom_patterns = {
+    "employee_id": r"EMP\d{6}",
+    "project_code": r"PRJ-[A-Z]{2}-\d{4}"
+}
+guardrail = RegexEntityRecognitionGuardrail(patterns=custom_patterns, should_anonymize=True)
+```
+
+### 2. Presidio Entity Recognition
+Advanced entity detection using Microsoft's Presidio analyzer.
+
+```python
+from guardrails_genie.guardrails.entity_recognition import PresidioEntityRecognitionGuardrail
+
+# Initialize with default entities
+guardrail = PresidioEntityRecognitionGuardrail(should_anonymize=True)
+
+# Or with specific entities
+selected_entities = ["CREDIT_CARD", "US_SSN", "EMAIL_ADDRESS"]
+guardrail = PresidioEntityRecognitionGuardrail(
+    selected_entities=selected_entities,
+    should_anonymize=True
+)
+```
+
+### 3. Transformers Entity Recognition
+Entity detection using transformer-based models.
+
+```python
+from guardrails_genie.guardrails.entity_recognition import TransformersEntityRecognitionGuardrail
+
+# Initialize with default model
+guardrail = TransformersEntityRecognitionGuardrail(should_anonymize=True)
+
+# Or with specific model and entities
+guardrail = TransformersEntityRecognitionGuardrail(
+    model_name="iiiorg/piiranha-v1-detect-personal-information",
+    selected_entities=["GIVENNAME", "SURNAME", "EMAIL"],
+    should_anonymize=True
+)
+```
+
+### 4. LLM Judge for Restricted Terms
+Advanced detection of restricted terms, competitor mentions, and brand protection using LLMs.
+
+```python
+from guardrails_genie.guardrails.entity_recognition import RestrictedTermsJudge
+
+# Initialize with OpenAI model
+guardrail = RestrictedTermsJudge(should_anonymize=True)
+
+# Check for specific terms
+result = guardrail.guard(
+    text="Let's implement features like Salesforce",
+    custom_terms=["Salesforce", "Oracle", "AWS"]
+)
+```
+
+## Usage
+
+All guardrails follow a consistent interface:
+
+```python
+# Initialize a guardrail
+guardrail = RegexEntityRecognitionGuardrail(should_anonymize=True)
+
+# Check text for entities
+result = guardrail.guard("Hello, my email is [email protected]")
+
+# Access results
+print(f"Contains entities: {result.contains_entities}")
+print(f"Detected entities: {result.detected_entities}")
+print(f"Explanation: {result.explanation}")
+print(f"Anonymized text: {result.anonymized_text}")
+```
+
+## Evaluation Tools
+
+The module includes comprehensive evaluation tools and test cases:
+
+- `pii_examples/`: Test cases for PII detection
+- `banned_terms_examples/`: Test cases for restricted terms
+- Benchmark scripts for evaluating model performance
+
+### Running Evaluations
+
+```python
+# PII Detection Benchmark
+from guardrails_genie.guardrails.entity_recognition.pii_examples.pii_benchmark import main
+main()
+
+# (TODO): Restricted Terms Testing
+from guardrails_genie.guardrails.entity_recognition.banned_terms_examples.banned_term_benchmark import main
+main()
+```
+
+## Features
+
+- Entity detection and anonymization
+- Support for multiple detection methods (regex, Presidio, transformers, LLMs)
+- Customizable entity types and patterns
+- Detailed explanations of detected entities
+- Comprehensive evaluation framework
+- Support for custom terms and patterns
+- Batch processing capabilities
+- Performance metrics and benchmarking
+
+## Response Format
+
+All guardrails return responses with the following structure:
+
+```python
+{
+    "contains_entities": bool,
+    "detected_entities": {
+        "entity_type": ["detected_value_1", "detected_value_2"]
+    },
+    "explanation": str,
+    "anonymized_text": Optional[str]
+}
+```
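Review note: the response structure documented above can be illustrated without the library. The following is a minimal sketch using only the standard `re` module and the custom patterns from the Regex example. The `detect_entities` helper, its explanation wording, and the upper-cased `[ENTITY_TYPE]` labels are assumptions made for illustration; they are not taken from `guardrails_genie`.

```python
import re
from typing import Dict, List, Optional

# Patterns copied from the custom-pattern example in the docs.
CUSTOM_PATTERNS: Dict[str, str] = {
    "employee_id": r"EMP\d{6}",
    "project_code": r"PRJ-[A-Z]{2}-\d{4}",
}


def detect_entities(text: str, patterns: Dict[str, str], should_anonymize: bool = False) -> dict:
    """Hypothetical helper: scan `text` and return a dict shaped like the documented response."""
    detected: Dict[str, List[str]] = {}
    anonymized: Optional[str] = text if should_anonymize else None
    for entity_type, pattern in patterns.items():
        matches = re.findall(pattern, text)
        if matches:
            detected[entity_type] = matches
            if anonymized is not None:
                # Assumed label format: the entity type, upper-cased, in brackets.
                anonymized = re.sub(pattern, f"[{entity_type.upper()}]", anonymized)
    found = bool(detected)
    explanation = (
        "Found: " + ", ".join(f"{name} ({len(values)})" for name, values in detected.items())
        if found
        else "No entities detected."
    )
    return {
        "contains_entities": found,
        "detected_entities": detected,
        "explanation": explanation,
        "anonymized_text": anonymized,
    }


result = detect_entities("EMP123456 is assigned to PRJ-AB-2024", CUSTOM_PATTERNS, should_anonymize=True)
print(result["anonymized_text"])
```

Note that `anonymized_text` stays `None` unless anonymization is requested, which matches the `Optional[str]` in the documented structure.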
docs/guardrails/entity_recognition/llm_judge_entity_recognition_guardrail.md ADDED
@@ -0,0 +1,3 @@
+# LLM Judge for Entity Recognition Guardrail
+
+::: guardrails_genie.guardrails.entity_recognition.llm_judge_entity_recognition_guardrail
docs/guardrails/entity_recognition/presidio_entity_recognition_guardrail.md ADDED
@@ -0,0 +1,3 @@
+# Presidio Entity Recognition Guardrail
+
+::: guardrails_genie.guardrails.entity_recognition.presidio_entity_recognition_guardrail
docs/guardrails/entity_recognition/regex_entity_recognition_guardrail.md ADDED
@@ -0,0 +1,3 @@
+# Regex Entity Recognition Guardrail
+
+::: guardrails_genie.guardrails.entity_recognition.regex_entity_recognition_guardrail
docs/guardrails/entity_recognition/transformers_entity_recognition_guardrail.md ADDED
@@ -0,0 +1,3 @@
+# Transformers Entity Recognition Guardrail
+
+::: guardrails_genie.guardrails.entity_recognition.transformers_entity_recognition_guardrail
guardrails_genie/guardrails/entity_recognition/llm_judge_entity_recognition_guardrail.py CHANGED
@@ -54,6 +54,36 @@ class RestrictedTermsRecognitionResponse(BaseModel):
 
 
 class RestrictedTermsJudge(Guardrail):
+    """
+    A class to detect and analyze restricted terms and their variations in text using an LLM model.
+
+    The RestrictedTermsJudge class extends the Guardrail class and utilizes an OpenAIModel
+    to identify restricted terms and their variations within a given text. It provides
+    functionality to format prompts for the LLM, predict restricted terms, and optionally
+    anonymize detected terms in the text.
+
+    !!! example "Using RestrictedTermsJudge"
+        ```python
+        from guardrails_genie.guardrails.entity_recognition import RestrictedTermsJudge
+
+        # Initialize with OpenAI model
+        guardrail = RestrictedTermsJudge(should_anonymize=True)
+
+        # Check for specific terms
+        result = guardrail.guard(
+            text="Let's implement features like Salesforce",
+            custom_terms=["Salesforce", "Oracle", "AWS"]
+        )
+        ```
+
+    Attributes:
+        llm_model (OpenAIModel): An instance of OpenAIModel used for predictions.
+        should_anonymize (bool): A flag indicating whether detected terms should be anonymized.
+
+    Args:
+        should_anonymize (bool): A flag indicating whether detected terms should be anonymized.
+    """
+
     llm_model: OpenAIModel = Field(default_factory=lambda: OpenAIModel())
     should_anonymize: bool = False
 
@@ -139,14 +169,32 @@ Return your analysis in the structured format specified by the RestrictedTermsAn
         **kwargs,
     ) -> RestrictedTermsRecognitionResponse:
         """
-        Guard against restricted terms and their variations.
+        Analyzes the provided text to identify and handle restricted terms and their variations.
+
+        This function utilizes a predictive model to scan the input text for any occurrences of
+        specified restricted terms, including their variations such as misspellings, abbreviations,
+        and case differences. It returns a detailed analysis of the findings, including whether
+        restricted terms were detected, a summary of the matches, and an optional anonymized version
+        of the text.
+
+        The function operates by first calling the `predict` method to perform the analysis based on
+        the given text and custom terms. If restricted terms are found, it constructs a summary of
+        these findings. Additionally, if anonymization is enabled, it replaces detected terms in the
+        text with a redacted placeholder or a specific match type indicator, depending on the
+        `aggregate_redaction` flag.
 
         Args:
-            text: Text to analyze
-            custom_terms: List of restricted terms to check for
+            text (str): The text to be analyzed for restricted terms.
+            custom_terms (List[str]): A list of restricted terms to check against the text. Defaults
+                to a predefined list of company names.
+            aggregate_redaction (bool): Determines the anonymization strategy. If True, all matches
+                are replaced with "[redacted]". If False, matches are replaced
+                with their match type in uppercase.
 
         Returns:
-            RestrictedTermsRecognitionResponse containing safety assessment and detailed analysis
+            RestrictedTermsRecognitionResponse: An object containing the results of the analysis,
+                including whether restricted terms were found, a dictionary of detected entities,
+                a summary explanation, and the anonymized text if applicable.
         """
         analysis = self.predict(text, custom_terms, **kwargs)
 
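Review note: the two anonymization strategies described in the `guard` docstring can be sketched as follows. This is only an illustration of the `aggregate_redaction` behavior as documented. The `TermMatch` class, the `redact_terms` helper, the `"exact"` match-type label, and the square brackets around the upper-cased match type are all assumptions, not the module's actual implementation.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class TermMatch:
    """Hypothetical stand-in for one match reported by the judge."""
    matched_text: str  # the span as it appears in the input text
    match_type: str    # illustrative label such as "exact" or "misspelling"


def redact_terms(text: str, matches: List[TermMatch], aggregate_redaction: bool = True) -> str:
    """Replace each matched span, ignoring case."""
    for match in matches:
        label = "[redacted]" if aggregate_redaction else f"[{match.match_type.upper()}]"
        # A function replacement keeps `label` literal (no backslash escapes are interpreted).
        text = re.sub(re.escape(match.matched_text), lambda _: label, text, flags=re.IGNORECASE)
    return text


text = "Let's implement features like Salesforce"
matches = [TermMatch(matched_text="Salesforce", match_type="exact")]
print(redact_terms(text, matches))
print(redact_terms(text, matches, aggregate_redaction=False))
```

With `aggregate_redaction=True` the reader of the anonymized text learns nothing about why a term was removed; with `False` the match type is preserved, which can help when auditing the judge's decisions.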
guardrails_genie/guardrails/entity_recognition/presidio_entity_recognition_guardrail.py CHANGED
@@ -36,6 +36,49 @@ class PresidioEntityRecognitionSimpleResponse(BaseModel):
 
 # TODO: Add support for transformers workflow and not just Spacy
 class PresidioEntityRecognitionGuardrail(Guardrail):
+    """
+    A guardrail class for entity recognition and anonymization using Presidio.
+
+    This class extends the Guardrail base class to provide functionality for
+    detecting and optionally anonymizing entities in text using the Presidio
+    library. It leverages Presidio's AnalyzerEngine and AnonymizerEngine to
+    perform these tasks.
+
+    !!! example "Using PresidioEntityRecognitionGuardrail"
+        ```python
+        from guardrails_genie.guardrails.entity_recognition import PresidioEntityRecognitionGuardrail
+
+        # Initialize with default entities
+        guardrail = PresidioEntityRecognitionGuardrail(should_anonymize=True)
+
+        # Or with specific entities
+        selected_entities = ["CREDIT_CARD", "US_SSN", "EMAIL_ADDRESS"]
+        guardrail = PresidioEntityRecognitionGuardrail(
+            selected_entities=selected_entities,
+            should_anonymize=True
+        )
+        ```
+
+    Attributes:
+        analyzer (AnalyzerEngine): The Presidio engine used for entity analysis.
+        anonymizer (AnonymizerEngine): The Presidio engine used for text anonymization.
+        selected_entities (List[str]): A list of entity types to detect in the text.
+        should_anonymize (bool): A flag indicating whether detected entities should be anonymized.
+        language (str): The language of the text to be analyzed.
+
+    Args:
+        selected_entities (Optional[List[str]]): A list of entity types to detect in the text.
+        should_anonymize (bool): A flag indicating whether detected entities should be anonymized.
+        language (str): The language of the text to be analyzed.
+        deny_lists (Optional[Dict[str, List[str]]]): A dictionary of entity types and their
+            corresponding deny lists.
+        regex_patterns (Optional[Dict[str, List[Dict[str, str]]]]): A dictionary of entity
+            types and their corresponding regex patterns.
+        custom_recognizers (Optional[List[Any]]): A list of custom recognizers to add to the
+            analyzer.
+        show_available_entities (bool): A flag indicating whether to print available entities.
+    """
+
     @staticmethod
     def get_available_entities() -> List[str]:
         registry = RecognizerRegistry()
@@ -137,11 +180,24 @@ class PresidioEntityRecognitionGuardrail(Guardrail):
         self, prompt: str, return_detected_types: bool = True, **kwargs
     ) -> PresidioEntityRecognitionResponse | PresidioEntityRecognitionSimpleResponse:
         """
-        Check if the input prompt contains any entities using Presidio.
+        Analyzes the input prompt for entity recognition using the Presidio framework.
+
+        This function utilizes the Presidio AnalyzerEngine to detect entities within the
+        provided text prompt. It supports custom recognizers, deny lists, and regex patterns
+        for entity detection. The detected entities are grouped by their types and an
+        explanation of the findings is generated. If anonymization is enabled, the detected
+        entities in the text are anonymized.
 
         Args:
-            prompt: The text to analyze
-            return_detected_types: If True, returns detailed entity type information
+            prompt (str): The text to be analyzed for entity recognition.
+            return_detected_types (bool): Determines the type of response. If True, the
+                response includes detailed information about detected entity types.
+
+        Returns:
+            PresidioEntityRecognitionResponse | PresidioEntityRecognitionSimpleResponse:
+                A response object containing information about whether entities were detected,
+                the types and instances of detected entities, an explanation of the analysis,
+                and optionally, the anonymized text if anonymization is enabled.
         """
         # Analyze text for entities
         analyzer_results = self.analyzer.analyze(
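Review note: the docstring says detected entities are "grouped by their types" and that an explanation is generated. A minimal sketch of that grouping step is shown below, without depending on Presidio itself. The `Span` class is a hypothetical stand-in that carries the `entity_type`, `start`, and `end` fields that Presidio analyzer results expose; `make_span`, `group_by_type`, `explain`, and the explanation wording are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Span:
    """Hypothetical stand-in for an analyzer result: an entity type plus character offsets."""
    entity_type: str
    start: int
    end: int


def make_span(text: str, value: str, entity_type: str) -> Span:
    """Build a span for the first occurrence of `value` (keeps the example offsets honest)."""
    start = text.index(value)
    return Span(entity_type, start, start + len(value))


def group_by_type(text: str, spans: List[Span]) -> Dict[str, List[str]]:
    """Collect the matched substrings under their entity type, in order of appearance."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for span in sorted(spans, key=lambda s: s.start):
        grouped[span.entity_type].append(text[span.start:span.end])
    return dict(grouped)


def explain(grouped: Dict[str, List[str]]) -> str:
    """Summarize the grouped findings in one line."""
    if not grouped:
        return "No entities detected."
    parts = [f"{entity_type} ({len(values)})" for entity_type, values in grouped.items()]
    return "Detected entities: " + ", ".join(parts)


text = "Mail jane@example.com or call 212-555-0199"
spans = [
    make_span(text, "212-555-0199", "PHONE_NUMBER"),
    make_span(text, "jane@example.com", "EMAIL_ADDRESS"),
]
grouped = group_by_type(text, spans)
print(grouped)
print(explain(grouped))
```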
guardrails_genie/guardrails/entity_recognition/regex_entity_recognition_guardrail.py CHANGED
@@ -30,6 +30,40 @@ class RegexEntityRecognitionSimpleResponse(BaseModel):
 
 
 class RegexEntityRecognitionGuardrail(Guardrail):
+    """
+    A guardrail class for recognizing and optionally anonymizing entities in text using regular expressions.
+
+    This class extends the Guardrail base class and utilizes a RegexModel to detect entities in the input text
+    based on predefined or custom regex patterns. It provides functionality to check for entities, anonymize
+    detected entities, and return detailed information about the detected entities.
+
+    !!! example "Using RegexEntityRecognitionGuardrail"
+        ```python
+        from guardrails_genie.guardrails.entity_recognition import RegexEntityRecognitionGuardrail
+
+        # Initialize with default PII patterns
+        guardrail = RegexEntityRecognitionGuardrail(should_anonymize=True)
+
+        # Or with custom patterns
+        custom_patterns = {
+            "employee_id": r"EMP\d{6}",
+            "project_code": r"PRJ-[A-Z]{2}-\d{4}"
+        }
+        guardrail = RegexEntityRecognitionGuardrail(patterns=custom_patterns, should_anonymize=True)
+        ```
+
+    Attributes:
+        regex_model (RegexModel): An instance of RegexModel used for entity recognition.
+        patterns (Dict[str, str]): A dictionary of regex patterns for entity recognition.
+        should_anonymize (bool): A flag indicating whether detected entities should be anonymized.
+        DEFAULT_PATTERNS (ClassVar[Dict[str, str]]): A dictionary of default regex patterns for common entities.
+
+    Args:
+        use_defaults (bool): If True, use default patterns. If False, use custom patterns.
+        should_anonymize (bool): If True, anonymize detected entities.
+        show_available_entities (bool): If True, print available entity types.
+    """
+
     regex_model: RegexModel
     patterns: Dict[str, str] = {}
     should_anonymize: bool = False
@@ -107,16 +141,29 @@ class RegexEntityRecognitionGuardrail(Guardrail):
         **kwargs,
     ) -> RegexEntityRecognitionResponse | RegexEntityRecognitionSimpleResponse:
         """
-        Check if the input prompt contains any entities based on the regex patterns.
+        Analyzes the input prompt to detect entities based on predefined or custom regex patterns.
+
+        This function checks the provided text (prompt) for entities using regex patterns. It can
+        utilize either default patterns or custom terms provided by the user. If custom terms are
+        specified, they are converted into regex patterns, and only these are used for entity detection.
+        The function returns detailed information about detected entities and can optionally anonymize
+        the detected entities in the text.
 
         Args:
-            prompt: Input text to check for entities
-            custom_terms: List of custom terms to be converted into regex patterns. If provided,
-                only these terms will be checked, ignoring default patterns.
-            return_detected_types: If True, returns detailed entity type information
+            prompt (str): The input text to be analyzed for entity detection.
+            custom_terms (Optional[list[str]]): A list of custom terms to be converted into regex patterns.
+                If provided, only these terms will be checked, ignoring default patterns.
+            return_detected_types (bool): If True, the function returns detailed information about the
+                types of entities detected in the text.
+            aggregate_redaction (bool): Determines the anonymization strategy. If True, all detected
+                entities are replaced with a generic "[redacted]" label. If False, each entity type is
+                replaced with its specific label (e.g., "[ENTITY_TYPE]").
 
         Returns:
-            RegexEntityRecognitionResponse or RegexEntityRecognitionSimpleResponse containing detection results
+            RegexEntityRecognitionResponse or RegexEntityRecognitionSimpleResponse: An object containing
+                the results of the entity detection, including whether entities were found, the types and
+                counts of detected entities, an explanation of the detection process, and optionally, the
+                anonymized text.
         """
         if custom_terms:
             # Create a temporary RegexModel with only the custom patterns
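Review note: the docstring states that custom terms "are converted into regex patterns" but the diff does not show how. One plausible conversion is sketched below, under stated assumptions: each term is escaped with `re.escape`, guarded by lookarounds so it only matches as a whole token, and matched case-insensitively. The `terms_to_patterns` and `find_terms` helpers are hypothetical, and the real `RegexModel` may behave differently.

```python
import re
from typing import Dict, List


def terms_to_patterns(custom_terms: List[str]) -> Dict[str, str]:
    """Hypothetical conversion: one escaped, whole-token pattern per term.

    Lookarounds are used instead of \\b so that terms ending in
    non-word characters (for example "C++") still match.
    """
    return {term: r"(?<!\w)" + re.escape(term) + r"(?!\w)" for term in custom_terms}


def find_terms(text: str, custom_terms: List[str]) -> Dict[str, List[str]]:
    """Return every occurrence of each custom term, ignoring case (an assumption)."""
    hits: Dict[str, List[str]] = {}
    for term, pattern in terms_to_patterns(custom_terms).items():
        found = re.findall(pattern, text, flags=re.IGNORECASE)
        if found:
            hits[term] = found
    return hits


hits = find_terms(
    "We moved from oracle to Salesforce, not Salesforces.",
    ["Salesforce", "Oracle", "AWS"],
)
print(hits)
```

Escaping matters here because a raw term such as `C++` is not a valid regular expression on its own.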
guardrails_genie/guardrails/entity_recognition/transformers_entity_recognition_guardrail.py CHANGED
@@ -29,7 +29,40 @@ class TransformersEntityRecognitionSimpleResponse(BaseModel):
 
 
 class TransformersEntityRecognitionGuardrail(Guardrail):
-    """Generic guardrail for detecting entities using any token classification model."""
+    """Generic guardrail for detecting entities using any token classification model.
+
+    This class leverages a transformer-based token classification model to detect and
+    optionally anonymize entities in a given text. It uses the HuggingFace `transformers`
+    library to load a pre-trained model and perform entity recognition.
+
+    !!! example "Using TransformersEntityRecognitionGuardrail"
+        ```python
+        from guardrails_genie.guardrails.entity_recognition import TransformersEntityRecognitionGuardrail
+
+        # Initialize with default model
+        guardrail = TransformersEntityRecognitionGuardrail(should_anonymize=True)
+
+        # Or with specific model and entities
+        guardrail = TransformersEntityRecognitionGuardrail(
+            model_name="iiiorg/piiranha-v1-detect-personal-information",
+            selected_entities=["GIVENNAME", "SURNAME", "EMAIL"],
+            should_anonymize=True
+        )
+        ```
+
+    Attributes:
+        _pipeline (Optional[object]): The transformer pipeline for token classification.
+        selected_entities (List[str]): List of entities to detect.
+        should_anonymize (bool): Flag indicating whether detected entities should be anonymized.
+        available_entities (List[str]): List of all available entities that the model can detect.
+
+    Args:
+        model_name (str): The name of the pre-trained model to use for entity recognition.
+        selected_entities (Optional[List[str]]): A list of specific entities to detect.
+            If None, all available entities will be used.
+        should_anonymize (bool): If True, detected entities will be anonymized.
+        show_available_entities (bool): If True, available entity types will be printed.
+    """
 
     _pipeline: Optional[object] = None
     selected_entities: List[str]
@@ -161,12 +194,26 @@ class TransformersEntityRecognitionGuardrail(Guardrail):
         TransformersEntityRecognitionResponse
         | TransformersEntityRecognitionSimpleResponse
     ):
-        """Check if the input prompt contains any entities using the transformer pipeline.
+        """Analyze the input prompt for entity recognition and optionally anonymize detected entities.
+
+        This function utilizes a transformer-based pipeline to detect entities within the provided
+        text prompt. It returns a response indicating whether any entities were found, along with
+        detailed information about the detected entities if requested. The function can also anonymize
+        the detected entities in the text based on the specified parameters.
 
         Args:
-            prompt: The text to analyze
-            return_detected_types: If True, returns detailed entity type information
-            aggregate_redaction: If True, uses generic [redacted] instead of entity type
+            prompt (str): The text to be analyzed for entity detection.
+            return_detected_types (bool): If True, the response includes detailed information about
+                the types of entities detected. Defaults to True.
+            aggregate_redaction (bool): If True, detected entities are anonymized using a generic
+                [redacted] marker. If False, the specific entity type is used in the redaction.
+                Defaults to True.
+
+        Returns:
+            TransformersEntityRecognitionResponse or TransformersEntityRecognitionSimpleResponse:
+                A response object containing information about the presence of entities, an explanation
+                of the detection process, and optionally, the anonymized text if entities were detected
+                and anonymization is enabled.
         """
         # Detect entities
         detected_entities = self._detect_entities(prompt)
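Review note: token classification models report entities as character offsets, so anonymization has to splice labels into the text without invalidating the remaining offsets. The sketch below shows one common way to do that (replacing from the end of the text backwards) together with the `aggregate_redaction` switch from the docstring. The dictionary layout with `type`, `start`, and `end` keys and the `locate` and `redact_spans` helpers are assumptions for illustration, not the guardrail's internal representation.

```python
from typing import Dict, List


def locate(text: str, value: str, entity_type: str) -> Dict:
    """Hypothetical helper: describe the first occurrence of `value` as an entity span."""
    start = text.index(value)
    return {"type": entity_type, "start": start, "end": start + len(value)}


def redact_spans(text: str, entities: List[Dict], aggregate_redaction: bool = True) -> str:
    """Replace each span with a label, working right to left."""
    # Processing the last span first keeps the offsets of earlier spans valid.
    for entity in sorted(entities, key=lambda e: e["start"], reverse=True):
        label = "[redacted]" if aggregate_redaction else f"[{entity['type'].upper()}]"
        text = text[: entity["start"]] + label + text[entity["end"]:]
    return text


text = "Contact Jane Doe at jane@example.com"
entities = [
    locate(text, "Jane", "GIVENNAME"),
    locate(text, "Doe", "SURNAME"),
    locate(text, "jane@example.com", "EMAIL"),
]
print(redact_spans(text, entities))
print(redact_spans(text, entities, aggregate_redaction=False))
```

The sketch assumes the spans do not overlap; overlapping predictions would need to be merged before redaction.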
mkdocs.yml CHANGED
@@ -62,6 +62,12 @@ nav:
   - Guardrails:
     - Guardrail Base Class: 'guardrails/base.md'
     - Guardrail Manager: 'guardrails/manager.md'
+    - Entity Recognition Guardrails:
+      - About: 'guardrails/entity_recognition/entity_recognition_guardrails.md'
+      - Regex Entity Recognition Guardrail: 'guardrails/entity_recognition/regex_entity_recognition_guardrail.md'
+      - Presidio Entity Recognition Guardrail: 'guardrails/entity_recognition/presidio_entity_recognition_guardrail.md'
+      - Transformers Entity Recognition Guardrail: 'guardrails/entity_recognition/transformers_entity_recognition_guardrail.md'
+      - LLM Judge for Entity Recognition Guardrail: 'guardrails/entity_recognition/llm_judge_entity_recognition_guardrail.md'
     - Prompt Injection Guardrails:
       - Classifier Guardrail: 'guardrails/prompt_injection/classifier.md'
       - Survey Guardrail: 'guardrails/prompt_injection/llm_survey.md'
pyproject.toml CHANGED
@@ -24,6 +24,7 @@ dependencies = [
     "torch>=2.5.1",
     "presidio-analyzer>=2.2.355",
     "presidio-anonymizer>=2.2.355",
+    "instructor>=1.7.0",
 ]
 
 [project.optional-dependencies]