Thomas De Decker committed
Commit f6df2a0 (parent: cf55b94)

Fix truncation bug + Update description
README.md CHANGED
@@ -8,15 +8,6 @@ sdk_version: 1.2.0
 app_file: app.py
 pinned: false
 license: mit
-models:
-- DeDeckerThomas/keyphrase-extraction-kbir-inspec
-- DeDeckerThomas/keyphrase-extraction-distilbert-openkp
-- DeDeckerThomas/keyphrase-extraction-distilbert-kptimes
-- DeDeckerThomas/keyphrase-extraction-distilbert-inspec
-- DeDeckerThomas/keyphrase-extraction-kbir-kpcrowd
-- DeDeckerThomas/keyphrase-generation-keybart-inspec
-- DeDeckerThomas/keyphrase-generation-t5-small-inspec
-- DeDeckerThomas/keyphrase-generation-t5-small-openkp
 ---
 
 Check out the configuration reference at https://huggingface.co/docs/hub/spaces#reference
app.py CHANGED
@@ -35,7 +35,6 @@ def get_annotated_text(text, keyphrases, color="#d294ff"):
             rf"$K:{keyphrases.index(keyphrase)}\2",
             text,
             flags=re.I,
-            count=1,
         )
 
     result = []
@@ -131,8 +130,7 @@ from a text. Since this is a time-consuming process, Artificial Intelligence is
 Currently, classical machine learning methods, that use statistics and linguistics, are widely used
 for the extraction process. The fact that these methods have been widely used in the community has
 the advantage that there are many easy-to-use libraries. Now with the recent innovations in
-deep learning methods (such as recurrent neural networks and transformers, GANS, …),
-keyphrase extraction can be improved. These new methods also focus on the semantics and
+NLP, transformers can be used to improve keyphrase extraction. Transformers also focus on the semantics and
 context of a document, which is quite an improvement.
 
 This space gives you the ability to test around with some keyphrase extraction and generation models.
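Dropping count=1 from the re.sub call means every occurrence of a keyphrase gets replaced with its marker, not just the first one. A minimal sketch of that effect, using a simplified hypothetical helper (mark_keyphrases and its pattern are illustrative, not the app's actual code, which also carries a \2 backreference):

```python
import re

def mark_keyphrases(text, keyphrases):
    # Without count=1, re.sub rewrites ALL case-insensitive occurrences
    # of each keyphrase, not only the first match.
    for keyphrase in keyphrases:
        text = re.sub(
            rf"\b{re.escape(keyphrase)}\b",
            f"$K:{keyphrases.index(keyphrase)}",
            text,
            flags=re.I,
        )
    return text

print(mark_keyphrases("Transformers are great. transformers improve NLP.",
                      ["transformers"]))
# → $K:0 are great. $K:0 improve NLP.
```

With count=1 the second occurrence would have been left unannotated, which is the behavior this commit removes.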
pipelines/__pycache__/keyphrase_extraction_pipeline.cpython-39.pyc CHANGED
Binary files a/pipelines/__pycache__/keyphrase_extraction_pipeline.cpython-39.pyc and b/pipelines/__pycache__/keyphrase_extraction_pipeline.cpython-39.pyc differ
 
pipelines/__pycache__/keyphrase_generation_pipeline.cpython-39.pyc CHANGED
Binary files a/pipelines/__pycache__/keyphrase_generation_pipeline.cpython-39.pyc and b/pipelines/__pycache__/keyphrase_generation_pipeline.cpython-39.pyc differ
 
pipelines/keyphrase_extraction_pipeline.py CHANGED
@@ -11,7 +11,9 @@ class KeyphraseExtractionPipeline(TokenClassificationPipeline):
     def __init__(self, model, *args, **kwargs):
         super().__init__(
             model=AutoModelForTokenClassification.from_pretrained(model),
-            tokenizer=AutoTokenizer.from_pretrained(model),
+            tokenizer=AutoTokenizer.from_pretrained(
+                model, truncate=True
+            ),
             *args,
             **kwargs
         )
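The change above loads the tokenizer inside the pipeline subclass and forwards a loading option to from_pretrained. A rough sketch of that construction pattern, using stand-in classes rather than the real transformers ones (FakeTokenizer and BasePipeline are hypothetical):

```python
class FakeTokenizer:
    """Stand-in for AutoTokenizer; records the kwargs it was loaded with."""

    def __init__(self, name, truncate=False):
        self.name = name
        self.truncate = truncate

    @classmethod
    def from_pretrained(cls, name, **kwargs):
        # Mirrors the AutoTokenizer.from_pretrained(model, truncate=True) call.
        return cls(name, **kwargs)


class BasePipeline:
    """Stand-in for TokenClassificationPipeline."""

    def __init__(self, model, tokenizer):
        self.model = model
        self.tokenizer = tokenizer


class KeyphraseExtractionPipeline(BasePipeline):
    def __init__(self, model, *args, **kwargs):
        super().__init__(
            model=model,  # the real code loads AutoModelForTokenClassification here
            tokenizer=FakeTokenizer.from_pretrained(model, truncate=True),
            *args,
            **kwargs,
        )


pipe = KeyphraseExtractionPipeline("some/model")
print(pipe.tokenizer.truncate)  # → True
```

The subclass owns the loading of its components, so callers only ever pass a model identifier.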
pipelines/keyphrase_generation_pipeline.py CHANGED
@@ -8,7 +8,7 @@ class KeyphraseGenerationPipeline(Text2TextGenerationPipeline):
     def __init__(self, model, keyphrase_sep_token=";", *args, **kwargs):
         super().__init__(
             model=AutoModelForSeq2SeqLM.from_pretrained(model),
-            tokenizer=AutoTokenizer.from_pretrained(model),
+            tokenizer=AutoTokenizer.from_pretrained(model, truncate=True),
             *args,
             **kwargs
         )
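The intent behind truncate=True in both pipelines is to cap long inputs at a maximum length instead of letting them overflow the model. A toy illustration of that behavior (toy_tokenize is hypothetical, not the transformers API; in transformers, encode-time truncation is usually requested with truncation=True when calling the tokenizer):

```python
def toy_tokenize(text, max_length=8, truncate=False):
    # Whitespace "tokenizer": with truncate=True, inputs longer than
    # max_length are cut down to the first max_length tokens.
    tokens = text.split()
    if truncate and len(tokens) > max_length:
        tokens = tokens[:max_length]
    return tokens


long_text = " ".join(f"word{i}" for i in range(20))
print(len(toy_tokenize(long_text)))                 # → 20, no truncation
print(len(toy_tokenize(long_text, truncate=True)))  # → 8, capped at max_length
```

Without truncation, a sequence longer than the model's context would propagate downstream and fail there, which is the "truncation bug" this commit's message refers to.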