Upload README.md with huggingface_hub
README.md
CHANGED
@@ -22,21 +22,25 @@ tags:
|
|
22 |
This is a Natural Language Understanding (NLU) model for the French [MEDIA benchmark](https://catalogue.elra.info/en-us/repository/browse/ELRA-S0272/).
|
23 |
It maps each input word to an output concept tag (76 tags available).
|
24 |
|
25 |
-
This model is trained
|
26 |
|
27 |
## Available MEDIA NLU models:
|
28 |
-
- [`MEDIA_NLU-flaubert_base_cased`](https://huggingface.co/vpelloin/MEDIA_NLU-flaubert_base_cased): MEDIA NLU model trained
|
29 |
-
- [`MEDIA_NLU-flaubert_base_uncased`](https://huggingface.co/vpelloin/MEDIA_NLU-flaubert_base_uncased): MEDIA NLU model trained
|
30 |
-
- [`MEDIA_NLU-flaubert_oral_ft`](https://huggingface.co/vpelloin/MEDIA_NLU-flaubert_oral_ft): MEDIA NLU model trained
|
31 |
-
- [`MEDIA_NLU-flaubert_oral_mixed`](https://huggingface.co/vpelloin/MEDIA_NLU-flaubert_oral_mixed): MEDIA NLU model trained
|
32 |
-
- [`MEDIA_NLU-flaubert_oral_asr`](https://huggingface.co/vpelloin/MEDIA_NLU-flaubert_oral_asr): MEDIA NLU model trained
|
33 |
-
- [`MEDIA_NLU-flaubert_oral_asr_nb`](https://huggingface.co/vpelloin/MEDIA_NLU-flaubert_oral_asr_nb): MEDIA NLU model trained
|
34 |
|
35 |
## Usage with Pipeline
|
36 |
```python
|
37 |
from transformers import pipeline
|
38 |
|
39 |
-
generator = pipeline(
|
40 |
sentences = [
|
41 |
"je voudrais réserver une chambre à paris pour demain et lundi",
|
42 |
"d'accord pour l'hôtel à quatre vingt dix euros la nuit",
|
@@ -53,8 +57,12 @@ from transformers import (
|
|
53 |
AutoTokenizer,
|
54 |
AutoModelForTokenClassification
|
55 |
)
|
56 |
-
tokenizer = AutoTokenizer.from_pretrained(
|
57 |
-
|
58 |
|
59 |
sentences = [
|
60 |
"je voudrais réserver une chambre à paris pour demain et lundi",
|
@@ -64,7 +72,10 @@ sentences = [
|
|
64 |
]
|
65 |
inputs = tokenizer(sentences, padding=True, return_tensors='pt')
|
66 |
outptus = model(**inputs).logits
|
67 |
-
print([
|
68 |
```
|
69 |
|
70 |
## Reference
|
|
|
22 |
This is a Natural Language Understanding (NLU) model for the French [MEDIA benchmark](https://catalogue.elra.info/en-us/repository/browse/ELRA-S0272/).
|
23 |
It maps each input word to an output concept tag (76 tags available).
|
24 |
|
25 |
+
This model is trained using [`nherve/flaubert-oral-asr`](https://huggingface.co/nherve/flaubert-oral-asr) as its initial checkpoint.
|
26 |
|
27 |
## Available MEDIA NLU models:
|
28 |
+
- [`vpelloin/MEDIA_NLU-flaubert_base_cased`](https://huggingface.co/vpelloin/MEDIA_NLU-flaubert_base_cased): MEDIA NLU model trained using [`flaubert/flaubert_base_cased`](https://huggingface.co/flaubert/flaubert_base_cased)
|
29 |
+
- [`vpelloin/MEDIA_NLU-flaubert_base_uncased`](https://huggingface.co/vpelloin/MEDIA_NLU-flaubert_base_uncased): MEDIA NLU model trained using [`flaubert/flaubert_base_uncased`](https://huggingface.co/flaubert/flaubert_base_uncased)
|
30 |
+
- [`vpelloin/MEDIA_NLU-flaubert_oral_ft`](https://huggingface.co/vpelloin/MEDIA_NLU-flaubert_oral_ft): MEDIA NLU model trained using [`nherve/flaubert-oral-ft`](https://huggingface.co/nherve/flaubert-oral-ft)
|
31 |
+
- [`vpelloin/MEDIA_NLU-flaubert_oral_mixed`](https://huggingface.co/vpelloin/MEDIA_NLU-flaubert_oral_mixed): MEDIA NLU model trained using [`nherve/flaubert-oral-mixed`](https://huggingface.co/nherve/flaubert-oral-mixed)
|
32 |
+
- [`vpelloin/MEDIA_NLU-flaubert_oral_asr`](https://huggingface.co/vpelloin/MEDIA_NLU-flaubert_oral_asr): MEDIA NLU model trained using [`nherve/flaubert-oral-asr`](https://huggingface.co/nherve/flaubert-oral-asr)
|
33 |
+
- [`vpelloin/MEDIA_NLU-flaubert_oral_asr_nb`](https://huggingface.co/vpelloin/MEDIA_NLU-flaubert_oral_asr_nb): MEDIA NLU model trained using [`nherve/flaubert-oral-asr_nb`](https://huggingface.co/nherve/flaubert-oral-asr_nb)
|
34 |
|
35 |
## Usage with Pipeline
|
36 |
```python
|
37 |
from transformers import pipeline
|
38 |
|
39 |
+
generator = pipeline(
|
40 |
+
model="vpelloin/MEDIA_NLU-flaubert_oral_asr",
|
41 |
+
task="token-classification"
|
42 |
+
)
|
43 |
+
|
44 |
sentences = [
|
45 |
"je voudrais réserver une chambre à paris pour demain et lundi",
|
46 |
"d'accord pour l'hôtel à quatre vingt dix euros la nuit",
|
|
|
57 |
AutoTokenizer,
|
58 |
AutoModelForTokenClassification
|
59 |
)
|
60 |
+
tokenizer = AutoTokenizer.from_pretrained(
|
61 |
+
"vpelloin/MEDIA_NLU-flaubert_oral_asr"
|
62 |
+
)
|
63 |
+
model = AutoModelForTokenClassification.from_pretrained(
|
64 |
+
"vpelloin/MEDIA_NLU-flaubert_oral_asr"
|
65 |
+
)
|
66 |
|
67 |
sentences = [
|
68 |
"je voudrais réserver une chambre à paris pour demain et lundi",
|
|
|
72 |
]
|
73 |
inputs = tokenizer(sentences, padding=True, return_tensors='pt')
|
74 |
outptus = model(**inputs).logits
|
75 |
+
print([
|
76 |
+
[model.config.id2label[i] for i in b]
|
77 |
+
for b in outptus.argmax(dim=-1).tolist()
|
78 |
+
])
|
79 |
```
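Because the batch is tokenized with `padding=True`, the nested list printed above also contains a predicted label for every padded position in the shorter sentences. A minimal sketch of one way to drop those positions is shown below. The helper name `decode_predictions`, the toy label map, and the toy values are assumptions made for illustration; they are not part of the model or of the `transformers` API, and the real MEDIA concept tags differ.

```python
from typing import Dict, List


def decode_predictions(
    pred_ids: List[List[int]],
    attention_mask: List[List[int]],
    id2label: Dict[int, str],
) -> List[List[str]]:
    """Map predicted label ids to label names, skipping padded positions."""
    decoded = []
    for ids, mask in zip(pred_ids, attention_mask):
        # Keep a position only when its attention-mask value is 1 (a real token).
        decoded.append([id2label[i] for i, keep in zip(ids, mask) if keep == 1])
    return decoded


# Hypothetical label map and toy values, for illustration only.
toy_id2label = {0: "O", 1: "B-example", 2: "I-example"}
toy_pred_ids = [[0, 1, 2, 0], [1, 0, 0, 0]]
toy_attention_mask = [[1, 1, 1, 1], [1, 1, 0, 0]]

print(decode_predictions(toy_pred_ids, toy_attention_mask, toy_id2label))
```

With the objects from the snippet above, the arguments would presumably be `outptus.argmax(dim=-1).tolist()`, `inputs["attention_mask"].tolist()`, and `model.config.id2label`. Note that this only filters padding: special tokens and sub-word pieces added by the tokenizer still receive a label each.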
|
80 |
|
81 |
## Reference
|