cointegrated committed on
Commit ca99d68 · 1 Parent(s): ea03f7c

Update README.md

Files changed (1)
  1. README.md +6 -3
README.md CHANGED
@@ -11,17 +11,20 @@ Thus, the vocabulary is 10% of the original, and number of parameters in the who
  
  To get the sentence embeddings, you can use the following code:
  ```python
+ import torch
  from transformers import AutoTokenizer, AutoModel
- tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/LaBSE")
- model = AutoModel.from_pretrained("sentence-transformers/LaBSE")
- sentences = ["Hello World", "Hallo Welt"]
+ tokenizer = AutoTokenizer.from_pretrained("cointegrated/LaBSE-en-ru")
+ model = AutoModel.from_pretrained("cointegrated/LaBSE-en-ru")
+ sentences = ["Hello World", "Привет Мир"]
  encoded_input = tokenizer(sentences, padding=True, truncation=True, max_length=64, return_tensors='pt')
  with torch.no_grad():
      model_output = model(**encoded_input)
  embeddings = model_output.pooler_output
  embeddings = torch.nn.functional.normalize(embeddings)
  print(embeddings)
+ ```
  
  ## Reference:
  Fangxiaoyu Feng, Yinfei Yang, Daniel Cer, Narveen Ari, Wei Wang. [Language-agnostic BERT Sentence Embedding](https://arxiv.org/abs/2007.01852). July 2020
+
  License: [https://tfhub.dev/google/LaBSE/1](https://tfhub.dev/google/LaBSE/1)
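
Note on the `torch.nn.functional.normalize` step in the README snippet: with its default arguments it scales each row to unit L2 norm, so the dot product of two resulting rows is their cosine similarity. The sketch below illustrates that relationship in plain Python, without loading the model. The helper names and the 3-dimensional vectors are hypothetical stand-ins for real embeddings, not output of `cointegrated/LaBSE-en-ru`.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine_similarity(a, b):
    """Dot product of the two L2-normalized vectors, i.e. their cosine similarity."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


# Hypothetical toy vectors standing in for two sentence embeddings.
emb_en = [3.0, 4.0, 0.0]
emb_ru = [4.0, 3.0, 0.0]

similarity = cosine_similarity(emb_en, emb_ru)
print(round(similarity, 2))
```

With real model output, the equivalent comparison would presumably be a matrix product of the normalized `embeddings` tensor with its own transpose, giving pairwise similarities for all input sentences.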