Update README.md

Implementation of the paper "How Many Layers and Why? An Analysis of the Model Depth in Transformers".
## Model architecture

We augment a multi-layer transformer encoder with a halting mechanism, which dynamically adjusts the number of layers for each token.
We directly adapted this mechanism from Graves ([2016](#graves-2016)). At each iteration, we compute a probability for each token to stop updating its state.
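The halting loop from Graves (2016) can be sketched in a few lines. This is an illustrative NumPy mock-up, not the repository's implementation: the encoder layer and halting unit are stand-in random linear maps, and all names (`layer`, `p_halt`, `updates`) are assumptions. Each token accumulates halting probability across iterations and stops once it crosses `1 - eps`; its final state is the probability-weighted sum of its intermediate states.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16            # hidden size (illustrative)
n_tokens = 4
max_layers = 12
eps = 0.01        # token halts once cumulative probability reaches 1 - eps

# stand-ins for a trained encoder layer and halting unit (assumptions)
W_layer = rng.standard_normal((d, d)) / np.sqrt(d)
w_halt = rng.standard_normal(d) / np.sqrt(d)

def layer(h):     # one transformer layer, mocked as a nonlinear map
    return np.tanh(h @ W_layer)

def p_halt(h):    # per-token probability of stopping at this iteration
    return 1.0 / (1.0 + np.exp(-(h @ w_halt)))

h = rng.standard_normal((n_tokens, d))
cum = np.zeros(n_tokens)        # cumulative halting probability per token
out = np.zeros_like(h)          # weighted combination of intermediate states
updates = np.zeros(n_tokens)    # number of layers applied per token

for _ in range(max_layers):
    active = cum < 1.0 - eps
    if not active.any():
        break
    h_new = layer(h)
    p = p_halt(h_new)
    # tokens crossing the threshold use the remainder 1 - cum as their weight
    halting_now = active & (cum + p >= 1.0 - eps)
    weight = np.where(halting_now, 1.0 - cum, p) * active
    out += weight[:, None] * h_new
    cum += weight
    updates += active
    h = np.where(active[:, None], h_new, h)

print(updates)  # each token may have used a different number of layers
```

Because halting is computed per token, easy tokens exit after few iterations while harder ones keep updating, which is what makes the effective depth adaptive.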
## Model use
The architecture is not yet included directly in the Transformers library. The code used for pre-training is available in this [GitHub repository](https://github.com/AntoineSimoulin/adaptive-depth-transformers), so you should first install the implementation:
```bash
pip install git+https://github.com/AntoineSimoulin/adaptive-depth-transformers
```
Then you can use the model directly.
```python
import sys
sys.path.append('adaptative-depth-transformers')

# ... (model loading and forward pass not shown in this diff)
outputs.updates
# tensor([[[[15., 9., 10., 7., 3., 8., 5., 7., 12., 10., 6., 8., 8., 9., 5., 8.]]]])
```
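The `updates` tensor printed above records how many encoder iterations each of the 16 input tokens received. Assuming that shape, the per-token depths can be read off with plain NumPy; this is an illustrative post-processing step, not part of the repository:

```python
import numpy as np

# per-token update counts, copied verbatim from the printed output above
updates = np.array([[[[15., 9., 10., 7., 3., 8., 5., 7.,
                       12., 10., 6., 8., 8., 9., 5., 8.]]]])

depths = updates.squeeze()   # drop the singleton dims -> shape (16,)
print(depths.mean())         # average depth across tokens -> 8.125
```

The spread of values (from 3 to 15 layers) illustrates the adaptive-depth behavior: different tokens settle after very different numbers of iterations.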
## Citations
### BibTeX entry and citation info