Sean MacAvaney committed on
Commit
fcfbc2c
·
1 Parent(s): 9fc16dc
Files changed (5)
  1. README.md +21 -6
  2. app.py +37 -0
  3. packages.txt +5 -0
  4. requirements.txt +5 -0
  5. wrapup.md +4 -0
README.md CHANGED
@@ -1,12 +1,27 @@
  ---
- title: Monot5
- emoji: πŸ‘
- colorFrom: yellow
- colorTo: yellow
+ title: PyTerrier MonoT5
+ emoji: 🐕
+ colorFrom: green
+ colorTo: green
  sdk: gradio
- sdk_version: 3.8.2
+ sdk_version: 3.7
  app_file: app.py
  pinned: false
  ---
  
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # 🐕 PyTerrier: MonoT5
+
+ This is a demonstration of [PyTerrier's T5 package](https://github.com/terrierteam/pyterrier_t5).
+
+ MonoT5 functions as a `R→R` (reranking, result-to-result) transformer and can be used in pipelines accordingly. For example, you will
+ often pipe the output of a first-stage retrieval function into MonoT5:
+
+ <div class="pipeline">
+ <div class="df" title="Query Frame">Q</div>
+ <div class="transformer" title="PisaRetrieve Transformer">TerrierRetrieve</div>
+ <div class="df" title="Result Frame">R</div>
+ <div class="transformer" title="get_text Transformer">get_text</div>
+ <div class="df" title="Result Frame">R</div>
+ <div class="transformer attn" title="MonoT5 Transformer">MonoT5</div>
+ <div class="df" title="Result Frame">R</div>
+ </div>
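The `R→R` contract described in the README hunk above (a result frame goes in, the same rows come out with new scores and ranks) can be illustrated without PyTerrier installed. The following is only a sketch under stated assumptions: `toy_score` and `rerank` are hypothetical names, the term-overlap scorer is a stand-in for MonoT5 rather than anything the model actually computes, rows are plain dicts instead of a DataFrame, and ranks are assumed to start at 0.

```python
from collections import defaultdict


def toy_score(query: str, text: str) -> float:
    """Hypothetical stand-in for MonoT5: fraction of query terms present in the text."""
    q_terms = set(query.lower().split())
    d_terms = set(text.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0


def rerank(results):
    """R -> R step: same rows and columns out, with score and rank recomputed per qid."""
    by_qid = defaultdict(list)
    for row in results:
        rescored = {**row, "score": toy_score(row["query"], row["text"])}
        by_qid[row["qid"]].append(rescored)

    out = []
    for qid in sorted(by_qid):
        # sorted() is stable, so tied rows keep their first-stage order
        ranked = sorted(by_qid[qid], key=lambda r: r["score"], reverse=True)
        for rank, row in enumerate(ranked):
            out.append({**row, "rank": rank})
    return out


# Made-up first-stage output: d1 is ranked first but shares no terms with the query.
first_stage = [
    {"qid": "1", "query": "terrier breeds", "docno": "d1",
     "text": "cats and kittens", "score": 9.1, "rank": 0},
    {"qid": "1", "query": "terrier breeds", "docno": "d2",
     "text": "common terrier breeds list", "score": 7.4, "rank": 1},
]

reranked = rerank(first_stage)
print([(r["docno"], r["rank"]) for r in reranked])  # [('d2', 0), ('d1', 1)]
```

Only the frame-in, frame-out shape is the point here; in the Space itself the scoring comes from `MonoT5ReRanker`, which expects the document text to be present, hence the `get_text` stage before it in the pipeline.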
app.py ADDED
@@ -0,0 +1,37 @@
+ import pandas as pd
+ import gradio as gr
+ import pyterrier as pt
+ pt.init()
+ from pyterrier_gradio import Demo, MarkdownFile, interface, df2code, code2md, EX_R
+ from pyterrier_t5 import MonoT5ReRanker
+
+ model = MonoT5ReRanker()
+
+ COLAB_NAME = 'pyterrier_t5.ipynb'
+ COLAB_INSTALL = '''
+ !pip install -q git+https://github.com/terrier-org/pyterrier_t5
+ '''.strip()
+
+ def predict(input):
+     code = f'''import pandas as pd
+ import pyterrier as pt ; pt.init()
+ from pyterrier_t5 import MonoT5ReRanker
+
+ model = MonoT5ReRanker()
+
+ model({df2code(input)})
+ '''
+     res = model(input)
+     res['score'] = res['score'].map(lambda x: round(x, 4))
+     res = res.sort_values(['qid', 'rank'])
+     return (res, code2md(code, COLAB_INSTALL, COLAB_NAME, colab=False))
+
+ interface(
+     MarkdownFile('README.md'),
+     Demo(
+         predict,
+         EX_R,
+         []
+     ),
+     MarkdownFile('wrapup.md'),
+ ).launch(share=False)
packages.txt ADDED
@@ -0,0 +1,5 @@
+ openjdk-11-jdk
+ openjdk-11-jre-headless
+ openjdk-11-jre
+ openjdk-11-jre-headless
+ debianutils
requirements.txt ADDED
@@ -0,0 +1,5 @@
+ git+https://github.com/seanmacavaney/[email protected]
+ git+https://github.com/terrier-org/pyterrier
+ git+https://github.com/terrier-org/pyterrier_t5
+ ir_datasets
+ ir_measures
wrapup.md ADDED
@@ -0,0 +1,4 @@
+ ### References & Credits
+
+ - Ronak Pradeep, Rodrigo Nogueira, and Jimmy Lin. [The Expando-Mono-Duo Design Pattern for Text Ranking with Pretrained Sequence-to-Sequence Models.](https://arxiv.org/pdf/2101.05667.pdf)
+ - Craig Macdonald, Nicola Tonellotto, Sean MacAvaney, Iadh Ounis. [PyTerrier: Declarative Experimentation in Python from BM25 to Dense Retrieval](https://dl.acm.org/doi/abs/10.1145/3459637.3482013). CIKM 2021.