ArtifactAI committed on
Commit 9bed08f · 1 Parent(s): 9439694

Create README.md

Files changed (1)
  1. README.md +108 -0
README.md ADDED
@@ -0,0 +1,108 @@
+ # Table of Contents
+
+ 0. [TL;DR](#tldr)
+ 1. [Model Details](#model-details)
+ 2. [Usage](#usage)
+ 3. [Training Details](#training-details)
+ 4. [Citation](#citation)
+
+ # TL;DR
+
+ This is a FLAN-T5 base model fine-tuned on [ArtifactAI/arxiv-cs-ml-instruct-tune-50k](https://huggingface.co/datasets/ArtifactAI/arxiv-cs-ml-instruct-tune-50k). It is intended for research purposes only and *should not be used in production settings*. Its output is highly unreliable.
+
+ # Model Details
+
+ ## Model Description
+
+ - **Model type:** Language model
+ - **Language(s) (NLP):** English
+ - **License:** Apache 2.0
+ - **Related Models:** [All FLAN-T5 Checkpoints](https://huggingface.co/models?search=flan-t5)
+
+ # Usage
+
+ The example scripts below show how to use the model with `transformers`:
+
+ ## Using the PyTorch model
+
+ ### Running the model on a CPU
+
+ <details>
+ <summary> Click to expand </summary>
+
+ ```python
+ # pip install transformers sentencepiece
+ from transformers import T5Tokenizer, T5ForConditionalGeneration
+
+ # Load the tokenizer and model weights from the Hugging Face Hub
+ tokenizer = T5Tokenizer.from_pretrained("ArtifactAI/flan-t5-base-arxiv-cs-ml-question-answering")
+ model = T5ForConditionalGeneration.from_pretrained("ArtifactAI/flan-t5-base-arxiv-cs-ml-question-answering")
+
+ input_text = "What is an LSTM?"
+ input_ids = tokenizer(input_text, return_tensors="pt").input_ids
+
+ outputs = model.generate(input_ids)
+ # skip_special_tokens drops markers such as <pad> and </s> from the printed text
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
+
+ </details>
+
+ ### Running the model on a GPU
+
+ <details>
+ <summary> Click to expand </summary>
+
+ ```python
+ # pip install accelerate
+ from transformers import T5Tokenizer, T5ForConditionalGeneration
+
+ tokenizer = T5Tokenizer.from_pretrained("ArtifactAI/flan-t5-base-arxiv-cs-ml-question-answering")
+ # device_map="auto" relies on accelerate to place the weights on the available device(s)
+ model = T5ForConditionalGeneration.from_pretrained("ArtifactAI/flan-t5-base-arxiv-cs-ml-question-answering", device_map="auto")
+
+ input_text = "What is an LSTM?"
+ # Move the input tensor to the GPU so it sits on the same device as the model
+ input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")
+
+ outputs = model.generate(input_ids)
+ # skip_special_tokens drops markers such as <pad> and </s> from the printed text
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
+
+ </details>
+
+ ### Running the model in an HF pipeline
+
+ #### FP16
+
+ <details>
+ <summary> Click to expand </summary>
+
+ ```python
+ from transformers import pipeline
+
+ # Load the model and tokenizer from the Hugging Face Hub through a pipeline
+ qa = pipeline("summarization", model="ArtifactAI/flan-t5-base-arxiv-cs-ml-question-answering")
+
+ query = "what is an RNN?"
+ print(f"query: {query}")
+ # Prefix the question with "answer: " before passing it to the pipeline
+ res = qa("answer: " + query)
+
+ # The summarization pipeline returns a list of dicts keyed by "summary_text"
+ print(res[0]["summary_text"])
+ ```
+
+ </details>
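
The examples above call `generate` with its default settings. In `transformers`, the default length limit is short (20 tokens unless the checkpoint's generation config overrides it), so longer answers may be cut off. The following is a minimal sketch of one way to raise that limit. `build_prompt` and `answer` are hypothetical helper names, the `answer: ` prefix is copied from the pipeline example and is assumed (not confirmed) to match the training format, and `max_new_tokens=128` is an arbitrary illustrative value.

```python
MODEL_ID = "ArtifactAI/flan-t5-base-arxiv-cs-ml-question-answering"


def build_prompt(question: str) -> str:
    """Prefix the question the same way the pipeline example does (assumed format)."""
    return "answer: " + question.strip()


def answer(question: str, max_new_tokens: int = 128) -> str:
    """Generate an answer with an explicit cap on the number of new tokens."""
    # Imported lazily so build_prompt can be used without transformers installed.
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained(MODEL_ID)
    model = T5ForConditionalGeneration.from_pretrained(MODEL_ID)

    input_ids = tokenizer(build_prompt(question), return_tensors="pt").input_ids
    # max_new_tokens bounds only the generated tokens, not the prompt length.
    outputs = model.generate(input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Calling `answer("What is an LSTM?")` downloads the checkpoint on first use and returns the decoded text.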
+
+ # Training Details
+
+ ## Training Data
+
+ The model was trained on [ArtifactAI/arxiv-cs-ml-instruct-tune-50k](https://huggingface.co/datasets/ArtifactAI/arxiv-cs-ml-instruct-tune-50k), a dataset of question/answer pairs. The questions were generated with the t5-base model and the answers with the GPT-3.5-turbo model.
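
To inspect the training data, the dataset can presumably be loaded with the `datasets` library. The sketch below assumes the dataset is publicly available on the Hugging Face Hub under the ID given above. `describe_dataset` is a hypothetical helper, and because the column names are not documented here, it reports them rather than assuming any.

```python
DATASET_ID = "ArtifactAI/arxiv-cs-ml-instruct-tune-50k"


def describe_dataset(dataset_id: str = DATASET_ID) -> dict:
    """Return a mapping of split name to column names for the dataset."""
    # Imported lazily; requires `pip install datasets` and network access.
    from datasets import load_dataset

    splits = load_dataset(dataset_id)  # a DatasetDict keyed by split name
    return {name: splits[name].column_names for name in splits}
```

Calling `describe_dataset()` downloads the dataset on first use, so it is best run once and the result inspected before writing code that depends on specific columns.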
+
+ # Citation
+
+ ```
+ @misc{flan-t5-base-arxiv-cs-ml-question-answering,
+ title={flan-t5-base-arxiv-cs-ml-question-answering},
+ author={Matthew Kenney},
+ year={2023}
+ }
+ ```