readme draft 2
Browse files
README.md
CHANGED
@@ -140,35 +140,55 @@ tokens:
|
|
140 |
|
141 |
# 📊 Datasets
|
142 |
|
|
|
|
|
|
|
|
|
143 |
Following datasets were used in this model:
|
144 |
|
145 |
-
- [MATH](https://huggingface.co/datasets/dahendrycks/competition_math)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
146 |
|
147 |
-
- [
|
148 |
|
149 |
-
- [
|
150 |
|
151 |
-
- [
|
152 |
|
153 |
-
- [
|
154 |
|
155 |
-
- [
|
156 |
|
157 |
-
- [
|
158 |
|
159 |
-
- [
|
160 |
|
161 |
-
- [
|
162 |
|
163 |
-
|
164 |
|
165 |
-
|
166 |
|
167 |
-
|
168 |
|
169 |
-
|
170 |
|
171 |
-
- [
|
|
|
|
|
|
|
|
|
|
|
|
|
172 |
|
173 |
# 💬 Prompt Template
|
174 |
|
@@ -193,14 +213,20 @@ tokens = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
|
|
193 |
|
194 |
# 🤝 Acknowledgments
|
195 |
|
196 |
-
Thanks to [
|
|
|
|
|
197 |
|
198 |
Thanks to [Together AI](https://www.together.ai) for providing everyone with free credits, which I used to generate a dataset in multiple choice to explanations format.
|
199 |
|
|
|
|
|
200 |
Thanks to all the dataset authors mentioned in the datasets section.
|
201 |
|
202 |
Thanks to [axolotl](https://github.com/OpenAccess-AI-Collective/axolotl) for making the repository I used to make this model.
|
203 |
|
|
|
|
|
204 |
[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
|
205 |
|
206 |
If you would like to support me:
|
|
|
140 |
|
141 |
# 📊 Datasets
|
142 |
|
143 |
+
You can find the dataset I used and the work I am doing with this datasets here:
|
144 |
+
|
145 |
+
https://huggingface.co/datasets/Weyaxi/sci-datasets
|
146 |
+
|
147 |
Following datasets were used in this model:
|
148 |
|
149 |
+
- 📐 [MATH](https://huggingface.co/datasets/dahendrycks/competition_math)
|
150 |
+
|
151 |
+
- 🧠 [ARC](https://huggingface.co/datasets/allenai/ai2_arc) (Note: Only **train** part)
|
152 |
+
|
153 |
+
- 🧲 [camel-ai/physics](https://huggingface.co/datasets/camel-ai/physics)
|
154 |
+
|
155 |
+
- ⚗️ [camel-ai/chemistry](https://huggingface.co/datasets/camel-ai/chemistry)
|
156 |
+
|
157 |
+
- 🦠 [camel-ai/biology](https://huggingface.co/datasets/camel-ai/biology)
|
158 |
+
|
159 |
+
- 📊 [camel-ai/math](https://huggingface.co/datasets/camel-ai/math)
|
160 |
|
161 |
+
- ⚡ [STEM-AI-mtl/Electrical-engineering](https://huggingface.co/datasets/STEM-AI-mtl/Electrical-engineering)
|
162 |
|
163 |
+
- 📚 [openbookqa](https://huggingface.co/datasets/openbookqa)
|
164 |
|
165 |
+
- 🧠 [piqa](https://huggingface.co/datasets/piqa)
|
166 |
|
167 |
+
- 🎨 [reclor](https://huggingface.co/datasets/metaeval/reclor)
|
168 |
|
169 |
+
- 🔬 [scibench](https://github.com/mandyyyyii/scibench)
|
170 |
|
171 |
+
- 🧪 [ScienceQA](https://huggingface.co/datasets/derek-thomas/ScienceQA)
|
172 |
|
173 |
+
- 🧬 [sciq](https://huggingface.co/datasets/sciq)
|
174 |
|
175 |
+
- 📝 [ScienceEval](https://huggingface.co/datasets/TIGER-Lab/ScienceEval)
|
176 |
|
177 |
+
## 🛠️ Multiple Choice Question & Answer Datasets Conversion Progress
|
178 |
|
179 |
+
I used [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) to generate a reasonable and logical answer by providing it with the question and the answer key.
|
180 |
|
181 |
+
I used the [Together AI](https://www.together.ai) API for this task.
|
182 |
|
183 |
+
The following datasets are converted using this method:
|
184 |
|
185 |
+
- 🧠 [ARC](https://huggingface.co/datasets/allenai/ai2_arc) (Note: Only **train** part)
|
186 |
+
|
187 |
+
- 📚 [openbookqa](https://huggingface.co/datasets/openbookqa)
|
188 |
+
|
189 |
+
- 🎨 [reclor](https://huggingface.co/datasets/metaeval/reclor)
|
190 |
+
|
191 |
+
- 🧬 [sciq](https://huggingface.co/datasets/sciq)
|
192 |
|
193 |
# 💬 Prompt Template
|
194 |
|
|
|
213 |
|
214 |
# 🤝 Acknowledgments
|
215 |
|
216 |
+
Thanks to [openchat](https://huggingface.co/openchat) team for fine-tuning an excellent model that I used as a base model.
|
217 |
+
|
218 |
+
Thanks to [@jondurbin](https://huggingface.co/jondurbin) for reformatting codes for some datasets: [bagel/data_sources](https://github.com/jondurbin/bagel/tree/main/bagel/data_sources)
|
219 |
|
220 |
Thanks to [Together AI](https://www.together.ai) for providing everyone with free credits, which I used to generate a dataset in multiple choice to explanations format.
|
221 |
|
222 |
+
Thanks to [Tim Dettmers](https://huggingface.co/timdettmers) for his excellent [QLoRA](https://arxiv.org/abs/2305.14314) work.
|
223 |
+
|
224 |
Thanks to all the dataset authors mentioned in the datasets section.
|
225 |
|
226 |
Thanks to [axolotl](https://github.com/OpenAccess-AI-Collective/axolotl) for making the repository I used to make this model.
|
227 |
|
228 |
+
Overall, thanks to all of the open soure AI community! 🚀
|
229 |
+
|
230 |
[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
|
231 |
|
232 |
If you would like to support me:
|