Update app.py
app.py
CHANGED
@@ -1,104 +1,44 @@
 import streamlit as st

 st.markdown('''
-| Welcome |
-| --- |
-| Add a bookmark for this classroom organization with examples and links for the session |

-| AI Pair Programming with GPT |
-| 1️⃣ Open [ChatGPT](https://chat.openai.com/chat) and [Huggingface](https://huggingface.co/awacke1) in separate browser windows |
-| 2️⃣ Use prompts to generate a streamlit program on Huggingface or locally |
-| 3️⃣ For advanced work, add Python 3.10 and VSCode locally |

-| YouTube University Method |
-| --- |
-| 1️⃣ Plan 2 hours each weekday for exercise and learning |
-| 2️⃣ Create a YouTube playlist for learning, and watch at faster speed |
-| 3️⃣ Practice note-taking in markdown |

-| 2023 AI/ML Learning Playlists |
-| --- |
-| 1️⃣ [AI News](https://www.youtube.com/playlist?list=PLHgX2IExbFotMOKWOErYeyHSiikf6RTeX) |
-| 2️⃣ [ChatGPT Code Interpreter](https://www.youtube.com/playlist?list=PLHgX2IExbFou1pOQMayB7PArCalMWLfU-) |
-| ... |

-| Open Datasets for Health Care |
-| --- |
-| [Kaggle](https://www.kaggle.com/datasets), [NLM UMLS](https://www.nlm.nih.gov/research/umls/index.html), [LOINC](https://loinc.org/downloads/), [ICD10 Diagnosis](https://www.cms.gov/medicare/icd-10/2022-icd-10-cm), [ICD11](https://icd.who.int/dev11/downloads), [Papers, Code, Datasets for SOTA in Medicine](https://paperswithcode.com/datasets?q=medical&v=lst&o=newest), [Mental](https://paperswithcode.com/datasets?q=mental&v=lst&o=newest), [Behavior](https://paperswithcode.com/datasets?q=behavior&v=lst&o=newest), [CMS Downloads](https://www.cms.gov/medicare-coverage-database/downloads/downloads.aspx), [CMS CPT and HCPCS Procedures and Services](https://www.cms.gov/medicare/fraud-and-abuse/physicianselfreferral/list_of_codes) |

-| Azure Development Architectures in 2023 |
-| --- |
-| [ChatGPT](https://azure.github.io/awesome-azd/?tags=chatgpt), [Azure OpenAI Services](https://azure.github.io/awesome-azd/?tags=openai), [Python](https://azure.github.io/awesome-azd/?tags=python), [AI LLM Architecture - Guidance by MS](https://github.com/microsoft/guidance) |

-| Starter Prompts for AIPP |
-| --- |
-| Write a streamlit program that demonstrates Data synthesis... |
-| Synthesize data from multiple sources to create new datasets... |
-| Use two datasets and demonstrate pandas dataframe query merge and join... |
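To make the last starter prompt above concrete, here is a minimal sketch of the kind of app it asks for. The two tiny DataFrames, the `city` key, and all values are invented purely for illustration and are not part of the original content.

```python
# Hypothetical sketch for the "query, merge and join" starter prompt above.
import pandas as pd
import streamlit as st

# Two small invented datasets sharing a "city" key.
population = pd.DataFrame({"city": ["Minneapolis", "St. Paul", "Duluth"],
                           "population": [429000, 311000, 86000]})
hospitals = pd.DataFrame({"city": ["Minneapolis", "St. Paul", "Rochester"],
                          "hospitals": [6, 4, 3]})

st.subheader("Inner merge on the shared key")
st.dataframe(population.merge(hospitals, on="city", how="inner"))

st.subheader("Left join keeps every city in the population table")
st.dataframe(population.merge(hospitals, on="city", how="left"))

st.subheader("Query the merged result")
merged = population.merge(hospitals, on="city", how="inner")
st.dataframe(merged.query("population > 300000"))
```

Run it with `streamlit run app.py`; swapping in two real datasets is the actual exercise.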

-| Large Language Models |
-| --- |
-| BigScience-tr11-176B - 176 billion |
-| GPT-3 - 175 billion |
-| ... |

-| ChatGPT Datasets |
-| --- |
-| WebText, Common Crawl, BooksCorpus, English Wikipedia, Toronto Books Corpus, OpenWebText |

-| Big Science Model |
-| --- |
-| [BLOOM: A 176B-Parameter Open-Access Multilingual Language Model](https://arxiv.org/abs/2211.05100) |

-''')

-st.markdown('''

-Rewrite this content as markdown tables with emojis. Wherever possible rearrange the content to cluster around like topics and the examples list. Reduce the content in size significantly. Feature the content as markdown tables only and show markdown code. Content: Welcome - This classroom organization holds examples and links for this session.
 Begin by adding a bookmark.

 # Examples and Exercises - Create These Spaces in Your Account and Test / Modify

 ## Easy Examples
-1. FastSpeech
-2. Memory
-3. StaticHTML5PlayCanvas
-4. 3DHuman
-5. TranscriptAILearnerFromYoutube
-6. AnimatedGifGallery
-7. VideoToAnimatedGif

 ## Hard Examples:
-8. ChatGPTandLangChain
-a. Keys: https://platform.openai.com/account/api-keys
-9. MultiPDFQAChatGPTLangchain

 # Two easy ways to turbo boost your AI learning journey - Let's go 100X!

 # AI Pair Programming with GPT
 ### Open 2 Browsers to:
-1.
-2.
 1. Use prompts to generate a streamlit program on Huggingface or locally to test it.
 2. For advanced work, add Python 3.10 and VSCode locally, and debug as gradio or streamlit apps.
 3. Use these two superpower processes to reduce the time it takes you to make a new AI program!

 # YouTube University Method:
 1. Plan two hours each weekday to exercise your body and brain.
 2. Make a playlist of videos you want to learn from on YouTube. Save the links to edit later.
@@ -107,156 +47,82 @@ Begin by adding a bookmark.
 5. Practice note-taking in markdown to instantly save what you want to remember. Share your notes with others!
 6. AI Pair Programming Using Long Answer Language Models with Human Feedback

 ## 2023 AI/ML Learning Playlists for ChatGPT, LLMs, Recent Events in AI:
-1. AI News
-2. ChatGPT Code Interpreter
-3. Ilya Sutskever and Sam Altman
-4. Andrew Huberman on Neuroscience and Health
-5. Andrej Karpathy
-6. Medical Futurist on GPT
-7. ML APIs

 | BigScience-tr11-176B | 176 billion |
-| GPT-3
-| OpenAI's DALL-E 2.0 | 500 million
-| NVIDIA's Megatron | 8.3 billion
-| Transformer-XL
-| XLNet

-- **WebText:** A dataset of web pages crawled from domains on the Alexa top 5,000 list. This dataset was used to pretrain GPT-2.
-- [WebText: A Large-Scale Unsupervised Text Corpus by Radford et al.](https://paperswithcode.com/dataset/webtext)
-- **Common Crawl:** A dataset of web pages from a variety of domains, which is updated regularly. This dataset was used to pretrain GPT-3.
-- [Language Models are Few-Shot Learners](https://paperswithcode.com/dataset/common-crawl) by Brown et al.
-- **BooksCorpus:** A dataset of over 11,000 books from a variety of genres.
-- [Scalable Methods for 8 Billion Token Language Modeling](https://paperswithcode.com/dataset/bookcorpus) by Zhu et al.
-- **English Wikipedia:** A dump of the English-language Wikipedia as of 2018, with articles from 2001-2017.
-- [Improving Language Understanding by Generative Pre-Training](https://huggingface.co/spaces/awacke1/WikipediaUltimateAISearch?logs=build) Space for Wikipedia Search
-- **Toronto Books Corpus:** A dataset of over 7,000 books from a variety of genres, collected by the University of Toronto.
-- [Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond](https://paperswithcode.com/dataset/bookcorpus) by Schwenk and Douze.
-- **OpenWebText:** A dataset of web pages that were filtered to remove content that was likely to be low-quality or spammy. This dataset was used to pretrain GPT-3.
-- [Language Models are Few-Shot Learners](https://paperswithcode.com/dataset/openwebtext) by Brown et al.

-## Big Science Model
-- Papers:
-1. BLOOM: A 176B-Parameter Open-Access Multilingual Language Model [Paper](https://arxiv.org/abs/2211.05100)
-2. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism [Paper](https://arxiv.org/abs/1909.08053)
-3. 8-bit Optimizers via Block-wise Quantization [Paper](https://arxiv.org/abs/2110.02861)
-4. Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation [Paper](https://arxiv.org/abs/2108.12409)
-5. [Other papers related to Big Science](https://huggingface.co/models?other=doi:10.57967/hf/0003)
-6. [217 other models optimized for use with Bloom](https://huggingface.co/models?other=bloom)

-- Datasets:
-**Datasets:**
-1. **Universal Dependencies:** A collection of annotated corpora for natural language processing in a range of languages, with a focus on dependency parsing.
-- [Universal Dependencies official website.](https://universaldependencies.org/)
-2. **WMT 2014:** The fourth edition of the Workshop on Statistical Machine Translation, featuring shared tasks on translating between English and various other languages.
-- [WMT14 website.](http://www.statmt.org/wmt14/)
-3. **The Pile:** An English language corpus of diverse text, sourced from various places on the internet.
-- [The Pile official website.](https://pile.eleuther.ai/)
-4. **HumanEval:** A dataset of English sentences, annotated with human judgments on a range of linguistic qualities.
-- [HumanEval: An Evaluation Benchmark for Language Understanding](https://github.com/google-research-datasets/humaneval) by Gabriel Ilharco, Daniel Loureiro, Pedro Rodriguez, and Afonso Mendes.
-5. **FLORES-101:** A dataset of parallel sentences in 101 languages, designed for multilingual machine translation.
-- [FLORES-101: A Massively Multilingual Parallel Corpus for Language Understanding](https://flores101.opennmt.net/) by Aman Madaan, Shruti Rijhwani, Raghav Gupta, and Mitesh M. Khapra.
-6. **CrowS-Pairs:** A dataset of sentence pairs, designed for evaluating the plausibility of generated text.
-- [CrowS-Pairs: A Challenge Dataset for Plausible Plausibility Judgments](https://github.com/stanford-cogsci/crows-pairs) by Andrea Madotto, Zhaojiang Lin, Chien-Sheng Wu, Pascale Fung, and Caiming Xiong.
-7. **WikiLingua:** A dataset of parallel sentences in 75 languages, sourced from Wikipedia.
-- [WikiLingua: A New Benchmark Dataset for Cross-Lingual Wikification](https://arxiv.org/abs/2105.08031) by Jiarui Yao, Yanqiao Zhu, Ruihan Bao, Guosheng Lin, Lidong Bing, and Bei Shi.
-8. **MTEB:** A dataset of English sentences, annotated with their entailment relationships with respect to other sentences.
-- [Multi-Task Evaluation Benchmark for Natural Language Inference](https://github.com/google-research-datasets/mteb) by Michał Lukasik, Marcin Junczys-Dowmunt, and Houda Bouamor.
-9. **xP3:** A dataset of English sentences, annotated with their paraphrase relationships with respect to other sentences.
-- [xP3: A Large-Scale Evaluation Benchmark for Paraphrase Identification in Context](https://github.com/nyu-dl/xp3) by Aniket Didolkar, James Mayfield, Markus Saers, and Jason Baldridge.
-10. **DiaBLa:** A dataset of English dialogue, annotated with dialogue acts.
-- [A Large-Scale Corpus for Conversation Disentanglement](https://github.com/HLTCHKUST/DiaBLA) by Samuel Broscheit, António Branco, and André F. T. Martins.

-- Dataset Papers with Code
-1. [Universal Dependencies](https://paperswithcode.com/dataset/universal-dependencies)
-2. [WMT 2014](https://paperswithcode.com/dataset/wmt-2014)
-3. [The Pile](https://paperswithcode.com/dataset/the-pile)
-4. [HumanEval](https://paperswithcode.com/dataset/humaneval)
-5. [FLORES-101](https://paperswithcode.com/dataset/flores-101)
-6. [CrowS-Pairs](https://paperswithcode.com/dataset/crows-pairs)
-7. [WikiLingua](https://paperswithcode.com/dataset/wikilingua)
-8. [MTEB](https://paperswithcode.com/dataset/mteb)
-9. [xP3](https://paperswithcode.com/dataset/xp3)
-10. [DiaBLa](https://paperswithcode.com/dataset/diabla)

-# Deep RL ML Strategy
-The AI strategies are:
-- Language Model Preparation using Human Augmented with Supervised Fine Tuning
-- Reward Model Training with Prompts Dataset Multi-Model Generate Data to Rank
-- Fine Tuning with Reinforcement Reward and Distance Distribution Regret Score
-- Proximal Policy Optimization Fine Tuning
-- Variations - Preference Model Pretraining
-- Use Ranking Datasets Sentiment - Thumbs Up/Down, Distribution
-- Online Version Getting Feedback
-- OpenAI - InstructGPT - Humans generate LM Training Text
-- DeepMind - Advantage Actor Critic Sparrow, GopherCite
-- Reward Model Human Preference Feedback
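The reward-model step in the list above is easiest to see as a pairwise ranking loss: the model scores a human-preferred and a rejected completion and is trained to push the preferred score higher. The sketch below is a generic illustration of that loss in plain Python, following the common InstructGPT-style formulation rather than any specific library; the scores are invented.

```python
import math

def reward_ranking_loss(score_chosen: float, score_rejected: float) -> float:
    """Pairwise loss used to train a reward model from human preference rankings.

    The loss is -log(sigmoid(r_chosen - r_rejected)): it is small when the
    reward model already scores the human-preferred completion higher.
    """
    margin = score_chosen - score_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Invented scores for two completions of the same prompt (illustration only).
print(reward_ranking_loss(score_chosen=2.1, score_rejected=0.3))   # small loss
print(reward_ranking_loss(score_chosen=-0.5, score_rejected=1.2))  # large loss
```

Summed over many ranked pairs, this is the objective that precedes the PPO fine-tuning step mentioned in the same list.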

-For more information on specific techniques and implementations, check out the following resources:
-- OpenAI's paper on [GPT-3](https://arxiv.org/abs/2005.14165) which details their Language Model Preparation approach
-- DeepMind's paper on [SAC](https://arxiv.org/abs/1801.01290) which describes the Advantage Actor Critic algorithm
-- OpenAI's paper on [Reward Learning](https://arxiv.org/abs/1810.06580) which explains their approach to training Reward Models
-- OpenAI's blog post on [GPT-3's fine-tuning process](https://openai.com/blog/fine-tuning-gpt-3/)

 ''')
 import streamlit as st

 st.markdown('''

+---
+title: README
+emoji: π
+colorFrom: pink
+colorTo: blue
+sdk: static
+pinned: false
+---
+Welcome - This classroom organization holds examples and links for this session.
 Begin by adding a bookmark.

 # Examples and Exercises - Create These Spaces in Your Account and Test / Modify

 ## Easy Examples
+1. [FastSpeech](https://huggingface.co/spaces/AIZero2HeroBootcamp/FastSpeech2LinerGradioApp)
+2. [Memory](https://huggingface.co/spaces/AIZero2HeroBootcamp/Memory)
+3. [StaticHTML5PlayCanvas](https://huggingface.co/spaces/AIZero2HeroBootcamp/StaticHTML5Playcanvas)
+4. [3DHuman](https://huggingface.co/spaces/AIZero2HeroBootcamp/3DHuman)
+5. [TranscriptAILearnerFromYoutube](https://huggingface.co/spaces/AIZero2HeroBootcamp/TranscriptAILearnerFromYoutube)
+6. [AnimatedGifGallery](https://huggingface.co/spaces/AIZero2HeroBootcamp/AnimatedGifGallery)
+7. [VideoToAnimatedGif](https://huggingface.co/spaces/AIZero2HeroBootcamp/VideoToAnimatedGif)

 ## Hard Examples:
+8. [ChatGPTandLangChain](https://huggingface.co/spaces/AIZero2HeroBootcamp/ChatGPTandLangchain)
+   a. Keys: [API Keys](https://platform.openai.com/account/api-keys)
+9. [MultiPDFQAChatGPTLangchain](https://huggingface.co/spaces/AIZero2HeroBootcamp/MultiPDF-QA-ChatGPT-Langchain)
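Both hard examples need an OpenAI key at runtime. A minimal sketch of the setup, assuming the conventional `OPENAI_API_KEY` name (on a Hugging Face Space you would add it under Settings as a repository secret, which is then exposed as an environment variable):

```python
import os
import streamlit as st

# OPENAI_API_KEY is an assumed conventional name, not mandated by these Spaces.
openai_api_key = os.getenv("OPENAI_API_KEY", "")

if not openai_api_key:
    st.error("Set the OPENAI_API_KEY environment variable (or Space secret) first.")
    st.stop()

# Downstream code (e.g. LangChain or the openai SDK) can now pick up the key.
```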

 # Two easy ways to turbo boost your AI learning journey - Let's go 100X!

 # AI Pair Programming with GPT
 ### Open 2 Browsers to:
+1. [ChatGPT](https://chat.openai.com/chat) or the [OpenAI Playground](https://platform.openai.com/playground), and
+2. [Huggingface](https://huggingface.co/awacke1) in separate browser windows.
 1. Use prompts to generate a streamlit program on Huggingface or locally to test it (a minimal sketch of such a program follows this list).
 2. For advanced work, add Python 3.10 and VSCode locally, and debug as gradio or streamlit apps.
 3. Use these two superpower processes to reduce the time it takes you to make a new AI program!
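As a concrete target for step 1, here is a minimal sketch of the kind of app a prompt might produce; everything in it is a placeholder to adapt, not part of the original lesson content.

```python
# Minimal Streamlit app to paste into app.py on a Space, or run locally with
# `streamlit run app.py`. All widgets and text are placeholders to adapt via prompts.
import streamlit as st

st.title("Prompt-generated Streamlit demo")

name = st.text_input("What should I call you?", value="friend")
excitement = st.slider("How excited are you about AI pair programming?", 0, 10, 7)

if st.button("Say hello"):
    st.write(f"Hello {name}! Excitement level: {excitement}/10")
```

Run it locally to debug in VSCode, or paste it into a new Space to test the same flow there.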

 # YouTube University Method:
 1. Plan two hours each weekday to exercise your body and brain.
 2. Make a playlist of videos you want to learn from on YouTube. Save the links to edit later.

 5. Practice note-taking in markdown to instantly save what you want to remember. Share your notes with others!
 6. AI Pair Programming Using Long Answer Language Models with Human Feedback

 ## 2023 AI/ML Learning Playlists for ChatGPT, LLMs, Recent Events in AI:
+1. [AI News](https://www.youtube.com/playlist?list=PLHgX2IExbFotMOKWOErYeyHSiikf6RTeX)
+2. [ChatGPT Code Interpreter](https://www.youtube.com/playlist?list=PLHgX2IExbFou1pOQMayB7PArCalMWLfU-)
+3. [Ilya Sutskever and Sam Altman](https://www.youtube.com/playlist?list=PLHgX2IExbFovr66KW6Mqa456qyY-Vmvw-)
+4. [Andrew Huberman on Neuroscience and Health](https://www.youtube.com/playlist?list=PLHgX2IExbFotRU0jl_a0e0mdlYU-NWy1r)
+5. [Andrej Karpathy](https://www.youtube.com/playlist?list=PLHgX2IExbFovbOFCgLNw1hRutQQKrfYNP)
+6. [Medical Futurist on GPT](https://www.youtube.com/playlist?list=PLHgX2IExbFosVaCMZCZ36bYqKBYqFKHB2)
+7. [ML APIs](https://www.youtube.com/playlist?list=PLHg
+
+- Source Code:
+1. [BigScience (GitHub)](https://github.com/bigscience-workshop/bigscience)
+
+## GPT-3 Performance:
+
+- GPT-3, while less performant than BigScience, has found widespread use due to its availability through the OpenAI API, making it easier for developers to incorporate the model into their applications without requiring substantial computational resources.
+- While the GPT-3 model has 175 billion parameters, its performance is considered slightly less than the newer BigScience model. However, the specific performance of each model can vary depending on the task.
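Since the point above is that GPT-3 is reachable through the OpenAI API, here is a minimal sketch of such a call using the legacy (pre-1.0) `openai` Python SDK interface; the model name and prompt are placeholders, and the key is read from the environment as in the earlier setup sketch.

```python
import os
import openai  # legacy (pre-1.0) SDK interface

openai.api_key = os.environ["OPENAI_API_KEY"]

# Model name and prompt are placeholders for illustration.
response = openai.Completion.create(
    model="text-davinci-003",
    prompt="Write a one-sentence summary of what a Streamlit app is.",
    max_tokens=60,
    temperature=0.2,
)
print(response["choices"][0]["text"].strip())
```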
+
+## DALL-E 2.0 Overview
+
+- DALL-E 2.0 is an AI model developed by OpenAI that generates images from textual descriptions.
+- It has 500 million parameters and uses a dataset curated by OpenAI, consisting of a diverse range of images from the internet.
+
+## NVIDIA's Megatron Overview
+
+- Megatron is a large-scale transformer model developed by NVIDIA. It's primarily designed for tasks that require understanding the context of large pieces of text.
+- It has 8.3 billion parameters and is trained on a variety of text data from the internet.
+
+## Transformer-XL Overview
+
+- Transformer-XL is an AI model developed by Google Brain, which introduces a novel recurrence mechanism and relative positional encoding scheme.
+- It has 250 million parameters and uses a variety of datasets for training, including BooksCorpus and English Wikipedia.
+
+## XLNet Overview
+
+- XLNet is a generalized autoregressive model that outperforms BERT on several benchmarks.
+- It has 210 million parameters and uses a variety of datasets for training, including BooksCorpus and English Wikipedia.
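Several of the model families surveyed here have public checkpoints on the Hugging Face Hub. A minimal sketch of loading one with the `transformers` library is shown below; `bigscience/bloom-560m` is an assumed example (a small BLOOM variant), and any other hosted causal-LM checkpoint would follow the same pattern.

```python
# Minimal sketch: load a hosted checkpoint and generate a short continuation.
from transformers import AutoTokenizer, AutoModelForCausalLM

checkpoint = "bigscience/bloom-560m"  # assumed example; swap for another Hub model id

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

inputs = tokenizer("Streamlit apps make AI demos", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```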
+
+<h1><center>AI Model Comparison</center></h1>
+
+| Model Name | Model Size (in Parameters) | Model Overview |
+| --- | --- | --- |
+| BigScience-tr11-176B | 176 billion | BigScience is the latest AI model developed by the Big Science Workshop. It has 176 billion parameters and uses a combination of text data from the internet and scientific literature for training. |
+| GPT-3 | 175 billion | GPT-3 is an AI model developed by OpenAI, which has 175 billion parameters and uses a variety of datasets for training, including Common Crawl, BooksCorpus, and English Wikipedia. |
+| OpenAI's DALL-E 2.0 | 500 million | DALL-E 2.0 is an AI model developed by OpenAI that generates images from textual descriptions. It has 500 million parameters and uses a dataset curated by OpenAI. |
+| NVIDIA's Megatron | 8.3 billion | Megatron is a large-scale transformer model developed by NVIDIA. It's primarily designed for tasks that require understanding the context of large pieces of text. |
+| Transformer-XL | 250 million | Transformer-XL is an AI model developed by Google Brain, which introduces a novel recurrence mechanism and relative positional encoding scheme. |
+| XLNet | 210 million | XLNet is a generalized autoregressive model that outperforms BERT on several benchmarks. |
+
+## References:
+
+1. [BigScience - A 176B-Parameter Open-Access Multilingual Language Model](https://arxiv.org/abs/2211.05100)
+2. [GPT-3 - Language Models are Few-Shot Learners](https://arxiv.org/abs/2005.14165)
+3. [DALL-E 2.0 - Generative Pretraining from Pixels](https://openai.com/research/dall-e/)
+4. [Megatron - Training Multi-Billion Parameter Language Models Using GPU Model Parallelism](https://arxiv.org/abs/1909.08053)
+5. [Transformer-XL - Transformers with Longer-Range Dependencies](https://arxiv.org/abs/1901.02860)
+6. [XLNet - Generalized Autoregressive Pretraining for Language Understanding](https://arxiv.org/abs/1906.08237)
+
+| Model Name | Model Size (in Parameters) |
+| --- | --- |
 | BigScience-tr11-176B | 176 billion |
+| GPT-3 | 175 billion |
+| OpenAI's DALL-E 2.0 | 500 million |
+| NVIDIA's Megatron | 8.3 billion |
+| Transformer-XL | 250 million |
+| XLNet | 210 million |
+
+| Model Name | Model Size (in Parameters) | Model Overview |
+| --- | --- | --- |
+| BigScience-tr11-176B | 176 billion | Uses a combination of text data from the internet and scientific literature for training. |
+| GPT-3 | 175 billion | Uses a variety of datasets for training, including Common Crawl, BooksCorpus, and English Wikipedia. |

 ''')