awacke1 committed on
Commit
ddf1b18
·
1 Parent(s): a94f2f5

Update app.py

Files changed (1)
  1. app.py +95 -229
app.py CHANGED
@@ -1,104 +1,44 @@
1
  import streamlit as st
2
 
3
  st.markdown('''
4
- | Welcome 🎓 |
5
- | --- |
6
- | 📌 Add a bookmark for this classroom organization with examples and links for the session |
7
 
8
- | Create and Test 🔧 | Easy Examples 🔰 | Hard Examples 💠 |
9
- | --- | --- | --- |
10
- | 1️⃣ [FastSpeech](https://huggingface.co/spaces/AIZero2HeroBootcamp/FastSpeech2LinerGradioApp) | 8️⃣ [ChatGPTandLangChain](https://huggingface.co/spaces/AIZero2HeroBootcamp/ChatGPTandLangchain) |
11
- | 2️⃣ [Memory](https://huggingface.co/spaces/AIZero2HeroBootcamp/Memory) | 9️⃣ [MultiPDFQAChatGPTLangchain](https://huggingface.co/spaces/AIZero2HeroBootcamp/MultiPDF-QA-ChatGPT-Langchain) |
12
- | 3️⃣ [StaticHTML5PlayCanvas](https://huggingface.co/spaces/AIZero2HeroBootcamp/StaticHTML5Playcanvas) | 9a️⃣ [Keys](https://platform.openai.com/account/api-keys) |
13
- | ... | ... |
14
-
15
- | Turbo Boost Your AI Learning Journey 💯 |
16
- | --- |
17
- | 🤖 AI Pair Programming with GPT |
18
- | 1️⃣ Open [ChatGPT](https://chat.openai.com/chat) and [Huggingface](https://huggingface.co/awacke1) in separate browser windows |
19
- | 2️⃣ Use prompts to generate a streamlit program on Huggingface or locally |
20
- | 3️⃣ For advanced work, add Python 3.10 and VSCode locally |
21
-
22
- | YouTube University Method 🎥 |
23
- | --- |
24
- | 1️⃣ Plan 2 hours each weekday for exercise and learning |
25
- | 2️⃣ Create a YouTube playlist for learning, and watch at faster speed |
26
- | 3️⃣ Practice note-taking in markdown |
27
-
28
- | 2023 AI/ML Learning Playlists 📚 |
29
- | --- |
30
- | 1️⃣ [AI News](https://www.youtube.com/playlist?list=PLHgX2IExbFotMOKWOErYeyHSiikf6RTeX) |
31
- | 2️⃣ [ChatGPT Code Interpreter](https://www.youtube.com/playlist?list=PLHgX2IExbFou1pOQMayB7PArCalMWLfU-) |
32
- | ... |
33
-
34
- | Open Datasets for Health Care 🥫 |
35
- | --- |
36
- | [Kaggle](https://www.kaggle.com/datasets), [NLM UMLS](https://www.nlm.nih.gov/research/umls/index.html), [LOINC](https://loinc.org/downloads/), [ICD10 Diagnosis](https://www.cms.gov/medicare/icd-10/2022-icd-10-cm), [ICD11](https://icd.who.int/dev11/downloads), [Papers,Code,Datasets for SOTA in Medicine](https://paperswithcode.com/datasets?q=medical&v=lst&o=newest), [Mental](https://paperswithcode.com/datasets?q=mental&v=lst&o=newest), [Behavior](https://paperswithcode.com/datasets?q=behavior&v=lst&o=newest), [CMS Downloads](https://www.cms.gov/medicare-coverage-database/downloads/downloads.aspx), [CMS CPT and HCPCS Procedures and Services](https://www.cms.gov/medicare/fraud-and-abuse/physicianselfreferral/list_of_codes) |
37
-
38
- | Azure Development Architectures in 2023 🛠️ |
39
- | --- |
40
- | [ChatGPT](https://azure.github.io/awesome-azd/?tags=chatgpt), [Azure OpenAI Services](https://azure.github.io/awesome-azd/?tags=openai), [Python](https://azure.github.io/awesome-azd/?tags=python), [AI LLM Architecture - Guidance by MS](https://github.com/microsoft/guidance) |
41
-
42
- | Starter Prompts for AIPP 🖋️ |
43
- | --- |
44
- | Write a streamlit program that demonstrates Data synthesis... |
45
- | Synthesize data from multiple sources to create new datasets... |
46
- | Use two datasets and demonstrate pandas dataframe query merge and join... |
47
-
48
- | Large Language Models 🧠 |
49
- | --- |
50
- | BigScience-tr11-176B - 176 billion |
51
- | GPT-3 - 175 billion |
52
- | ... |
53
-
54
- | ChatGPT Datasets 📚 |
55
- | --- |
56
- | WebText, Common Crawl, BooksCorpus, English Wikipedia, Toronto Books Corpus, OpenWebText |
57
-
58
- | Big Science Model 🚀 |
59
- | --- |
60
- | [BLOOM: A 176B-Parameter
61
-
62
-
63
-
64
- ''')
65
-
66
- st.markdown('''
67
-
68
-
69
-
70
- Rewrite this content as markdown tables with emojis. Wherever possible rearrange the content to cluster around like topics and the examples list. Reduce the content in size significantly. Feature the content as markdown tables only and show markdown code. Content: Welcome - This classroom organization holds examples and links for this session.
71
  Begin by adding a bookmark.
72
 
73
  # Examples and Exercises - Create These Spaces in Your Account and Test / Modify
74
 
75
  ## Easy Examples
76
- 1. FastSpeech - https://huggingface.co/spaces/AIZero2HeroBootcamp/FastSpeech2LinerGradioApp
77
- 2. Memory - https://huggingface.co/spaces/AIZero2HeroBootcamp/Memory
78
- 3. StaticHTML5PlayCanvas - https://huggingface.co/spaces/AIZero2HeroBootcamp/StaticHTML5Playcanvas
79
- 4. 3DHuman - https://huggingface.co/spaces/AIZero2HeroBootcamp/3DHuman
80
- 5. TranscriptAILearnerFromYoutube - https://huggingface.co/spaces/AIZero2HeroBootcamp/TranscriptAILearnerFromYoutube
81
- 6. AnimatedGifGallery - https://huggingface.co/spaces/AIZero2HeroBootcamp/AnimatedGifGallery
82
- 7. VideoToAnimatedGif - https://huggingface.co/spaces/AIZero2HeroBootcamp/VideoToAnimatedGif
83
 
84
  ## Hard Examples:
85
- 8. ChatGPTandLangChain - https://huggingface.co/spaces/AIZero2HeroBootcamp/ChatGPTandLangchain
86
- a. Keys: https://platform.openai.com/account/api-keys
87
- 9. MultiPDFQAChatGPTLangchain - https://huggingface.co/spaces/AIZero2HeroBootcamp/MultiPDF-QA-ChatGPT-Langchain
88
-
89
 
90
  # 👋 Two easy ways to turbo boost your AI learning journey - Let's go 100X! 💻
91
 
92
  # 🌐 AI Pair Programming with GPT
93
  ### Open 2 Browsers to:
94
- 1. __🌐 ChatGPT__ [URL](https://chat.openai.com/chat) or [URL2](https://platform.openai.com/playground) and
95
- 2. __🌐 Huggingface__ [URL](https://huggingface.co/awacke1) in separate browser windows.
96
  1. 🤖 Use prompts to generate a streamlit program on Huggingface or locally to test it.
97
  2. 🔧 For advanced work, add Python 3.10 and VSCode locally, and debug as gradio or streamlit apps.
98
  3. 🚀 Use these two superpower processes to reduce the time it takes you to make a new AI program! ⏱️
99
 
100
-
101
-
102
  # 🎥 YouTube University Method:
103
  1. 🏋️‍♀️ Plan two hours each weekday to exercise your body and brain.
104
  2. 🎬 Make a playlist of videos you want to learn from on YouTube. Save the links to edit later.
@@ -107,156 +47,82 @@ Begin by adding a bookmark.
107
  5. 📝 Practice note-taking in markdown to instantly save what you want to remember. Share your notes with others!
108
  6. 👥 AI Pair Programming Using Long Answer Language Models with Human Feedback
109
 
110
-
111
  ## 🎥 2023 AI/ML Learning Playlists for ChatGPT, LLMs, Recent Events in AI:
112
- 1. AI News: https://www.youtube.com/playlist?list=PLHgX2IExbFotMOKWOErYeyHSiikf6RTeX
113
- 2. ChatGPT Code Interpreter: https://www.youtube.com/playlist?list=PLHgX2IExbFou1pOQMayB7PArCalMWLfU-
114
- 3. Ilya Sutskever and Sam Altman: https://www.youtube.com/playlist?list=PLHgX2IExbFovr66KW6Mqa456qyY-Vmvw-
115
- 4. Andrew Huberman on Neuroscience and Health: https://www.youtube.com/playlist?list=PLHgX2IExbFotRU0jl_a0e0mdlYU-NWy1r
116
- 5. Andrej Karpathy: https://www.youtube.com/playlist?list=PLHgX2IExbFovbOFCgLNw1hRutQQKrfYNP
117
- 6. Medical Futurist on GPT: https://www.youtube.com/playlist?list=PLHgX2IExbFosVaCMZCZ36bYqKBYqFKHB2
118
- 7. ML APIs: https://www.youtube.com/playlist?list=PLHgX2IExbFovPX9z4m61rQImM7cDDY79L
119
- 8. FastAPI and Streamlit: https://www.youtube.com/playlist?list=PLHgX2IExbFosyX2jzJJimPAI9C0FHflwB
120
- 9. AI UI UX: https://www.youtube.com/playlist?list=PLHgX2IExbFosCUPzEp4bQaygzrzXPz81w
121
- 10. ChatGPT Streamlit 2023: https://www.youtube.com/playlist?list=PLHgX2IExbFotDzxBRWwUBTb0_XFEr4Dlg
122
-
123
- ### LLM Base Model Overview and Evolutionary Tree: https://github.com/Mooler0410/LLMsPracticalGuide
124
-
125
- ## 🎥 2023 AI/ML Advanced Learning Playlists:
126
- 1. [2023 QA Models and Long Form Question Answering NLP](https://www.youtube.com/playlist?list=PLHgX2IExbFovrkkx8HMTLNgYdjCMNYmX_)
127
- 2. [FHIR Bioinformatics Development Using AI/ML and Python, Streamlit, and Gradio - 2022](https://www.youtube.com/playlist?list=PLHgX2IExbFovoMUC3hYXeFegpk_Y0Lz0Q)
128
- 3. [2023 ChatGPT for Coding Assistant Streamlit, Gradio and Python Apps](https://www.youtube.com/playlist?list=PLHgX2IExbFouOEnppexiKZVdz_k5b0pvI)
129
- 4. [2023 BigScience Bloom - Large Language Model for AI Systems and NLP](https://www.youtube.com/playlist?list=PLHgX2IExbFouqnsIqziThlPCX_miiDq14)
130
- 5. [2023 Streamlit Pro Tips for AI UI UX for Data Science, Engineering, and Mathematics](https://www.youtube.com/playlist?list=PLHgX2IExbFou3cP19hHO9Xb-cN8uwr5RM)
131
- 6. [2023 Fun, New and Interesting AI, Videos, and AI/ML Techniques](https://www.youtube.com/playlist?list=PLHgX2IExbFotoMt32SrT3Xynt5BXTGnEP)
132
- 7. [2023 Best Minds in AGI AI Gamification and Large Language Models](https://www.youtube.com/playlist?list=PLHgX2IExbFotmFeBTpyje1uI22n0GAkXT)
133
- 8. [2023 State of the Art for Vision Image Classification, Text Classification and Regression, Extractive Question Answering and Tabular Classification](https://www.youtube.com/playlist?list=PLHgX2IExbFotPcPu6pauNHOoZTTbnAQ2F)
134
- 9. [2023 AutoML DataRobot and AI Platforms for Building Models, Features, Test, and Transparency](https://www.youtube.com/playlist?list=PLHgX2IExbFovsY2oGbDwdEhPrakkC8i3g)
135
-
136
- <h1><center>🥫Open Datasets for Health Care📊</center></h1>
137
- <div align="center">Curated Datasets: <a href = "https://www.kaggle.com/datasets">Kaggle</a>. <a href="https://www.nlm.nih.gov/research/umls/index.html">NLM UMLS</a>. <a href="https://loinc.org/downloads/">LOINC</a>. <a href="https://www.cms.gov/medicare/icd-10/2022-icd-10-cm">ICD10 Diagnosis</a>. <a href="https://icd.who.int/dev11/downloads">ICD11</a>. <a href="https://paperswithcode.com/datasets?q=medical&v=lst&o=newest">Papers,Code,Datasets for SOTA in Medicine</a>. <a href="https://paperswithcode.com/datasets?q=mental&v=lst&o=newest">Mental</a>. <a href="https://paperswithcode.com/datasets?q=behavior&v=lst&o=newest">Behavior</a>. <a href="https://www.cms.gov/medicare-coverage-database/downloads/downloads.aspx">CMS Downloads</a>. <a href="https://www.cms.gov/medicare/fraud-and-abuse/physicianselfreferral/list_of_codes">CMS CPT and HCPCS Procedures and Services</a>
138
- </div>
139
-
140
- # Azure Development Architectures in 2023:
141
- 1. ChatGPT: https://azure.github.io/awesome-azd/?tags=chatgpt
142
- 2. Azure OpenAI Services: https://azure.github.io/awesome-azd/?tags=openai
143
- 3. Python: https://azure.github.io/awesome-azd/?tags=python
144
- 4. AI LLM Architecture - Guidance by MS: https://github.com/microsoft/guidance
145
-
146
- # Dockerfile and Azure ACR->ACA Easy Robust Deploys from VSCode:
147
- 1. Set up VSCode with Azure and Remote extensions and install Azure CLI locally
148
- 2. Get access to Azure subscriptions. From there in VSCode, expand to Container Apps
149
- 3. In Container Apps, create a new app and pick the Dockerfile to deploy to an ACR, then spin up the ACA using Azure to build.
150
-
151
- # Dockerfile for Streamlit and Dockerfile for FastAPI:
152
- Show two examples.
153
-
154
- # Example Starter Prompts for AIPP:
155
- Write a streamlit program that demonstrates Data synthesis.
156
- Synthesize data from multiple sources to create new datasets.
157
- Use two datasets and demonstrate pandas dataframe query merge and join
158
- with two datasets in python list dictionaries:
159
- List of Hospitals that are over 1000 bed count by city and state, and
160
- State population size and square miles.
161
- Perform a calculated function on the merged dataset.
162
-
163
-
164
-
165
- ### Comparison of Large Language Models
166
- | Model Name | Model Size (in Parameters) |
167
- | ----------------- | -------------------------- |
 
 
 
168
  | BigScience-tr11-176B | 176 billion |
169
- | GPT-3 | 175 billion |
170
- | OpenAI's DALL-E 2.0 | 500 million |
171
- | NVIDIA's Megatron | 8.3 billion |
172
- | Transformer-XL | 250 million |
173
- | XLNet | 210 million |
174
-
175
- ## ChatGPT Datasets 📚
176
- - WebText
177
- - Common Crawl
178
- - BooksCorpus
179
- - English Wikipedia
180
- - Toronto Books Corpus
181
- - OpenWebText
182
- -
183
- ## ChatGPT Datasets - Details 📚
184
- - **WebText:** A dataset of web pages crawled from domains on the Alexa top 5,000 list. This dataset was used to pretrain GPT-2.
185
- - [WebText: A Large-Scale Unsupervised Text Corpus by Radford et al.](https://paperswithcode.com/dataset/webtext)
186
- - **Common Crawl:** A dataset of web pages from a variety of domains, which is updated regularly. This dataset was used to pretrain GPT-3.
187
- - [Language Models are Few-Shot Learners](https://paperswithcode.com/dataset/common-crawl) by Brown et al.
188
- - **BooksCorpus:** A dataset of over 11,000 books from a variety of genres.
189
- - [Scalable Methods for 8 Billion Token Language Modeling](https://paperswithcode.com/dataset/bookcorpus) by Zhu et al.
190
- - **English Wikipedia:** A dump of the English-language Wikipedia as of 2018, with articles from 2001-2017.
191
- - [Improving Language Understanding by Generative Pre-Training](https://huggingface.co/spaces/awacke1/WikipediaUltimateAISearch?logs=build) Space for Wikipedia Search
192
- - **Toronto Books Corpus:** A dataset of over 7,000 books from a variety of genres, collected by the University of Toronto.
193
- - [Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond](https://paperswithcode.com/dataset/bookcorpus) by Schwenk and Douze.
194
- - **OpenWebText:** A dataset of web pages that were filtered to remove content that was likely to be low-quality or spammy. This dataset was used to pretrain GPT-3.
195
- - [Language Models are Few-Shot Learners](https://paperswithcode.com/dataset/openwebtext) by Brown et al.
196
-
197
- ## Big Science Model 🚀
198
- - 📜 Papers:
199
- 1. BLOOM: A 176B-Parameter Open-Access Multilingual Language Model [Paper](https://arxiv.org/abs/2211.05100)
200
- 2. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism [Paper](https://arxiv.org/abs/1909.08053)
201
- 3. 8-bit Optimizers via Block-wise Quantization [Paper](https://arxiv.org/abs/2110.02861)
202
- 4. Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation [Paper](https://arxiv.org/abs/2108.12409)
203
- 5. [Other papers related to Big Science](https://huggingface.co/models?other=doi:10.57967/hf/0003)
204
- 6. [217 other models optimized for use with Bloom](https://huggingface.co/models?other=bloom)
205
-
206
- - 📚 Datasets:
207
-
208
- **Datasets:**
209
- 1. - **Universal Dependencies:** A collection of annotated corpora for natural language processing in a range of languages, with a focus on dependency parsing.
210
- - [Universal Dependencies official website.](https://universaldependencies.org/)
211
- 2. - **WMT 2014:** The fourth edition of the Workshop on Statistical Machine Translation, featuring shared tasks on translating between English and various other languages.
212
- - [WMT14 website.](http://www.statmt.org/wmt14/)
213
- 3. - **The Pile:** An English language corpus of diverse text, sourced from various places on the internet.
214
- - [The Pile official website.](https://pile.eleuther.ai/)
215
- 4. - **HumanEval:** A dataset of English sentences, annotated with human judgments on a range of linguistic qualities.
216
- - [HumanEval: An Evaluation Benchmark for Language Understanding](https://github.com/google-research-datasets/humaneval) by Gabriel Ilharco, Daniel Loureiro, Pedro Rodriguez, and Afonso Mendes.
217
- 5. - **FLORES-101:** A dataset of parallel sentences in 101 languages, designed for multilingual machine translation.
218
- - [FLORES-101: A Massively Multilingual Parallel Corpus for Language Understanding](https://flores101.opennmt.net/) by Aman Madaan, Shruti Rijhwani, Raghav Gupta, and Mitesh M. Khapra.
219
- 6. - **CrowS-Pairs:** A dataset of sentence pairs, designed for evaluating the plausibility of generated text.
220
- - [CrowS-Pairs: A Challenge Dataset for Plausible Plausibility Judgments](https://github.com/stanford-cogsci/crows-pairs) by Andrea Madotto, Zhaojiang Lin, Chien-Sheng Wu, Pascale Fung, and Caiming Xiong.
221
- 7. - **WikiLingua:** A dataset of parallel sentences in 75 languages, sourced from Wikipedia.
222
- - [WikiLingua: A New Benchmark Dataset for Cross-Lingual Wikification](https://arxiv.org/abs/2105.08031) by Jiarui Yao, Yanqiao Zhu, Ruihan Bao, Guosheng Lin, Lidong Bing, and Bei Shi.
223
- 8. - **MTEB:** A dataset of English sentences, annotated with their entailment relationships with respect to other sentences.
224
- - [Multi-Task Evaluation Benchmark for Natural Language Inference](https://github.com/google-research-datasets/mteb) by Michał Lukasik, Marcin Junczys-Dowmunt, and Houda Bouamor.
225
- 9. - **xP3:** A dataset of English sentences, annotated with their paraphrase relationships with respect to other sentences.
226
- - [xP3: A Large-Scale Evaluation Benchmark for Paraphrase Identification in Context](https://github.com/nyu-dl/xp3) by Aniket Didolkar, James Mayfield, Markus Saers, and Jason Baldridge.
227
- 10. - **DiaBLa:** A dataset of English dialogue, annotated with dialogue acts.
228
- - [A Large-Scale Corpus for Conversation Disentanglement](https://github.com/HLTCHKUST/DiaBLA) by Samuel Broscheit, António Branco, and André F. T. Martins.
229
-
230
- - 📚 Dataset Papers with Code
231
- 1. [Universal Dependencies](https://paperswithcode.com/dataset/universal-dependencies)
232
- 2. [WMT 2014](https://paperswithcode.com/dataset/wmt-2014)
233
- 3. [The Pile](https://paperswithcode.com/dataset/the-pile)
234
- 4. [HumanEval](https://paperswithcode.com/dataset/humaneval)
235
- 5. [FLORES-101](https://paperswithcode.com/dataset/flores-101)
236
- 6. [CrowS-Pairs](https://paperswithcode.com/dataset/crows-pairs)
237
- 7. [WikiLingua](https://paperswithcode.com/dataset/wikilingua)
238
- 8. [MTEB](https://paperswithcode.com/dataset/mteb)
239
- 9. [xP3](https://paperswithcode.com/dataset/xp3)
240
- 10. [DiaBLa](https://paperswithcode.com/dataset/diabla)
241
-
242
- # Deep RL ML Strategy 🧠
243
- The AI strategies are:
244
- - Language Model Preparation using Human Augmented with Supervised Fine Tuning 🤖
245
- - Reward Model Training with Prompts Dataset Multi-Model Generate Data to Rank 🎁
246
- - Fine Tuning with Reinforcement Reward and Distance Distribution Regret Score 🎯
247
- - Proximal Policy Optimization Fine Tuning 🤝
248
- - Variations - Preference Model Pretraining 🤔
249
- - Use Ranking Datasets Sentiment - Thumbs Up/Down, Distribution 📊
250
- - Online Version Getting Feedback 💬
251
- - OpenAI - InstructGPT - Humans generate LM Training Text πŸ”
252
- - DeepMind - Advantage Actor Critic Sparrow, GopherCite 🦜
253
- - Reward Model Human Preference Feedback 🏆
254
-
255
- For more information on specific techniques and implementations, check out the following resources:
256
- - OpenAI's paper on [GPT-3](https://arxiv.org/abs/2005.14165) which details their Language Model Preparation approach
257
- - DeepMind's paper on [SAC](https://arxiv.org/abs/1801.01290) which describes the Advantage Actor Critic algorithm
258
- - OpenAI's paper on [Reward Learning](https://arxiv.org/abs/1810.06580) which explains their approach to training Reward Models
259
- - OpenAI's blog post on [GPT-3's fine-tuning process](https://openai.com/blog/fine-tuning-gpt-3/)
260
 
261
 
262
  ''')
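The starter prompt in the removed lines asks for two list-of-dict datasets (large hospitals by city and state, and state population with square miles) to be merged and then extended with a calculated column. A minimal sketch of that join follows. It uses plain Python dictionaries as a stand-in for the pandas merge the prompt mentions, and every record, name, and number in it is a made-up placeholder rather than real data.

```python
# Placeholder records only: hospital names, states, and figures are invented.
hospitals = [
    {"hospital": "Hospital A", "city": "City A", "state": "AA", "beds": 1200},
    {"hospital": "Hospital B", "city": "City B", "state": "BB", "beds": 1500},
    {"hospital": "Hospital C", "city": "City C", "state": "AA", "beds": 800},
]
states = [
    {"state": "AA", "population": 1_000_000, "square_miles": 50_000},
    {"state": "BB", "population": 4_000_000, "square_miles": 100_000},
]


def merge_large_hospitals(hospitals, states, min_beds=1000):
    """Inner-join hospitals over min_beds to their state row and add a density column."""
    states_by_code = {row["state"]: row for row in states}
    merged = []
    for hospital in hospitals:
        state = states_by_code.get(hospital["state"])
        if hospital["beds"] <= min_beds or state is None:
            continue  # skip small hospitals and hospitals with no matching state
        row = {**hospital, **state}
        # Calculated column: people per square mile for the hospital's state.
        row["people_per_square_mile"] = state["population"] / state["square_miles"]
        merged.append(row)
    return merged


if __name__ == "__main__":
    for row in merge_large_hospitals(hospitals, states):
        print(row)
```

Keying the state rows by their code first keeps the join to a single pass over the hospital list; with pandas the same step would typically be a merge on the `state` column.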
 
1
  import streamlit as st
2
 
3
  st.markdown('''
 
 
 
4
 
5
+ ---
6
+ title: README
7
+ emoji: πŸƒ
8
+ colorFrom: pink
9
+ colorTo: blue
10
+ sdk: static
11
+ pinned: false
12
+ ---
13
+ Welcome - This classroom organization holds examples and links for this session.
14
  Begin by adding a bookmark.
15
 
16
  # Examples and Exercises - Create These Spaces in Your Account and Test / Modify
17
 
18
  ## Easy Examples
19
+ 1. [FastSpeech](https://huggingface.co/spaces/AIZero2HeroBootcamp/FastSpeech2LinerGradioApp)
20
+ 2. [Memory](https://huggingface.co/spaces/AIZero2HeroBootcamp/Memory)
21
+ 3. [StaticHTML5PlayCanvas](https://huggingface.co/spaces/AIZero2HeroBootcamp/StaticHTML5Playcanvas)
22
+ 4. [3DHuman](https://huggingface.co/spaces/AIZero2HeroBootcamp/3DHuman)
23
+ 5. [TranscriptAILearnerFromYoutube](https://huggingface.co/spaces/AIZero2HeroBootcamp/TranscriptAILearnerFromYoutube)
24
+ 6. [AnimatedGifGallery](https://huggingface.co/spaces/AIZero2HeroBootcamp/AnimatedGifGallery)
25
+ 7. [VideoToAnimatedGif](https://huggingface.co/spaces/AIZero2HeroBootcamp/VideoToAnimatedGif)
26
 
27
  ## Hard Examples:
28
+ 8. [ChatGPTandLangChain](https://huggingface.co/spaces/AIZero2HeroBootcamp/ChatGPTandLangchain)
29
+ a. Keys: [API Keys](https://platform.openai.com/account/api-keys)
30
+ 9. [MultiPDFQAChatGPTLangchain](https://huggingface.co/spaces/AIZero2HeroBootcamp/MultiPDF-QA-ChatGPT-Langchain)
 
31
 
32
  # 👋 Two easy ways to turbo boost your AI learning journey - Let's go 100X! 💻
33
 
34
  # 🌐 AI Pair Programming with GPT
35
  ### Open 2 Browsers to:
36
+ 1. 🌐 [ChatGPT](https://chat.openai.com/chat) or [URL2](https://platform.openai.com/playground) and
37
+ 2. 🌐 [Huggingface](https://huggingface.co/awacke1) in separate browser windows.
38
  1. 🤖 Use prompts to generate a streamlit program on Huggingface or locally to test it.
39
  2. 🔧 For advanced work, add Python 3.10 and VSCode locally, and debug as gradio or streamlit apps.
40
  3. 🚀 Use these two superpower processes to reduce the time it takes you to make a new AI program! ⏱️
41
 
 
 
42
  # 🎥 YouTube University Method:
43
  1. 🏋️‍♀️ Plan two hours each weekday to exercise your body and brain.
44
  2. 🎬 Make a playlist of videos you want to learn from on YouTube. Save the links to edit later.
 
47
  5. 📝 Practice note-taking in markdown to instantly save what you want to remember. Share your notes with others!
48
  6. 👥 AI Pair Programming Using Long Answer Language Models with Human Feedback
49
 
 
50
  ## 🎥 2023 AI/ML Learning Playlists for ChatGPT, LLMs, Recent Events in AI:
51
+ 1. [AI News](https://www.youtube.com/playlist?list=PLHgX2IExbFotMOKWOErYeyHSiikf6RTeX)
52
+ 2. [ChatGPT Code Interpreter](https://www.youtube.com/playlist?list=PLHgX2IExbFou1pOQMayB7PArCalMWLfU-)
53
+ 3. [Ilya Sutskever and Sam Altman](https://www.youtube.com/playlist?list=PLHgX2IExbFovr66KW6Mqa456qyY-Vmvw-)
54
+ 4. [Andrew Huberman on Neuroscience and Health](https://www.youtube.com/playlist?list=PLHgX2IExbFotRU0jl_a0e0mdlYU-NWy1r)
55
+ 5. [Andrej Karpathy](https://www.youtube.com/playlist?list=PLHgX2IExbFovbOFCgLNw1hRutQQKrfYNP)
56
+ 6. [Medical Futurist on GPT](https://www.youtube.com/playlist?list=PLHgX2IExbFosVaCMZCZ36bYqKBYqFKHB2)
57
+ 7. [ML APIs](https://www.youtube.com/playlist?list=PLHg
58
+
59
+ - 🔗 Source Code:
60
+ 1. [BigScience (GitHub)](https://github.com/bigscience-workshop/bigscience)
61
+
62
+ ## πŸƒ GPT-3 Performance:
63
+
64
+ - GPT-3, while less performant than BigScience, has found widespread use due to its availability through the OpenAI API, making it easier for developers to incorporate the model into their applications without requiring substantial computational resources.
65
+ - While the GPT-3 model has 175 billion parameters, its performance is considered slightly less than the newer BigScience model. However, the specific performance of each model can vary depending on the task.
66
+
67
+ ## DALL-E 2.0 Overview 🎨
68
+
69
+ - DALL-E 2.0 is an AI model developed by OpenAI that generates images from textual descriptions.
70
+ - It has 500 million parameters and uses a dataset curated by OpenAI, consisting of a diverse range of images from the internet.
71
+
72
+ ## NVIDIA's Megatron Overview 💡
73
+
74
+ - Megatron is a large-scale transformer model developed by NVIDIA. It's primarily designed for tasks that require understanding the context of large pieces of text.
75
+ - It has 8.3 billion parameters and is trained on a variety of text data from the internet.
76
+
77
+ ## Transformer-XL Overview ⚑️
78
+
79
+ - Transformer-XL is an AI model developed by Google Brain, which introduces a novel recurrence mechanism and relative positional encoding scheme.
80
+ - It has 250 million parameters and uses a variety of datasets for training, including BooksCorpus and English Wikipedia.
81
+
82
+ ## XLNet Overview 🌐
83
+
84
+ - XLNet is a generalized autoregressive model that outperforms BERT on several benchmarks.
85
+ - It has 210 million parameters and uses a variety of datasets for training, including BooksCorpus and English Wikipedia.
86
+
87
+ <h1><center>📊AI Model Comparison📉</center></h1>
88
+
89
+ | Model Name | Model Size (in Parameters) | Model Overview |
90
+ | --- | --- | --- |
91
+ | BigScience-tr11-176B | 176 billion | BigScience is the latest AI model developed by the Big Science Workshop. It has 176 billion parameters and uses a combination of text data from the internet and scientific literature for training. |
92
+ | GPT-3 | 175 billion | GPT-3 is an AI model developed by OpenAI, which has 175 billion parameters and uses a variety of datasets for training, including Common Crawl, BooksCorpus, and English Wikipedia. |
93
+ | OpenAI's DALL-E 2.0 | 500 million | DALL-E 2.0 is an AI model developed by OpenAI that generates images from textual descriptions. It has 500 million parameters and uses a dataset curated by OpenAI. |
94
+ | NVIDIA's Megatron | 8.3 billion | Megatron is a large-scale transformer model developed by NVIDIA. It's primarily designed for tasks that require understanding the context of large pieces of text. |
95
+ | Transformer-XL | 250 million | Transformer-XL is an AI model developed by Google Brain, which introduces a novel recurrence mechanism and relative positional encoding scheme. |
96
+ | XLNet | 210 million | XLNet is a generalized autoregressive model that outperforms BERT on several benchmarks. |
97
+
98
+ ## References:
99
+
100
+ 1. [BigScience - A 176B-Parameter Open-Access Multilingual Language Model](https://arxiv.org/abs/2211.05100)
101
+ 2. [GPT-3 - Language Models are Few-Shot Learners](https://arxiv.org/abs/2005.14165)
102
+ 3. [DALL-E 2.0 - Generative Pretraining from Pixels](https://openai.com/research/dall-e/)
103
+ 4. [Megatron - Training Multi-Billion Parameter Language Models Using GPU Model Parallelism](https://arxiv.org/abs/1909.08053)
104
+ 5. [Transformer-XL - Transformers with Longer-Range Dependencies](https://arxiv.org/abs/1901.02860)
105
+ 6. [XLNet - Generalized Autoregressive Pretraining for Language Understanding](https://arxiv.org/abs/1906.08237)
106
+
107
+
108
+ | Model Name | Model Size (in Parameters) |
109
+ | --- | --- |
110
  | BigScience-tr11-176B | 176 billion |
111
+ | GPT-3 | 175 billion |
112
+ | OpenAI's DALL-E 2.0 | 500 million |
113
+ | NVIDIA's Megatron | 8.3 billion |
114
+ | Transformer-XL | 250 million |
115
+ | XLNet | 210 million |
116
+
117
+
118
+ | Model Name | Model Size (in Parameters) | Model Overview |
119
+ | --- | --- | --- |
120
+ | BigScience-tr11-176B | 176 billion | Uses a combination of text data from the internet and scientific literature for training. |
121
+ | GPT-3 | 175 billion | Uses a variety of datasets for training, including Common Crawl,
122
+
123
+
124
+
125
+
126
 
127
 
128
  ''')
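The new version of app.py pastes the model comparison table into the markdown string more than once. One possible way to keep a single copy is to hold the rows in a list and render the table from it. The sketch below assumes that approach; `MODELS`, `to_markdown_table`, and `MODEL_TABLE` are hypothetical names that do not appear in the commit, and the parameter counts are copied from the diff as stated there, not independently verified.

```python
# Rows copied from the comparison table in the diff; sizes are as stated there.
MODELS = [
    ("BigScience-tr11-176B", "176 billion"),
    ("GPT-3", "175 billion"),
    ("OpenAI's DALL-E 2.0", "500 million"),
    ("NVIDIA's Megatron", "8.3 billion"),
    ("Transformer-XL", "250 million"),
    ("XLNet", "210 million"),
]


def to_markdown_table(headers, rows):
    """Render headers and rows as a pipe-delimited markdown table."""
    lines = ["| " + " | ".join(headers) + " |"]
    lines.append("| " + " | ".join("---" for _ in headers) + " |")
    for row in rows:
        lines.append("| " + " | ".join(row) + " |")
    return "\n".join(lines)


MODEL_TABLE = to_markdown_table(["Model Name", "Model Size (in Parameters)"], MODELS)

if __name__ == "__main__":
    print(MODEL_TABLE)
```

In the app itself the resulting string could be passed to `st.markdown(MODEL_TABLE)`, so that adding or correcting a model means editing one row.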