Update README.md
README.md (CHANGED)
@@ -11,6 +11,7 @@ tags:
 license: cc-by-4.0
 
 ---
+# Megatron-GPT 1.3B
 
 <style>
 img {
@@ -21,8 +22,6 @@ img {
 |[![Model architecture](https://img.shields.io/badge/Model%20Arch-Transformer%20Decoder-green)](#model-architecture)|[![Model size](https://img.shields.io/badge/Params-1.3B-green)](#model-architecture)|[![Language](https://img.shields.io/badge/Language-en--US-lightgrey#model-badge)](#datasets)
 
 
-# Megatron-GPT 1.3B
-
 ## Model Description
 
 Megatron-GPT 1.3B is a transformer-based language model. GPT refers to a class of transformer decoder-only models similar to GPT-2 and GPT-3, while 1.3B refers to the total trainable parameter count (1.3 billion) [1, 2].
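The "1.3B" in the moved heading, the parameter badge, and the Model Description all refer to the same figure: total trainable parameters. As a quick sanity check, that number can be roughly reproduced from a GPT-3 1.3B-style decoder configuration; the hyperparameters below (24 layers, hidden size 2048, vocabulary ~51,200, 2048 positions) are illustrative assumptions, not values stated anywhere in this commit.

```python
# Rough parameter-count estimate for a GPT-style decoder-only transformer.
# All hyperparameter values here are assumptions (GPT-3 "1.3B"-class sizes),
# not taken from this model card.

def gpt_param_count(n_layers: int, d_model: int, vocab: int, n_pos: int) -> int:
    """Approximate trainable parameters of a decoder-only transformer."""
    embeddings = vocab * d_model + n_pos * d_model          # token + position tables
    attn = 4 * d_model * d_model + 4 * d_model              # QKV + output proj, with biases
    mlp = 8 * d_model * d_model + 5 * d_model               # two linears through a 4x-wide hidden
    norms = 2 * (2 * d_model)                               # two LayerNorms per block
    per_layer = attn + mlp + norms
    return embeddings + n_layers * per_layer + 2 * d_model  # plus final LayerNorm

print(f"{gpt_param_count(24, 2048, 51200, 2048) / 1e9:.2f}B")  # ~1.32B
```

With those assumed values the estimate comes to roughly 1.32 billion, consistent with the 1.3B figure the README reports.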