A small lm. (Russian only) Created to emulate a really simple one way dialogue; WARNING!!! CAN SWEAR! It was trained on two T4s from scratch. Final training time: 1 hour 2 minutes. The model consists of 3 transformer blocks stacked forming 6 layers.
A small lm. (Russian only) Created to emulate a really simple one way dialogue; WARNING!!! CAN SWEAR! It was trained on two T4s from scratch. Final training time: 1 hour 2 minutes. The model consists of 3 transformer blocks stacked forming 6 layers.