How was this model trained?
#1, opened by BramVanroy
I'd love to play around with a smaller version of mbart locally for debugging, so this tiny mbart sounds promising! Can you give more details about how this was trained/distilled? Data used, hyperparameters, etc.
Thanks!
At least going by the demo, it doesn't seem promising.
I think I read somewhere that the model was just randomly initialized and not trained at all, but I don't remember whether that happened in a dream or in real life.
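For what it's worth, if the model really is just randomly initialized, something comparable is easy to build locally for debugging. A minimal sketch, assuming `transformers` and PyTorch are installed. The sizes below are arbitrary placeholders picked to keep the model small, not the actual configuration of this model:

```python
from transformers import MBartConfig, MBartForConditionalGeneration

# Hypothetical tiny sizes, chosen only to keep the model small.
# They are NOT taken from this repository's config.
config = MBartConfig(
    vocab_size=1000,
    d_model=16,
    encoder_layers=2,
    decoder_layers=2,
    encoder_attention_heads=4,
    decoder_attention_heads=4,
    encoder_ffn_dim=32,
    decoder_ffn_dim=32,
    max_position_embeddings=128,
)

# Building the model from a config (instead of from_pretrained) gives
# randomly initialized weights: no training or distillation is involved.
model = MBartForConditionalGeneration(config)

num_params = sum(p.numel() for p in model.parameters())
print(f"parameters: {num_params}")
```

A model like this only produces garbage output, which would match the demo, but it is enough to test pipelines and training loops. To pair it with a real mBART tokenizer, `vocab_size` would have to be at least as large as the tokenizer's vocabulary.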