Text Generation
Transformers
ONNX
English
gpt_neox

So apparently this is CPU-only the way you have it set up.

#1
by tmaggenti - opened

I could not believe how long this took to answer a simple question. I finally realized it was using my CPU, not my GPU. I have been running the regular Dolly v2 7B, which is pretty fast and takes a couple of seconds to answer a question. I thought this would be even faster... Ha ha ha, not so much.

You need to update this card to cover GPU use so people do not have to wait a week for an answer to a simple question!
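
A quick way to confirm which device is actually available is to list the execution providers ONNX Runtime reports. This is only a sketch, and it assumes the model is loaded through the `onnxruntime` Python package, which the card does not state. `pick_providers` is a hypothetical helper written for this example, not part of any library.

```python
# Sketch: check whether ONNX Runtime can see a CUDA GPU.
# Assumption: the model is run via the onnxruntime Python package.

def pick_providers(available):
    """Return providers in preference order (GPU first), keeping only available ones.

    Hypothetical helper for illustration; not an onnxruntime API.
    """
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return [p for p in preferred if p in available]


try:
    import onnxruntime as ort

    # Lists the providers compiled into the installed onnxruntime build.
    available = ort.get_available_providers()
except ImportError:
    # onnxruntime is not installed; fall back to CPU so the sketch still runs.
    available = ["CPUExecutionProvider"]

print("Available providers:", available)
print("Providers to request:", pick_providers(available))
```

If `CUDAExecutionProvider` is missing from the list, the CPU-only `onnxruntime` package is probably the one installed; the GPU build is published separately as `onnxruntime-gpu`. The resulting list can be passed as the `providers` argument when creating an `onnxruntime.InferenceSession`.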
