Failed to deploy on SageMaker
#2, opened by arisin
Using the provided sample SageMaker code, I got the following exception when deploying to a SageMaker g5 instance:
warmup: text_generation_client: router/client/src/lib.rs:33: Server error: gemm_half_q_half(): incompatible function arguments. The following argument types are supported:
1. (arg0: torch.Tensor, arg1: int, arg2: torch.Tensor, arg3: bool) -> None
Please advise on what is wrong.
Thank you.