Failed to deploy on Sagemaker

#2
by arisin - opened

Using the sample SageMaker code provided, I got the following exception when deploying on a SageMaker g5 instance.

warmup: text_generation_client: router/client/src/lib.rs:33: Server error: gemm_half_q_half(): incompatible function arguments. The following argument types are supported:
    1. (arg0: torch.Tensor, arg1: int, arg2: torch.Tensor, arg3: bool) -> None

Please advise on what is wrong.
Thank you.
