About Temporal Positional Encoding

#4
by ricar0 - opened

Is there an error in this code?
Maybe we should change pos_angle[:, 1::2] = torch.cos(pos_angle[:, 0::2]) to pos_angle[:, 1::2] = torch.cos(pos_angle[:, 1::2])?

    def get_angle(self, position):
        pos_angle = self.angle.reshape(1, -1).to(position.device) * position.reshape(-1, 1)
        pos_angle[:, 0::2] = torch.sin(pos_angle[:, 0::2])
        pos_angle[:, 1::2] = torch.cos(pos_angle[:, 0::2])
        pos_angle = pos_angle.unsqueeze(1)
        return pos_angle
ricar0 changed discussion status to closed
ricar0 changed discussion status to open
KangarooGroup org
edited Sep 19, 2024

This is a typo. However, the result of pos_angle is still correct, because each even position of hidden_dim and the odd position that follows it share the same angle.

modeling_kangaroo.py#L1080
self.angle = torch.stack([1 / torch.pow(torch.tensor(10000), torch.tensor(2 * (hid_j // 2) / hidden_dim)) for hid_j in range(hidden_dim)])
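For readers who want to check the claim about paired angles, here is a minimal plain-Python sketch of the same formula. The helper name `pair_angles` and the choice of `hidden_dim=8` are illustrative assumptions, not code from modeling_kangaroo.py, and plain floats stand in for the torch tensors.

```python
def pair_angles(hidden_dim, base=10000.0):
    # Hypothetical helper: j // 2 maps each (even, odd) index pair to the
    # same exponent, so both slots of a pair get the same frequency.
    return [1.0 / base ** (2 * (j // 2) / hidden_dim) for j in range(hidden_dim)]

angles = pair_angles(8)

# Compare every even slot with the odd slot right after it.
pairs_equal = all(angles[j] == angles[j + 1] for j in range(0, 8, 2))
print(angles)
print("even/odd pairs share an angle:", pairs_equal)
```

This only shows that the raw angles match within each pair; it says nothing about what happens once one of the two slots is overwritten in place.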

I know that. But by the time pos_angle[:, 1::2] is calculated, pos_angle[:, 0::2] has already been overwritten by pos_angle[:, 0::2] = torch.sin(pos_angle[:, 0::2]), so the cosine is applied to the sine values rather than to the original angles.
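The difference can be reproduced without the model. The following is a rough sketch in plain Python, with lists standing in for tensor slices; the function names, `hidden_dim=4`, and the position value `3.0` are assumptions chosen for illustration, not taken from the repository.

```python
import math

def make_angles(hidden_dim, base=10000.0):
    # Same frequency for each (even, odd) pair, as in the thread's self.angle.
    return [1.0 / base ** (2 * (j // 2) / hidden_dim) for j in range(hidden_dim)]

def get_angle_as_posted(position, angles):
    row = [a * position for a in angles]
    row[0::2] = [math.sin(x) for x in row[0::2]]
    # The even slots now hold sine values, so this computes cos(sin(angle)).
    row[1::2] = [math.cos(x) for x in row[0::2]]
    return row

def get_angle_suggested(position, angles):
    row = [a * position for a in angles]
    row[0::2] = [math.sin(x) for x in row[0::2]]
    # The odd slots still hold the raw angles, so this computes cos(angle).
    row[1::2] = [math.cos(x) for x in row[1::2]]
    return row

angles = make_angles(hidden_dim=4)
posted = get_angle_as_posted(3.0, angles)
suggested = get_angle_suggested(3.0, angles)

print("as posted:", posted)
print("suggested:", suggested)
```

In this sketch the even (sine) slots agree between the two versions, while the odd slots differ: the posted version yields cos(sin(angle)) and the suggested one yields cos(angle).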
