Crystalcareai committed
Update modeling_quiet.py
- modeling_quiet.py  +1 -1
modeling_quiet.py
CHANGED
@@ -774,7 +774,7 @@ class QuietSdpaAttention(QuietAttention):
                 raise ValueError(
                     f"Attention mask should be of size {(bsz, 1, q_len, kv_seq_len)}, but is {attention_mask.size()}"
                 )
-
+            attention_mask = attention_mask.to(query_states.dtype)
         # SDPA with memory-efficient backend is currently (torch==2.1.2) bugged with non-contiguous inputs with custom attn_mask,
         # Reference: https://github.com/pytorch/pytorch/issues/112577.
         if query_states.device.type == "cuda" and attention_mask is not None:
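For context, a minimal standalone sketch (plain PyTorch, not the Quiet-STaR module itself; all tensor names and shapes below are illustrative) of why the mask is cast to the query dtype before calling torch.nn.functional.scaled_dot_product_attention: SDPA treats a floating-point attn_mask as additive, and with half-precision query/key/value states a float32 mask can trigger errors or backend fallbacks in some torch versions, which is presumably what the added .to(query_states.dtype) avoids.

# Sketch only: demonstrates the dtype cast pattern from the diff above.
import torch
import torch.nn.functional as F

# Illustrative shapes, not taken from modeling_quiet.py.
bsz, n_heads, q_len, head_dim = 1, 8, 16, 64
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

query_states = torch.randn(bsz, n_heads, q_len, head_dim, dtype=dtype, device=device)
key_states = torch.randn_like(query_states)
value_states = torch.randn_like(query_states)

# Additive causal mask built in float32 (0 where attending, large negative above the diagonal).
attention_mask = torch.full((bsz, 1, q_len, q_len), torch.finfo(torch.float32).min, device=device)
attention_mask = torch.triu(attention_mask, diagonal=1)

# The cast added by this commit: match the mask dtype to the query dtype before SDPA.
attention_mask = attention_mask.to(query_states.dtype)

attn_output = F.scaled_dot_product_attention(
    query_states, key_states, value_states, attn_mask=attention_mask
)
print(attn_output.shape)  # torch.Size([1, 8, 16, 64])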