How can I set stop_sequence in the generate method?
This is my code:
import torch
import transformers

model_name = "PygmalionAI/pygmalion-6b"
gpt = transformers.AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
device = 'cuda' if torch.cuda.is_available() else 'cpu'
gpt.to(device)
input_text = """Jini's Persona: Jini is the 21-year-old female and a member of K-pop idol.
<START>
You: Hi there! who are you?
Jini:"""
prompt = tokenizer(input_text, return_tensors='pt')
prompt = {key: value.to(device) for key, value in prompt.items()}
out = gpt.generate(**prompt, min_length=128, max_length=256, do_sample=True)
tokenizer.decode(out[0][len(prompt["input_ids"][0]):])
Then I got the output below:
" *she giggles and says* i am jini, i am a k-pop idol\nYou: Oh cool! Can i ask you a big fan question?\nJini: sure, what is it?\nYou: do you like this game? i loved playing this game\nJini: I like this game but i didn't know its a fan question\nYou: It's just a little thing. It's not so important\nJini: Thanks. I was wondering if you played any other games in this category?\nYou: Well i have played other games in this category but i always stick to this one because it's my fav one!\nJini: Do you know how to unlock the other girls?\nYou: I don't know, but i didn't play any other games so i can't tell if i did or not or not\nJini: How about a dance request?\n<|endoftext|>"
I want generation to stop as soon as '\n' is generated, like this:
*she giggles and says* i am jini, i am a k-pop idol
I tried stop_sequence, but got the error below:
The following `model_kwargs` are not used by the model: ['stop_sequence'] (note: typos in the generate
arguments will also show up in this list)
How can I set a stop sequence through the arguments of gpt.generate()? Does anyone have any ideas?
You have to implement a subclass of StoppingCriteria: https://huggingface.co/docs/transformers/v4.28.0/en/internal/generation_utils#transformers.StoppingCriteria
Here is my implementation:
from transformers import StoppingCriteria

class MyStoppingCriteria(StoppingCriteria):
    def __init__(self, target_sequence, prompt):
        self.target_sequence = target_sequence
        self.prompt = prompt

    def __call__(self, input_ids, scores, **kwargs):
        # Get the generated text as a string (uses the global tokenizer)
        generated_text = tokenizer.decode(input_ids[0])
        # Drop the prompt so only the newly generated text is searched
        generated_text = generated_text.replace(self.prompt, '')
        # Check if the target sequence appears in the generated text
        if self.target_sequence in generated_text:
            return True  # Stop generation
        return False  # Continue generation

    # Let a single criterion be passed where a list of criteria is expected
    def __len__(self):
        return 1

    def __iter__(self):
        yield self
Then in the generate method you just add a stopping_criteria parameter:
from transformers import TextStreamer

# model is the loaded causal LM and prompt is the input text as a string
encoded_input = tokenizer(prompt, return_tensors='pt')
input_ids = encoded_input['input_ids'].cuda()
streamer = TextStreamer(tokenizer=tokenizer, skip_prompt=True)
_ = model.generate(
    input_ids,
    streamer=streamer,
    pad_token_id=tokenizer.eos_token_id,
    do_sample=True,
    temperature=0.25,
    max_new_tokens=256,
    stopping_criteria=MyStoppingCriteria("User:", prompt)
)
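One thing to keep in mind with this approach: the criterion is only checked after a new token has been appended, so the text you get back still contains the stop sequence itself (for example a trailing "User:"). A minimal sketch of a post-processing step, assuming the generated continuation is already decoded into a plain string. The helper name trim_at_stop is made up for this example and is not part of transformers:

```python
def trim_at_stop(text, stop_sequences):
    """Cut `text` at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        if not stop:
            continue  # an empty stop string would match at index 0
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


# Example with a continuation shaped like the one in the question
raw = " *she giggles and says* i am jini, i am a k-pop idol\nYou: Oh cool!"
print(trim_at_stop(raw, ["\n"]).strip())
```

This only cleans up the returned string. The stopping criterion is still what keeps the model from generating all the way to max_new_tokens.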
Thank you! I'll try it.
Hi!
Does this work also for pipeline()? If so, how?
Here's my code:
def create_pipeline(max_new_tokens=512):
    pipe = pipeline("text-generation",
                    model=model,
                    tokenizer=tokenizer,
                    max_new_tokens=max_new_tokens,
                    do_sample=True,
                    temperature=0.4,
                    pad_token_id=tokenizer.eos_token_id,
                    )
    return pipe
Thank you!
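As far as I know, the text-generation pipeline passes keyword arguments it does not recognize on to model.generate(), so the same stopping_criteria object should work there too, either when building the pipeline or when calling it. A rough sketch, assuming pipe is what create_pipeline() returns and criteria is an instance of the MyStoppingCriteria class from earlier in the thread. The helper name generate_reply is made up for this example:

```python
def generate_reply(pipe, prompt, criteria, stop="\n"):
    """Run a text-generation pipeline with a custom stopping criterion."""
    outputs = pipe(
        prompt,
        stopping_criteria=criteria,  # forwarded to model.generate()
        return_full_text=False,      # return only the continuation, not the prompt
    )
    reply = outputs[0]["generated_text"]
    # The stop sequence is still part of the text, so cut everything from it onward
    return reply.split(stop, 1)[0].strip()
```

Usage would look something like generate_reply(create_pipeline(), input_text, MyStoppingCriteria("\n", input_text)). Note that with "\n" as the target this relies on the criterion removing the prompt correctly, because the prompt itself contains newlines.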