How can I set stop_sequence in the generate method?

#25
by jini1114 - opened

This is my code:

import torch
import transformers

model_name = "PygmalionAI/pygmalion-6b"
gpt = transformers.AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)

device = 'cuda' if torch.cuda.is_available() else 'cpu'
gpt.to(device)

input_text = """Jini's Persona: Jini is the 21-year-old female and a member of K-pop idol.
<START>
You: Hi there! who are you?
Jini:"""

prompt = tokenizer(input_text, return_tensors='pt')
prompt = {key: value.to(device) for key, value in prompt.items()}
out = gpt.generate(**prompt, min_length=128, max_length=256, do_sample=True)
tokenizer.decode(out[0][len(prompt["input_ids"][0]):])

Then I got the output below:

" *she giggles and says* i am jini, i am a k-pop idol\nYou: Oh cool! Can i ask you a big fan question?\nJini: sure, what is it?\nYou: do you like this game? i loved playing this game\nJini: I like this game but i didn't know its a fan question\nYou: It's just a little thing. It's not so important\nJini: Thanks. I was wondering if you played any other games in this category?\nYou: Well i have played other games in this category but i always stick to this one because it's my fav one!\nJini: Do you know how to unlock the other girls?\nYou: I don't know, but i didn't play any other games so i can't tell if i did or not or not\nJini: How about a dance request?\n<|endoftext|>"

I want generation to stop as soon as '\n' is generated, like this:

 *she giggles and says* i am jini, i am a k-pop idol

I tried stop_sequence, but got the error below:

The following `model_kwargs` are not used by the model: ['stop_sequence'] (note: typos in the generate 
arguments will also show up in this list)

How can I set a stop sequence through the arguments of gpt.generate()?
Does anyone have any ideas?

You have to implement a subclass of StoppingCriteria: https://huggingface.co/docs/transformers/v4.28.0/en/internal/generation_utils#transformers.StoppingCriteria

Here is my implementation:

from transformers import StoppingCriteria

class MyStoppingCriteria(StoppingCriteria):
    def __init__(self, target_sequence, prompt):
        self.target_sequence = target_sequence
        self.prompt = prompt

    def __call__(self, input_ids, scores, **kwargs):
        # Decode the full sequence so far (relies on a module-level `tokenizer`)
        generated_text = tokenizer.decode(input_ids[0])
        # Drop the prompt so only the newly generated text is checked
        generated_text = generated_text.replace(self.prompt, '')
        # Check if the target sequence appears in the generated text
        if self.target_sequence in generated_text:
            return True  # Stop generation

        return False  # Continue generation

    # __len__ and __iter__ let a single instance be passed where
    # generate() expects a list of stopping criteria
    def __len__(self):
        return 1

    def __iter__(self):
        yield self
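A side note on the class above: it decodes the whole sequence at every step. One possible alternative is to compare token ids directly. The sketch below shows only the suffix check, on plain lists; `ends_with_stop` is a hypothetical helper, and the ids are made up for illustration. It assumes the stop string always tokenizes to the same id sequence, which is not guaranteed with BPE tokenizers (a newline can be merged into a neighbouring token).

```python
from typing import Sequence


def ends_with_stop(generated_ids: Sequence[int], stop_ids: Sequence[int]) -> bool:
    """Return True when the tail of generated_ids equals stop_ids."""
    n = len(stop_ids)
    if n == 0 or len(generated_ids) < n:
        return False
    return list(generated_ids[-n:]) == list(stop_ids)


# Hypothetical token ids, for illustration only.
print(ends_with_stop([5, 9, 198], [198]))     # True
print(ends_with_stop([5, 9, 198, 7], [198]))  # False
```

Inside `__call__` this would presumably be fed the generated part of `input_ids[0]` (converted with `.tolist()`) and the ids from `tokenizer.encode(stop, add_special_tokens=False)`.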

Then, in the generate call, you just add a stopping_criteria parameter:

from transformers import TextStreamer

encoded_input = tokenizer(prompt, return_tensors='pt')
input_ids = encoded_input['input_ids'].cuda()
streamer = TextStreamer(tokenizer=tokenizer, skip_prompt=True)
_ = model.generate(
    input_ids,
    streamer=streamer,
    pad_token_id=tokenizer.eos_token_id,
    do_sample=True,
    temperature=0.25,
    max_new_tokens=256,
    stopping_criteria=MyStoppingCriteria("User:", prompt)
)
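Keep in mind that a criterion like this can only fire after the stop string has already been generated, so the decoded output will most likely still contain it. A minimal sketch of trimming the decoded text afterwards; `truncate_at_stop` is a hypothetical helper, not part of transformers, and the sample string is shortened from the output in the question.

```python
from typing import Iterable


def truncate_at_stop(text: str, stop_sequences: Iterable[str]) -> str:
    """Cut text at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        if not stop:
            continue  # an empty stop string would match at index 0
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


sample = " *she giggles and says* i am jini, i am a k-pop idol\nYou: Oh cool!"
print(truncate_at_stop(sample, ["\n", "<|endoftext|>"]))
```

The same helper can be applied to the result of `tokenizer.decode(...)` whether or not a stopping criterion was used.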

Thank you!
I'll try it.

Hi!

Does this also work for pipeline()? If so, how?

Here's my code:

from transformers import pipeline

# `model` and `tokenizer` are loaded elsewhere
def create_pipeline(max_new_tokens=512):
    pipe = pipeline(
        "text-generation",
        model=model,
        tokenizer=tokenizer,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.4,
        pad_token_id=tokenizer.eos_token_id,
    )
    return pipe
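One approach that does not depend on how the pipeline handles stopping criteria is to trim its output. The sketch below assumes the pipeline returns a list of dicts with a "generated_text" key and accepts return_full_text=False, as the text-generation pipeline does; `generate_until` and `fake_pipe` are hypothetical names, and `fake_pipe` only stands in for a real pipeline so the example runs without a model. Extra keyword arguments to a text-generation pipeline call are generally forwarded to generate(), so passing stopping_criteria through `generate_kwargs` may also work, but that is not verified here.

```python
from typing import Any, Callable, Dict, List


def generate_until(
    pipe: Callable[..., List[Dict[str, Any]]],
    prompt: str,
    stop: str,
    **generate_kwargs: Any,
) -> str:
    """Run the pipeline and keep only the text before the stop string."""
    # return_full_text=False asks for the continuation without the prompt
    outputs = pipe(prompt, return_full_text=False, **generate_kwargs)
    text = outputs[0]["generated_text"]
    if not stop:
        return text
    head, _, _ = text.partition(stop)
    return head


def fake_pipe(prompt: str, **kwargs: Any) -> List[Dict[str, Any]]:
    # Stand-in for a real pipeline; returns a canned continuation
    return [{"generated_text": " hello there\nYou: hi again"}]


print(generate_until(fake_pipe, "Jini:", "\n"))
```

With a real pipeline, `fake_pipe` would be replaced by the object returned from create_pipeline(). Generation still runs up to max_new_tokens in this variant; only the returned text is shortened.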

Thank you!
