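I am using GPT2LMHeadModel to change the way GPT-2 chooses the next word in a sentence. At the moment I have to supply the beginning of the sentence, and GPT-2 then predicts the most likely next words.
I would like GPT-2 to read a complete sentence and then start a new sentence based on it (as in translation).
Here is an example of how I am using it: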
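import torch
import torch.nn.functional as F
from transformers import GPT2LMHeadModel, GPT2Tokenizer
from wordfreq import zipf_frequency

# Assumption: the standard 'gpt2' checkpoint. The original snippet does not
# show how `model` and `enc` are created.
enc = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')
model.eval()

def generate_candidates(context, past):
    """Return (token_id, probability) pairs for the next token, most likely first."""
    if past is None:
        # First call: feed the whole context.
        input_ids = torch.tensor(context).unsqueeze(0)
    else:
        # Later calls: the cache covers the earlier tokens, so feed only the last one.
        input_ids = torch.tensor([context[-1]]).unsqueeze(0)
    with torch.no_grad():
        logits, past = model(input_ids=input_ids, past_key_values=past,
                             use_cache=True, return_dict=False)
    logits = logits[:, -1, :]  # scores for the position after the last input token
    probs = F.softmax(logits, dim=-1).tolist()[0]
    probs = sorted(enumerate(probs), key=lambda x: x[1], reverse=True)
    return probs, past
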
stop_token = [enc.encoder[x] for x in ('<|endoftext|>', '.', '!', '?')]
initial_sentence = "What I am trying to say is"
context = enc.encode(initial_sentence)
candidate_words, past = generate_candidates(context, None)
print(f'Candidate words to complete the sentence "{initial_sentence}":')
print('Word Probability Score')
for i in range(10):
    candidate_word = candidate_words[i]
    # Decode the single token id back to text.
    finalWord = enc.decode([candidate_word[0]])
    count = zipf_frequency(finalWord, 'en', wordlist='large')
    print("%-15s" % finalWord, candidate_word[1], str(count))
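
To make clear what the function returns, here is a minimal sketch of just the ranking step (softmax over the last-position scores, then a descending sort), written in plain Python so it runs without the model. The name `rank_candidates` and the toy scores are made up for illustration and are not part of my actual code.

```python
import math

def rank_candidates(logits):
    """Softmax over raw scores, then (index, probability) pairs sorted high to low."""
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)

# Hypothetical scores for a 4-token vocabulary (made-up numbers).
toy_logits = [1.0, 3.0, 0.5, 2.0]
ranked = rank_candidates(toy_logits)
print([token_id for token_id, _ in ranked])  # prints [1, 3, 0, 2]
```

Each entry in the result is a `(token_id, probability)` pair, which is why the loop above reads `candidate_word[0]` for the token and `candidate_word[1]` for its probability.
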
What parameters do I need to set so that GPT-2 starts a sentence from scratch instead of completing an initial sentence?