TY - JOUR
T1 - Patent claim generation by fine-tuning OpenAI GPT-2
AU - Lee, Jieh-Sheng
AU - Hsiang, Jieh
N1 - Publisher Copyright:
© 2020 Elsevier Ltd
PY - 2020/9
Y1 - 2020/9
N2 - In this work, we focus on fine-tuning an OpenAI GPT-2 pre-trained model for generating patent claims. GPT-2 has demonstrated impressive efficacy of pre-trained language models on various tasks, particularly coherent text generation. Patent claim language itself has rarely been explored in the past and poses a unique challenge. We are motivated to generate coherent patent claims automatically so that augmented inventing might be viable someday. In our implementation, we identified a unique language structure in patent claims and leveraged its implicit human annotations. We investigated the fine-tuning process by probing the first 100 steps and observing the generated text at each step. Based on both conditional and unconditional random sampling, we analyze the overall quality of generated patent claims. Our contributions include: (1) being the first to generate patent claims by machines and being the first to apply GPT-2 to patent claim generation, (2) providing various experiment results for qualitative analysis and future research, (3) proposing a new sampling approach for text generation, and (4) publishing our fine-tuned GPT-2 model and sample code for future researchers to run on Colab.
UR - http://www.scopus.com/inward/record.url?scp=85089462589&partnerID=8YFLogxK
U2 - 10.1016/j.wpi.2020.101983
DO - 10.1016/j.wpi.2020.101983
M3 - Article
AN - SCOPUS:85089462589
SN - 0172-2190
VL - 62
SP - 1
EP - 7
JO - World Patent Information
JF - World Patent Information
M1 - 101983
ER -