TY - GEN
T1 - ContributionSum
T2 - 32nd ACM International Conference on Information and Knowledge Management, CIKM 2023
AU - Liu, Meng Huan
AU - Yen, An Zi
AU - Huang, Hen Hsen
AU - Chen, Hsin Hsi
N1 - Publisher Copyright: © 2023 Copyright held by the owner/author(s).
PY - 2023/10/21
Y1 - 2023/10/21
AB - Contributions are essentially the core of every scientific research, highlighting their key values to the academic community. Systems that are capable of identifying the contributions from scientific papers precisely and organizing them into well-structured summaries can facilitate both text processing and human comprehension. In this paper, we present ContributionSum, a dataset consisting of 24K computer science papers with contributions explicitly listed by the authors, which are further classified into different contribution types based on a newly-proposed annotation scheme. In addition, we study the task of generating disentangled contributions that summarize the values of scientific papers into key points. We propose a fine-grained post-training strategy tailored to our task and leverage salient information of different contribution types in the papers. To assess the coherency and coverage of each contribution aspect, we perform summary-level and contribution-level evaluations for our task. Experimental results show that our method improves upon mainstream baselines.
KW - Disentangled Contribution Generation
KW - Scholarly Document Processing
KW - Scientific Document Summarization
UR - http://www.scopus.com/inward/record.url?scp=85178166779&partnerID=8YFLogxK
U2 - 10.1145/3583780.3615115
DO - 10.1145/3583780.3615115
M3 - Conference contribution
AN - SCOPUS:85178166779
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 5351
EP - 5355
BT - CIKM 2023 - Proceedings of the 32nd ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
Y2 - 21 October 2023 through 25 October 2023
ER -