Unsupervised extractive multi-document summarization method based on transfer learning from BERT multi-task fine-tuning




Publication Details

Output type: Journal article

UM6P-affiliated publication: Yes

Author list: Lamsiyah S., Mahdaouy A.E., Ouatik S.E.A., Espinasse B.

Publisher: SAGE Publications (UK and US)

Publication year: 2021

Journal: Journal of Information Science

ISSN: 0165-5515

eISSN: 1741-6485

URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85101011377&doi=10.1177%2f0165551521990616&partnerID=40&md5=4701131f9d59a323f5c3a90b6bfe606d

Languages: English (EN-GB)




Abstract

Text representation is a cornerstone that affects the effectiveness of many text summarization methods. Transfer learning using pre-trained word embedding models has shown promising results. However, most of these representations do not consider word order or the semantic relationships between words in a sentence, and thus do not carry the meaning of a full sentence. To overcome this issue, the current study proposes an unsupervised method for extractive multi-document summarization based on transfer learning from the BERT sentence embedding model. Moreover, to improve sentence representation learning, we fine-tune the BERT model on supervised intermediate tasks from the GLUE benchmark using single-task and multi-task fine-tuning. Experiments are performed on the standard DUC’2002–2004 datasets. The results show that our method significantly outperforms several baseline methods and achieves performance comparable to, and sometimes better than, recent state-of-the-art deep learning-based methods. Furthermore, fine-tuning BERT with multi-task learning considerably improves performance.
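

Method sketch (illustrative)

Since the abstract outlines the core pipeline (encode each sentence with BERT, then select salient sentences without supervision), a minimal illustration of that embedding-and-rank idea follows. This is a sketch under assumptions, not the authors' exact algorithm: it uses the Hugging Face transformers library with the generic bert-base-uncased checkpoint (the paper instead uses BERT fine-tuned on GLUE intermediate tasks, a step this sketch omits), mean-pools token states into sentence vectors, and ranks sentences by cosine similarity to the centroid of the document cluster, a common centroid-style relevance heuristic.

```python
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer

# Generic checkpoint as a stand-in; the paper fine-tunes BERT on GLUE
# intermediate tasks (single-task and multi-task) before this step.
MODEL_NAME = "bert-base-uncased"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
model.eval()

def embed_sentences(sentences):
    """Mean-pool the last hidden states into one vector per sentence."""
    enc = tokenizer(sentences, padding=True, truncation=True,
                    return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state      # (batch, seq, dim)
    mask = enc["attention_mask"].unsqueeze(-1)       # (batch, seq, 1)
    summed = (hidden * mask).sum(dim=1)              # masked sum over tokens
    counts = mask.sum(dim=1).clamp(min=1)            # tokens per sentence
    return (summed / counts).numpy()

def summarize(sentences, k=3):
    """Score sentences by cosine similarity to the cluster centroid and
    return the top-k in their original document order."""
    emb = embed_sentences(sentences)
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    centroid = emb.mean(axis=0)
    centroid /= np.linalg.norm(centroid)
    scores = emb @ centroid                          # cosine similarity
    top = sorted(np.argsort(-scores)[:k])
    return [sentences[i] for i in top]
```

For example, summarize(cluster_sentences, k=5) would return a five-sentence extract, which could then be scored with ROUGE against the DUC reference summaries, as in the paper's evaluation. The centroid score here is an assumption made for illustration; in the paper, the gains reported come from replacing the generic encoder with the GLUE fine-tuned one.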



