Selection Informative Units for Extractive Summarization
Date
2023
Authors
Journal Title
Journal ISSN
Volume Title
Publisher
World Scientific and Engineering Academy and Society
Access Rights
info:eu-repo/semantics/embargoedAccess
Abstract
An Extractive Multi-Document Summarizer must select the most informative units and prevent
duplication in extraction. To achieve this goal, this work proposes a new technique called “comprising
at least one Representative Term at the Highest Frequency” (RTHF). Units that include representative
terms, but only at low frequencies, are not considered for extraction (selection of the most
informative units). Conversely, units that satisfy the RTHF criterion precede other similar units in
the ranking (prevention of duplication). The heuristic behind RTHF is explained in terms of probability.
RTHF was evaluated on a previously developed and tested paragraph-based Extractive Multi-Document
Summarizer. The results show that it improves the original system by 0.8% to 3.2% (Average-F values of
ROUGE metrics).
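The abstract does not give a formal definition of RTHF, so the following Python sketch shows only one possible reading of the criterion: a unit qualifies if it contains at least one representative term at the highest per-unit frequency observed for that term. The function name `rthf_units`, the whitespace tokenization, and the assumption that the representative terms are supplied in lowercase from elsewhere are all hypothetical choices made for illustration and are not taken from the paper.

```python
from collections import Counter


def rthf_units(units, representative_terms):
    """Return indices of units that hold at least one representative term
    at that term's highest per-unit frequency.

    Hypothetical reading of RTHF; the paper's actual definition may differ.
    """
    # Term counts per unit, using naive lowercase whitespace tokenization.
    counts = [Counter(unit.lower().split()) for unit in units]

    # Highest frequency each representative term reaches in any single unit.
    peak = {
        term: max((c[term] for c in counts), default=0)
        for term in representative_terms
    }

    selected = []
    for index, c in enumerate(counts):
        # Keep the unit if some term occurs in it at its (non-zero) peak.
        if any(peak[t] > 0 and c[t] == peak[t] for t in representative_terms):
            selected.append(index)
    return selected


units = [
    "summary summary extraction",
    "summary ranking",
    "ranking ranking ranking paragraph",
    "paragraph text",
]
print(rthf_units(units, ["summary", "ranking"]))
```

Under this reading, the second unit is dropped even though it contains both representative terms, because it holds neither at its peak frequency. That matches the abstract's statement that units with representative terms at low frequencies are not considered.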
Description
Keywords
Document Summarization, Informative Units, TF-IDF, Paragraph Extraction, NLP, AI
Source
WSEAS Transactions on Systems
WoS Q Value
Scopus Q Value
N/A
Volume
22