Selection Informative Units for Extractive Summarization

Date

2023

Journal Title

Journal ISSN

Volume Title

Publisher

World Scientific and Engineering Academy and Society

Access Rights

info:eu-repo/semantics/embargoedAccess

Abstract

An Extractive Multi-Document Summarizer must select the most informative units and prevent duplication among the extracted units. To achieve this goal, this work proposes a new technique called “comprising at least one Representative Term at the Highest Frequency” (RTHF). Units that include representative terms, but only at low frequencies, are not considered for extraction (selection of the most informative units). Units that satisfy the RTHF feature, on the other hand, precede other similar units in the ranking (prevention of duplication). The heuristic behind RTHF is explained in terms of probability. RTHF was evaluated on a previously developed and tested paragraph-based Extractive Multi-Document Summarizer. The results show that it improves on the original system by 0.8% to 3.2% (Average-F values of ROUGE metrics).
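The abstract does not give the exact RTHF rule, so the following is only a minimal sketch of one plausible reading: a unit qualifies if at least one representative term occurs in it as often as that term occurs in any single unit. The function name `rthf_filter`, the whitespace tokenization, and the assumption that the representative terms are already known (for example from a TF-IDF weighting) are all illustrative and not taken from the paper.

```python
from collections import Counter


def rthf_filter(units, representative_terms):
    """Return indices of units that contain at least one representative
    term at that term's highest per-unit frequency.

    This is an assumed reading of RTHF, not the paper's definition.
    """
    # Term counts per unit, using naive lowercase whitespace tokenization.
    counts = [Counter(u.lower().split()) for u in units]
    # Highest frequency each representative term reaches in any single unit.
    peak = {
        t: max((c[t] for c in counts), default=0)
        for t in representative_terms
    }
    selected = []
    for i, c in enumerate(counts):
        # Keep the unit if some term is present at its peak frequency.
        if any(peak[t] > 0 and c[t] == peak[t] for t in representative_terms):
            selected.append(i)
    return selected


units = [
    "storm damage storm warning storm",   # "storm" occurs 3 times (peak)
    "storm warning issued",               # "storm" occurs once (below peak)
    "rescue teams arrive rescue begins",  # "rescue" occurs twice (peak)
    "weather was mild",                   # no representative term
]
terms = ["storm", "rescue"]
print(rthf_filter(units, terms))  # [0, 2]
```

Under this reading, the second unit is dropped even though it mentions "storm", because a similar unit already carries the term at a higher frequency; this mirrors the abstract's claim that the feature both selects informative units and suppresses near-duplicates.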

Description

Keywords

Document Summarization, Informative Units, TF-IDF, Paragraph Extraction, NLP, AI

Source

WSEAS Transactions on Systems

WoS Q Value

Scopus Q Value

N/A

Volume

22

Issue

Citation