Enhanced Named Entity Recognition algorithm for financial document verification
Date
2023
Authors
Journal Title
Journal ISSN
Volume Title
Publisher
Springer
Access Rights
info:eu-repo/semantics/embargoedAccess
Abstract
Many enterprise systems are document-intensive and require extensive manual
verification. The verification process is challenging in terms of both the time it
takes and the errors that remain afterwards. A general automatic or semi-automatic
document verification system would therefore be useful. However, given the nature
of natural language, context is an important factor. In this research, the target
context is financial documents, which have attracted considerable interest recently.
An automatic document verification model based only on entities (mostly those
encountered in financial documents) was tested experimentally. The summary report
was verified against the original documents by searching the original documents for
matches to the entities in the summary. The success of the verification process was
evaluated by comparison with named entity algorithms from the literature. A Kaggle
data set suited to this purpose was used for matching entities from the summary
within the original documents. The average document verification accuracy of the
named entity recognition algorithms on financial documents alone was 85.36%,
whereas the proposed entity recognition algorithm reached 88.80%. The average
document verification time of the experimented algorithms and the developed
algorithm was 2.43 s and 2.48 s, respectively. In conclusion, applying both the
BERT-base-cased classification model and rule-based approaches tailored to the
context enhances the entity verification process at an insignificant time cost.
Consequently, even though limited data and rules were used, there appears to be an
opportunity to automate the document verification process with the support of both
the BERT-base-cased classification model and rule-based approaches.
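The entity-matching idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the BERT-base-cased model is replaced here by a small, hypothetical set of regular expressions (`ENTITY_PATTERNS`), and the function names `extract_entities` and `verify_summary` are assumptions made for illustration only. The sketch checks which entities in a summary also occur in the original documents and reports the matched fraction.

```python
import re

# Hypothetical rule set standing in for the named entity recognition step.
# The paper combines a BERT-base-cased model with rules; only simple rules
# are shown here, and these patterns are illustrative assumptions.
ENTITY_PATTERNS = {
    "MONEY": re.compile(r"[$€£]\s?\d[\d,]*(?:\.\d+)?"),
    "PERCENT": re.compile(r"\d+(?:\.\d+)?\s?%"),
    "DATE": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
}


def extract_entities(text):
    """Return a set of (label, value) pairs found in the text."""
    entities = set()
    for label, pattern in ENTITY_PATTERNS.items():
        for match in pattern.findall(text):
            # Drop internal whitespace so "$ 5" and "$5" compare equal.
            entities.add((label, re.sub(r"\s+", "", match)))
    return entities


def verify_summary(summary, originals):
    """Return (matched fraction, unmatched entities) for a summary."""
    summary_entities = extract_entities(summary)
    if not summary_entities:
        # Nothing to verify, so nothing is contradicted.
        return 1.0, set()
    source_entities = set()
    for doc in originals:
        source_entities |= extract_entities(doc)
    unmatched = summary_entities - source_entities
    score = 1 - len(unmatched) / len(summary_entities)
    return score, unmatched


originals = ["Revenue rose 12.5% to $4,200,000 on 2023-03-31."]
summary = "Revenue grew 12.5% to $4,200,000; margin was 9%."
score, unmatched = verify_summary(summary, originals)
```

In this example the summary contains three entities, of which the 9% figure has no counterpart in the original document, so it is returned as unmatched and would be flagged for manual review.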
Description
Keywords
Automatic document verification, Named Entity Recognition, Document summarization, Spell-checker, Natural language processing
Source
Journal of Supercomputing
WoS Q Value
Q2
Scopus Q Value
N/A
Volume
79
Issue
17