Enhanced Named Entity Recognition algorithm for financial document verification

Date

2023

Journal Title

Journal ISSN

Volume Title

Publisher

Springer

Access Rights

info:eu-repo/semantics/embargoedAccess

Abstract

Many enterprise systems are document-intensive and require extensive manual verification. The verification process is challenging in terms of both the time it takes and the errors that remain afterwards. A general automatic or semi-automatic document verification system would therefore be useful. However, given the nature of natural language, context is an important factor. In this research, the target context is financial documents, which have attracted considerable interest recently. An automatic document verification model based only on entities (mostly those encountered in financial documents) was tested. The summary report was verified against the original documents by searching the originals for matches to the entities in the summary. The success of the verification process was evaluated by comparison with named entity recognition algorithms from the literature. A Kaggle data set prepared for this purpose was used for matching entities from the summary within the original documents. The average document verification accuracy of the named entity recognition algorithms on financial documents only was 85.36%, whereas the proposed entity recognition algorithm reached 88.80%. The average document verification time of the tested algorithms and the developed algorithm was 2.43 s and 2.48 s, respectively. In conclusion, applying both the BERT-base-cased classification model and rule-based approaches specific to the context enhances the entity verification process at an insignificant time cost. Consequently, even with limited data and rules, there is an opportunity to automate the document verification process with the support of both the BERT-base-cased classification model and rule-based approaches.
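The entity-matching idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it leaves out the BERT-base-cased model entirely and uses only a few hypothetical regular-expression rules (`ENTITY_PATTERNS`) for amounts, percentages, and ISO dates. The function names `extract_entities` and `verify_summary`, and the scoring formula, are assumptions made for illustration.

```python
import re

# Hypothetical rule set: these patterns are illustrative assumptions,
# not the rules used in the paper.
ENTITY_PATTERNS = {
    "MONEY": re.compile(r"[$€£]\s?\d[\d,]*(?:\.\d+)?"),
    "PERCENT": re.compile(r"\d+(?:\.\d+)?\s?%"),
    "DATE": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
}


def extract_entities(text):
    """Return a set of (label, normalized surface form) pairs found by the rules."""
    found = set()
    for label, pattern in ENTITY_PATTERNS.items():
        for match in pattern.finditer(text):
            # Drop whitespace and thousands separators so "$1,200" equals "$ 1200".
            found.add((label, re.sub(r"[\s,]", "", match.group())))
    return found


def verify_summary(summary, originals):
    """Return (score, missing): the share of summary entities that also occur
    in at least one original document, and the entities that were not found."""
    summary_entities = extract_entities(summary)
    if not summary_entities:
        # Nothing to check, so the summary is trivially consistent.
        return 1.0, set()
    original_entities = set()
    for doc in originals:
        original_entities |= extract_entities(doc)
    missing = summary_entities - original_entities
    score = 1 - len(missing) / len(summary_entities)
    return score, missing
```

Calling `verify_summary(summary_text, list_of_original_texts)` returns a score between 0 and 1 together with the unmatched entities, which a reviewer could inspect by hand. In a setup closer to the one described, a learned recognizer would presumably supply the entities that rules alone cannot capture, such as organization and person names.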

Description

Keywords

Automatic document verification, Named Entity Recognition, Document summarization, Spell-checker, Natural language processing

Source

Journal of Supercomputing

WoS Quartile

Q2

Scopus Quartile

N/A

Volume

79

Issue

17

Citation