Case study on well-known topic modeling methods for document classification

dc.authorid: 0000-0002-1941-6693 [en_US]
dc.contributor.author: Özdemirci, Süleyman
dc.contributor.author: Turan, Metin
dc.date.accessioned: 2021-12-27T07:33:26Z
dc.date.available: 2021-12-27T07:33:26Z
dc.date.issued: 2021 [en_US]
dc.department: Faculties, Faculty of Engineering, Department of Computer Engineering [en_US]
dc.description.abstract: Topic modeling has numerous applications, such as text categorization, topic clustering, document tagging, and feature extraction on large document collections. In this study, three methods were applied separately to the same document set: Latent Dirichlet Allocation (LDA), a practical exploratory topic modeling method; Bidirectional Encoder Representations from Transformers (BERT), a transformer-based machine learning method; and Term Frequency-Inverse Document Frequency (TF-IDF). The document set consists of 801 sport and education articles collected from the internet by graduate students. The purpose of this study is to observe which method is best suited to topic modeling and, if possible, to increase the accuracy rate through an ensemble of these methods. The study found that, although it has some disadvantages, BERT assigned documents to the correct topic with an average success rate of 92.6%, outperforming the other methods. [en_US]
dc.identifier.endpage: 1309 [en_US]
dc.identifier.isbn: 978-1-7281-8501-9
dc.identifier.scopus: 2-s2.0-85102556721 [en_US]
dc.identifier.scopusquality: N/A [en_US]
dc.identifier.startpage: 1304 [en_US]
dc.identifier.uri: https://hdl.handle.net/11467/5139
dc.identifier.wos: WOS:000722293800220 [en_US]
dc.identifier.wosquality: N/A [en_US]
dc.indekslendigikaynak: Web of Science [en_US]
dc.indekslendigikaynak: Scopus [en_US]
dc.language.iso: en [en_US]
dc.publisher: IEEE [en_US]
dc.relation.ispartof: Proceedings of the Sixth International Conference on Inventive Computation Technologies [ICICT 2021] [en_US]
dc.relation.publicationcategory: Conference Item - International - Administrative Staff and Student [en_US]
dc.rights: info:eu-repo/semantics/embargoedAccess [en_US]
dc.subject: Classification [en_US]
dc.subject: Topic modeling [en_US]
dc.subject: LDA [en_US]
dc.subject: BERT [en_US]
dc.subject: TF-IDF [en_US]
dc.title: Case study on well-known topic modeling methods for document classification [en_US]
dc.type: Conference Object [en_US]

Files

Original bundle
Name: Case_Study_on_well-known_Topic_Modeling_Methods_for_Document_Classification.pdf
Size: 1.41 MB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 1.56 KB
Format: Item-specific license agreed upon to submission