Bag encoding strategies in multiple instance learning problems
Date
2018
Publisher
Elsevier Inc.
Access Rights
info:eu-repo/semantics/closedAccess
Abstract
Multiple instance learning (MIL) deals with supervised learning tasks in which the aim is to learn from a set of labeled bags, each containing a certain number of instances. In the MIL setting, instance label information is unavailable, which makes it difficult to apply regular supervised learning. To resolve this problem, researchers have devised methods that rely on certain assumptions regarding the instance labels. However, it is not a trivial task to determine which assumption holds for a new type of MIL problem. A bag-level representation based on instance characteristics does not require assumptions about the instance labels and has been shown to be successful in MIL tasks. These approaches mainly encode bag vectors using bag-of-features representations. In this paper, we propose tree-based encoding strategies that partition the instance feature space and represent each bag by the frequency of its instances residing in each partition. Our encoding implicitly learns a generalized Gaussian mixture model (GMM) on the instance feature space and transforms this information into a bag-level summary. We show that bag representation using tree ensembles provides fast, accurate and robust representations. Our experiments on a large database of MIL problems show that tree-based encoding is highly scalable, and its performance is competitive with the state-of-the-art algorithms. © 2018 Elsevier Inc.
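The encoding idea described in the abstract (partition the instance feature space with trees, then describe each bag by the fraction of its instances falling in each partition) can be illustrated with a minimal sketch. This is not the authors' implementation: the random axis-aligned splits, the helper names `build_tree`, `find_leaf` and `encode_bags`, and the parameters `n_trees`, `depth` and `seed` are all assumptions made for illustration, and the abstract does not specify how the trees are grown.

```python
import random


def build_tree(points, depth, rng, leaf_counter):
    """Partition points with random axis-aligned splits (an assumed split rule).

    Returns ("leaf", leaf_id) or ("node", dim, threshold, left, right).
    leaf_counter is a one-element list used to hand out leaf ids.
    """
    values = None
    if depth > 0 and len(points) >= 2:
        dim = rng.randrange(len(points[0]))
        values = [p[dim] for p in points]
    if values is None or min(values) == max(values):
        # Stop: depth exhausted, too few points, or no spread on this feature.
        leaf_id = leaf_counter[0]
        leaf_counter[0] += 1
        return ("leaf", leaf_id)
    threshold = rng.uniform(min(values), max(values))
    left = [p for p in points if p[dim] <= threshold]
    right = [p for p in points if p[dim] > threshold]
    return (
        "node",
        dim,
        threshold,
        build_tree(left, depth - 1, rng, leaf_counter),
        build_tree(right, depth - 1, rng, leaf_counter),
    )


def find_leaf(tree, x):
    """Follow the splits until a leaf is reached and return its id."""
    while tree[0] == "node":
        _, dim, threshold, left, right = tree
        tree = left if x[dim] <= threshold else right
    return tree[1]


def encode_bags(bags, n_trees=3, depth=2, seed=0):
    """Encode each non-empty bag as concatenated per-tree leaf frequencies."""
    rng = random.Random(seed)
    # Trees are grown on all instances pooled together; no labels are used.
    all_instances = [x for bag in bags for x in bag]
    encoded = [[] for _ in bags]
    for _ in range(n_trees):
        leaf_counter = [0]
        tree = build_tree(all_instances, depth, rng, leaf_counter)
        n_leaves = leaf_counter[0]
        for vector, bag in zip(encoded, bags):
            freq = [0.0] * n_leaves
            for x in bag:
                freq[find_leaf(tree, x)] += 1.0 / len(bag)
            vector.extend(freq)
    return encoded


# Three toy bags of 2-D instances with different sizes.
bags = [
    [(0.1, 0.2), (0.15, 0.25), (0.9, 0.8)],
    [(0.85, 0.9), (0.95, 0.7)],
    [(0.2, 0.1), (0.5, 0.5), (0.3, 0.9), (0.7, 0.3)],
]
vectors = encode_bags(bags, n_trees=3, depth=2, seed=0)
```

Each bag, whatever its size, becomes a fixed-length vector whose entries sum to the number of trees, so the vectors can be passed to any standard classifier together with the bag labels.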
Keywords
Bag encoding, Classification, Decision trees, Multiple instance learning
Journal or Series
Information Sciences
WoS Q Value
Q1
Scopus Q Value
Q1
Volume
467