Bag encoding strategies in multiple instance learning problems

Date

2018

Publisher

Elsevier Inc.

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

Multiple instance learning (MIL) deals with supervised learning tasks in which the aim is to learn from a set of labeled bags, each containing a certain number of instances. In the MIL setting, instance label information is unavailable, which makes it difficult to apply regular supervised learning. To resolve this problem, researchers devise methods that rely on certain assumptions regarding the instance labels. However, it is not a trivial task to determine which assumption holds for a new type of MIL problem. A bag-level representation based on instance characteristics does not require assumptions about the instance labels and has been shown to be successful in MIL tasks. These approaches mainly encode bag vectors using bag-of-features representations. In this paper, we propose tree-based encoding strategies that partition the instance feature space and represent the bags using the frequency of instances residing in each partition. Our encoding implicitly learns a generalized Gaussian Mixture Model (GMM) on the instance feature space and transforms this information into a bag-level summary. We show that bag representation using tree ensembles provides fast, accurate and robust representations. Our experiments on a large database of MIL problems show that tree-based encoding is highly scalable, and its performance is competitive with the state-of-the-art algorithms. © 2018 Elsevier Inc.
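
The encoding idea in the abstract (partition the instance feature space with trees, then describe each bag by how often its instances land in each partition) can be illustrated with a minimal sketch. This is not the authors' implementation: the trees here use purely random axis-aligned splits and ignore bag labels, and the names `build_tree`, `leaf_id`, `fit_forest`, and `encode_bag`, the toy bags, and the choice of relative frequencies rather than raw counts are all assumptions made for illustration.

```python
import random


def build_tree(instances, max_depth, rng, leaves):
    """Grow one random axis-aligned partition tree (illustrative assumption).

    `leaves` is a shared list used only to hand out consecutive leaf ids.
    """
    if max_depth == 0 or len(instances) < 2:
        leaves.append(len(leaves))
        return ("leaf", leaves[-1])
    dim = rng.randrange(len(instances[0]))
    values = [x[dim] for x in instances]
    lo, hi = min(values), max(values)
    if lo == hi:  # nothing to split on along this dimension
        leaves.append(len(leaves))
        return ("leaf", leaves[-1])
    threshold = rng.uniform(lo, hi)
    left = [x for x in instances if x[dim] <= threshold]
    right = [x for x in instances if x[dim] > threshold]
    return ("split", dim, threshold,
            build_tree(left, max_depth - 1, rng, leaves),
            build_tree(right, max_depth - 1, rng, leaves))


def leaf_id(tree, x):
    """Route one instance down the tree and return the id of its leaf."""
    while tree[0] == "split":
        _, dim, threshold, left, right = tree
        tree = left if x[dim] <= threshold else right
    return tree[1]


def fit_forest(bags, n_trees=3, max_depth=2, seed=0):
    """Build several partition trees on the pooled instances of all bags."""
    rng = random.Random(seed)
    instances = [x for bag in bags for x in bag]
    forest = []
    for _ in range(n_trees):
        leaves = []
        tree = build_tree(instances, max_depth, rng, leaves)
        forest.append((tree, len(leaves)))
    return forest


def encode_bag(bag, forest):
    """Concatenate, over trees, the fraction of the bag's instances per leaf."""
    vector = []
    for tree, n_leaves in forest:
        counts = [0] * n_leaves
        for x in bag:
            counts[leaf_id(tree, x)] += 1
        vector.extend(c / len(bag) for c in counts)
    return vector


# Toy data: two bags of 2-D instances (hypothetical values).
bags = [
    [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8)],
    [(0.8, 0.9), (0.7, 0.95)],
]
forest = fit_forest(bags)
encoded = [encode_bag(bag, forest) for bag in bags]
```

Each bag, whatever its size, is mapped to a fixed-length vector, so any standard classifier could in principle be trained on `encoded` together with the bag labels.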

Keywords

Bag encoding, Classification, Decision trees, Multiple instance learning

Journal or Series

Information Sciences

WoS Q Value

Q1

Scopus Q Value

Q1

Volume

467
