Sameep Mehta
Paper download is intended for registered attendees only, and is
subjected to the IEEE Copyright Policy. Any other use is strongly forbidden.
Papers from this author
Budgeted Batch Mode Active Learning with Generalized Cost and Utility Functions
Arvind Agarwal, Shashank Mujumdar, Nitin Gupta, Sameep Mehta
Auto-TLDR; Active Learning Based on Utility and Cost Functions
Abstract Slides Poster Similar
Active learning reduces the labeling cost by actively querying labels for the most valuable data points. Typical active learning methods select the most informative examples one-at-a-time, their batch variants exist where a set of most informative points are selected. These points are selected in such a way that when added to the training data along with their labels, they provide maximum benefit to the underlying model. In this paper, we present a learning framework that actively selects optimal set of examples (in a batch) within a given budget, based on given utility and cost functions. The framework is generic enough to incorporate any utility and any cost function defined on a set of examples. Furthermore, we propose a novel utility function based on the Facility Location problem that considers three important characteristics of utility i.e., diversity, density and point utility. We also propose a novel cost function, by formulating the cost computation problem as an optimization problem, the solution to which turns out to be the minimum spanning tree. Thus, our framework provides the optimal batch of points within the given budget based on the cost and utility functions. We evaluate our method on several data sets and show its superior performance over baseline methods.