
9,301 research outputs found

A review of multi-instance learning assumptions

Author: Foulds James Richard
Frank Eibe
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 01/01/2010

Multi-instance (MI) learning is a variant of inductive machine learning, where each learning example contains a bag of instances instead of a single feature vector. The term commonly refers to the supervised setting, where each bag is associated with a label. This type of representation is a natural fit for a number of real-world learning scenarios, including drug activity prediction and image classification, hence many MI learning algorithms have been proposed. Any MI learning method must relate instances to bag-level class labels, but many types of relationships between instances and class labels are possible. Although all early work in MI learning assumes a specific MI concept class known to be appropriate for a drug activity prediction domain, this ‘standard MI assumption’ is not guaranteed to hold in other domains. Much of the recent work in MI learning has concentrated on a relaxed view of the MI problem, where the standard MI assumption is dropped, and alternative assumptions are considered instead. However, often it is not clearly stated what particular assumption is used and how it relates to other assumptions that have been proposed. In this paper, we aim to clarify the use of alternative MI assumptions by reviewing the work done in this area.
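
The abstract above does not spell out the standard MI assumption. As commonly stated in the MI literature, a bag is positive if and only if at least one of its instances is positive. The sketch below illustrates that rule next to one relaxed alternative. The function names, the toy instance-level concept, and the count threshold `k` are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import Callable, Sequence

Instance = Sequence[float]
Bag = Sequence[Instance]


def standard_mi_label(bag: Bag, is_positive: Callable[[Instance], bool]) -> bool:
    """Standard MI assumption: positive iff at least one instance is positive."""
    return any(is_positive(x) for x in bag)


def count_based_label(bag: Bag, is_positive: Callable[[Instance], bool], k: int) -> bool:
    """Illustrative relaxed assumption: positive iff at least k instances are positive."""
    return sum(1 for x in bag if is_positive(x)) >= k


def toy_concept(x: Instance) -> bool:
    """Hypothetical instance-level concept: first feature exceeds 0.5."""
    return x[0] > 0.5


bag_a = [[0.1], [0.9], [0.2]]   # one positive instance
bag_b = [[0.1], [0.3]]          # no positive instances

label_a = standard_mi_label(bag_a, toy_concept)           # one positive suffices
label_b = standard_mi_label(bag_b, toy_concept)           # no positives, so negative
label_a_k2 = count_based_label(bag_a, toy_concept, k=2)   # needs two positives
```

With `k=1` the count-based rule coincides with the standard assumption, which is one way to see that relaxed assumptions can generalize it.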

Research Commons@Waikato

Artifact Lifecycle Discovery

Author: Dumas Marlon
Fahland Dirk
Popova Viara
Publication date: 01/01/2013

Artifact-centric modeling is a promising approach for modeling business processes based on so-called business artifacts: key entities that drive the company's operations and whose lifecycles define the overall business process. While artifact-centric modeling shows significant advantages, the overwhelming majority of existing process mining methods cannot be applied (directly) as they are tailored to discover monolithic process models. This paper addresses the problem by proposing a chain of methods that can be applied to discover artifact lifecycle models in Guard-Stage-Milestone notation. We decompose the problem in such a way that a wide range of existing (non-artifact-centric) process discovery and analysis methods can be reused in a flexible manner. The methods presented in this paper are implemented as software plug-ins for ProM, a generic open-source framework and architecture for implementing process mining tools.

arXiv.org e-Print Archive

Repository TU/e

Pure OAI Repository

Decision rules, trees and tests for tables with many-valued decisions : comparative study

Author: Azad M.
Chikalov I.
Moshkov M.
Zielosko Beata
Publication venue: 'Elsevier BV'
Publication date: 01/01/2013

In this paper, we present three approaches for the construction of decision rules for decision tables with many-valued decisions. We construct decision rules directly for rows of a decision table, based on paths in a decision tree, and based on attributes contained in a test (super-reduct). Experimental results for data sets taken from the UCI Machine Learning Repository compare the maximum and the average length of rules for the mentioned approaches.

Elsevier - Publisher Connector

Repozytorium Uniwersytetu Śląskiego RE-BUŚ

Machine learning methods based on probabilistic decision tree under the multi-valued preference environment

Author: Liu Man
Lu Yi
Zhang Keang
Zhou Wei
Publication venue: Taylor and Francis Group and Juraj Dobrila University of Pula, Faculty of economics and tourism Dr. Mijo Mirković
Publication date: 01/01/2022

In classification, the data are sometimes not unique and may take different values with different probabilities. It is then meaningful to develop appropriate methods for making classification decisions. To address this issue, this paper proposes machine learning methods based on a probabilistic decision tree (DT) under the multi-valued preference environment and the probabilistic multi-valued preference environment, respectively, for different classification aims. First, this paper develops a data pre-processing method to deal with weight and quantity matching under the multi-valued preference environment. In this method, we use the least common multiple and weight assignments to balance the probability of each preference. Then, based on the training data, this paper introduces the entropy method to further optimize the DT model under the multi-valued preference environment. After that, the corresponding calculation rules and probability classifications are given. In addition, considering the different numbers and probabilities of the preferences, this paper also uses the entropy method to develop the DT model under the probabilistic multi-valued preference environment, and the calculation rules and probability classifications are derived similarly. Finally, we demonstrate the feasibility of the machine learning methods and the DT models under the above two preference environments using illustrative examples.

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

Zero Shot Recognition with Unreliable Attributes

Author: Grauman Kristen
Jayaraman Dinesh
Publication date: 29/03/2016

In principle, zero-shot learning makes it possible to train a recognition model simply by specifying the category's attributes. For example, with classifiers for generic attributes like striped and four-legged, one can construct a classifier for the zebra category by enumerating which properties it possesses, even without providing zebra training images. In practice, however, the standard zero-shot paradigm suffers because attribute predictions in novel images are hard to get right. We propose a novel random forest approach to train zero-shot models that explicitly accounts for the unreliability of attribute predictions. By leveraging statistics about each attribute's error tendencies, our method obtains more robust discriminative models for the unseen classes. We further devise extensions to handle the few-shot scenario and unreliable attribute descriptions. On three datasets, we demonstrate the benefit for visual category learning with zero or few training examples, a critical domain for rare categories or categories defined on the fly. Comment: NIPS 201

arXiv.org e-Print Archive

CiteSeerX

Interpretable Categorization of Heterogeneous Time Series Data

Author: Kochenderfer Mykel J.
Lee Ritchie
Mengshoel Ole J.
Silbermann Joshua
Publication date: 26/01/2018

Understanding heterogeneous multivariate time series data is important in many applications ranging from smart homes to aviation. Learning models of heterogeneous multivariate time series that are also human-interpretable is challenging and not adequately addressed by the existing literature. We propose grammar-based decision trees (GBDTs) and an algorithm for learning them. GBDTs extend decision trees with a grammar framework. Logical expressions derived from a context-free grammar are used for branching in place of simple thresholds on attributes. The added expressivity enables support for a wide range of data types while retaining the interpretability of decision trees. In particular, when a grammar based on temporal logic is used, we show that GBDTs can be used for the interpretable classification of high-dimensional and heterogeneous time series data. Furthermore, we show how GBDTs can also be used for categorization, which is a combination of clustering and generating interpretable explanations for each cluster. We apply GBDTs to analyze the classic Australian Sign Language dataset as well as data on near mid-air collisions (NMACs). The NMAC data comes from aircraft simulations used in the development of the next-generation Airborne Collision Avoidance System (ACAS X). Comment: 9 pages, 5 figures, 2 tables, SIAM International Conference on Data Mining (SDM) 201

arXiv.org e-Print Archive

Crossref

NASA Technical Reports Server