Unsupervised feature construction for improving data representation and semantics

Authors: Lallich S, Rizoiu MA, Velcin J
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/02/2013

Attribute-based format is the main data representation format used by machine learning algorithms. When the attributes do not properly describe the initial data, performance starts to degrade. Some algorithms address this problem by internally changing the representation space, but the newly constructed features rarely have any meaning. We seek to construct, in an unsupervised way, new attributes that are more appropriate for describing a given dataset and, at the same time, comprehensible for a human user. We propose two algorithms that construct the new attributes as conjunctions of the initial primitive attributes or their negations. The generated feature sets have reduced correlations between features and succeed in catching some of the hidden relations between individuals in a dataset. For example, a feature like sky ∧ ¬building ∧ panorama would be true for non-urban images and is more informative than simple features expressing the presence or the absence of an object. The notion of Pareto optimality is used to evaluate feature sets and to obtain a balance between total correlation and the complexity of the resulting feature set. Statistical hypothesis testing is employed in order to automatically determine the values of the parameters used for constructing a data-dependent feature set. We experimentally show that our approaches achieve the construction of informative feature sets for multiple datasets. © 2013 Springer Science+Business Media New York
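
To make the abstract's example concrete, the following is a minimal sketch of evaluating a feature built as a conjunction of primitive Boolean attributes or their negations (here: sky AND NOT building AND panorama). It only shows how such a feature is evaluated; it is not the paper's construction algorithms, and the function name `conjunction_feature`, the literal encoding, and the toy image rows are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# A literal is (attribute name, negated?). This encoding is an assumption
# made for this sketch, not notation taken from the paper.
Literal = Tuple[str, bool]


def conjunction_feature(row: Dict[str, bool], literals: List[Literal]) -> bool:
    """Return True when every literal of the conjunction holds for the row."""
    return all((not row[name]) if negated else row[name]
               for name, negated in literals)


# Hypothetical toy data: presence or absence of objects in three images.
images = [
    {"sky": True, "building": False, "panorama": True},    # non-urban panorama
    {"sky": True, "building": True, "panorama": True},     # urban panorama
    {"sky": False, "building": False, "panorama": False},  # neither
]

# sky AND NOT building AND panorama
NON_URBAN = [("sky", False), ("building", True), ("panorama", False)]

values = [conjunction_feature(img, NON_URBAN) for img in images]
print(values)
```

Only the first toy image satisfies all three literals, which is the sense in which the constructed feature separates non-urban images from the rest.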

OPUS - University of Technology Sydney

On Algorithms and Complexity for Sets with Cardinality Constraints

Authors: Kuncak Viktor, Marnette Bruno, Rinard Martin
Publication date: 01/01/2005

Typestate systems ensure many desirable properties of imperative programs, including initialization of object fields and correct use of stateful library interfaces. Abstract sets with cardinality constraints naturally generalize typestate properties: relationships between the typestates of objects can be expressed as subset and disjointness relations on sets, and elements of sets can be represented as sets of cardinality one. Motivated by these applications, this paper presents new algorithms and new complexity results for constraints on sets and their cardinalities. We study several classes of constraints and demonstrate a trade-off between their expressive power and their complexity. Our first result concerns a quantifier-free fragment of Boolean Algebra with Presburger Arithmetic. We give a nondeterministic polynomial-time algorithm for reducing the satisfiability of sets with symbolic cardinalities to constraints on constant cardinalities, and give a polynomial-space algorithm for the resulting problem. In a quest for more efficient fragments, we identify several subclasses of sets with cardinality constraints whose satisfiability is NP-hard. Finally, we identify a class of constraints that has polynomial-time satisfiability and entailment problems and can serve as a foundation for efficient program analysis. Comment: 20 pages, 12 figures
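
As a small illustration of the vocabulary in this abstract, the sketch below models two typestates as sets and writes typestate relationships as disjointness, subset, and cardinality conditions. It only checks the conditions on one concrete assignment of sets; it is not one of the paper's decision procedures, and the typestate names and object identifiers are hypothetical.

```python
# Hypothetical typestates of file objects, modeled as abstract sets.
opened = {"f1", "f2"}
closed = {"f3"}
all_files = opened | closed

# An individual object is represented as a set of cardinality one.
x = {"f1"}

checks = {
    # No object is both opened and closed (disjointness relation).
    "opened and closed are disjoint": opened.isdisjoint(closed),
    # x denotes exactly one object (cardinality constraint |x| = 1).
    "x is a single object": len(x) == 1,
    # The object x is in typestate 'opened' (subset relation).
    "x is opened": x <= opened,
    # Because the typestates are disjoint, the cardinalities add up.
    "|all| = |opened| + |closed|": len(all_files) == len(opened) + len(closed),
}

for name, holds in checks.items():
    print(f"{name}: {holds}")
```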

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

CiteSeerX

DSpace@MIT

Dagstuhl Research Online Publication Server

GIMO : A multi-objective anytime rule mining system to ease iterative feedback from domain experts

Authors: Baum Tobias, Herbold Steffen, Schneider Kurt
Publication venue: Amsterdam [u.a.] : Elsevier
Publication date: 01/01/2020

Data extracted from software repositories is used intensively in Software Engineering research, for example, to predict defects in source code. In our research in this area, with data from open source projects as well as an industrial partner, we noticed several shortcomings of conventional data mining approaches for classification problems: (1) Domain experts’ acceptance is of critical importance, and domain experts can provide valuable input, but it is hard to use this feedback. (2) Evaluating the quality of the model is not a matter of calculating AUC or accuracy. Instead, there are multiple objectives of varying importance with hard to quantify trade-offs. Furthermore, the performance of the model cannot be evaluated on a per-instance level in our case, because it shares aspects with the set cover problem. To overcome these problems, we take a holistic approach and develop a rule mining system that simplifies iterative feedback from domain experts and can incorporate the domain-specific evaluation needs. A central part of the system is a novel multi-objective anytime rule mining algorithm. The algorithm is based on the GRASP-PR meta-heuristic but extends it with ideas from several other approaches. We successfully applied the system in the industrial context. In the current article, we focus on the description of the algorithm and the concepts of the system. We make an implementation of the system available. © 2020 The Author

Institutionelles Repositorium der Leibniz Universität Hannover

Tools for reformulating logical forms into zero-one mixed integer programs (MIPS)

Authors: Lucas CA, Mitra G, Moody S
Publication venue: Brunel University
Publication date: 01/01/1992

A systematic procedure for transforming a set of logical statements or logical conditions imposed on a model into an Integer Linear Programming (ILP) formulation or a Mixed Integer Programming (MIP) formulation is presented. A reformulation procedure which uses the extended reverse Polish representation of a compound logical form is then described. A prototype user interface by which logical forms can be reformulated and the corresponding MIP constructed and analysed within an existing Mathematical Programming modelling system is illustrated. Finally, the steps to formulate a discrete optimisation model in this way are demonstrated by means of an example.

Brunel University Research Archive

Machine learning : techniques and foundations

Authors: Carbonell Jaime G., Langley Pat
Publication venue: eScholarship, University of California
Publication date: 30/03/1987

The field of machine learning studies computational methods for acquiring new knowledge, new skills, and new ways to organize existing knowledge. In this paper we present some of the basic techniques and principles that underlie AI research on learning, including methods for learning from examples, learning in problem solving, learning by analogy, grammar acquisition, and machine discovery. In each case, we illustrate the techniques with paradigmatic examples

eScholarship - University of California

Transformation of propositional calculus statements into integer and mixed integer programs: An approach towards automatic reformulation

Author: Hadjiconstantinou E
Publication venue: Brunel University
Publication date: 01/01/1990

A systematic procedure for transforming a set of logical statements or logical conditions imposed on a model into an Integer Linear Programming (ILP) formulation or a Mixed Integer Programming (MIP) formulation is presented. An ILP, stated as a system of linear constraints involving integer variables and an objective function, provides a powerful representation of decision problems through a tightly interrelated closed system of choices. It supports direct representation of logical (Boolean or propositional calculus) expressions. Binary variables (hereafter called logical variables) are first introduced and methods of logically connecting these to other variables are then presented. Simple constraints can be combined to construct logical relationships and the methods of formulating these are discussed. A reformulation procedure which uses the extended reverse Polish representation of a compound logical form is then described. These reformulation procedures are illustrated by two examples. A scheme of implementation within an LP modelling system is outlined.
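
The idea of connecting logical variables through linear constraints can be illustrated with the standard textbook linearizations of AND, OR, and implication over 0-1 variables. The sketch below checks them by enumerating all binary assignments. It is not the reverse-Polish reformulation procedure described in the abstract, and the function names are assumptions made for illustration.

```python
from itertools import product


def and_constraints(x: int, y: int, z: int) -> bool:
    """Linear constraints forcing z = x AND y over 0-1 variables."""
    return z <= x and z <= y and z >= x + y - 1


def or_constraints(x: int, y: int, z: int) -> bool:
    """Linear constraints forcing z = x OR y over 0-1 variables."""
    return z >= x and z >= y and z <= x + y


def implies_constraint(x: int, y: int) -> bool:
    """The single linear constraint x <= y encodes x -> y."""
    return x <= y


def feasible_z(constraints, x: int, y: int) -> list:
    """Values of z in {0, 1} that satisfy the constraints for fixed x, y."""
    return [z for z in (0, 1) if constraints(x, y, z)]


pairs = list(product((0, 1), repeat=2))
and_table = {(x, y): feasible_z(and_constraints, x, y) for x, y in pairs}
or_table = {(x, y): feasible_z(or_constraints, x, y) for x, y in pairs}
implies_table = {(x, y): implies_constraint(x, y) for x, y in pairs}

for x, y in pairs:
    print(x, y, and_table[(x, y)], or_table[(x, y)], implies_table[(x, y)])
```

For every assignment of x and y, exactly one value of z remains feasible, and it equals the truth value of the connective. This is why such constraints can stand in for the logical condition inside an ILP or MIP model.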

Brunel University Research Archive