Detecting outlying subspaces for high-dimensional data: the new task, algorithms and performance

Author: Wang Hai
Zhang Ji
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/10/2006

[Abstract]: In this paper, we identify a new task for studying the outlying degree (OD) of high-dimensional data, i.e. finding the subspaces (subsets of features) in which the given points are outliers, which are called their outlying subspaces. Since state-of-the-art outlier detection techniques fail to handle this new problem, we propose a novel detection algorithm, called High-Dimension Outlying subspace Detection (HighDOD), to detect the outlying subspaces of high-dimensional data efficiently. The intuitive idea of HighDOD is to measure the OD of a point as the sum of the distances between this point and its k nearest neighbors. Two heuristic pruning strategies are proposed to realize fast pruning in the subspace search, and an efficient dynamic subspace search method with a sample-based learning process has been implemented. Experimental results show that HighDOD is efficient and outperforms other search alternatives such as the naive top-down, bottom-up and random search methods, and that existing outlier detection methods cannot fulfill this new task effectively.

University of Southern Queensland ePrints
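
The outlying degree described in the abstract above can be sketched in a few lines. This is a minimal illustration, not the HighDOD algorithm itself: it assumes Euclidean distance, and the function names `outlying_degree` and `outlying_subspaces` and the `threshold` parameter are hypothetical. The search shown is the naive exhaustive enumeration of subspaces, without the pruning strategies or sample-based learning the abstract mentions.

```python
import math
from itertools import combinations


def outlying_degree(point, data, subspace, k):
    """Sum of distances from `point` to its k nearest neighbours,
    measured only on the feature indices listed in `subspace`.
    Assumption: Euclidean distance; `point` is an element of `data`."""
    dists = sorted(
        math.sqrt(sum((point[j] - other[j]) ** 2 for j in subspace))
        for other in data
        if other is not point  # skip the query point itself
    )
    return sum(dists[:k])


def outlying_subspaces(point, data, k, threshold):
    """Naive exhaustive search (no pruning): every non-empty subspace
    in which the outlying degree reaches the hypothetical `threshold`."""
    n_features = len(point)
    found = []
    for size in range(1, n_features + 1):
        for subspace in combinations(range(n_features), size):
            if outlying_degree(point, data, subspace, k) >= threshold:
                found.append(subspace)
    return found


data = [(0, 0), (1, 0), (0, 1), (10, 0)]
point = data[3]
print(outlying_degree(point, data, (0,), k=2))   # 19.0
print(outlying_degree(point, data, (1,), k=2))   # 0.0
print(outlying_subspaces(point, data, k=2, threshold=15))
```

In this toy data set the point (10, 0) is far from the others along feature 0 but not along feature 1, so only subspaces containing feature 0 are reported. The exhaustive loop visits all 2^d - 1 subspaces, which is why a pruned search matters in high dimensions.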

Orbitopal Fixing

Author: Marc E. Pfetsch
Matthias Peinhardt
Volker Kaibel
Publication venue: 'Elsevier BV'
Publication date: 01/01/2011

The topic of this paper is integer programming models in which a subset of 0/1-variables encodes a partitioning of a set of objects into disjoint subsets. Such models can be surprisingly hard to solve by branch-and-cut algorithms if the order of the subsets of the partition is irrelevant, since this kind of symmetry unnecessarily blows up the search tree. We present a general tool, called orbitopal fixing, for enhancing the capabilities of branch-and-cut algorithms in solving such symmetric integer programming models. We devise a linear time algorithm that, applied at each node of the search tree, removes redundant parts of the tree produced by the above-mentioned symmetry. The method relies on certain polyhedra, called orbitopes, which have been introduced by Kaibel and Pfetsch (Math. Programm. A, 114 (2008), 1-36). It does not, however, explicitly add inequalities to the model. Instead, it uses certain fixing rules for variables. We demonstrate the computational power of orbitopal fixing on the example of a graph partitioning problem. Comment: 22 pages, revised and extended version of a previous version that has appeared under the same title in Proc. IPCO 200

arXiv.org e-Print Archive

TUbiblio

Elsevier - Publisher Connector

Crossref
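
The symmetry that the abstract above refers to can be made concrete with a small enumeration. This is only an illustration of why subset order inflates the search space, not an implementation of orbitopal fixing; the helper names `canonical` and `count_distinct_partitions` are hypothetical. Each assignment of objects to subset labels is mapped to a canonical representative by relabelling subsets in order of first appearance, so that assignments differing only by a permutation of the subsets collapse to one.

```python
from itertools import product


def canonical(assignment):
    """Relabel subsets in order of first appearance, so that
    assignments equal up to a permutation of subset labels
    map to the same tuple."""
    relabel = {}
    out = []
    for label in assignment:
        if label not in relabel:
            relabel[label] = len(relabel)
        out.append(relabel[label])
    return tuple(out)


def count_distinct_partitions(n_objects, n_subsets):
    """Number of genuinely different partitions among all
    n_subsets ** n_objects labelled assignments."""
    all_assignments = product(range(n_subsets), repeat=n_objects)
    return len({canonical(a) for a in all_assignments})


print(canonical((2, 0, 2, 1)))          # (0, 1, 0, 2)
print(3 ** 4)                           # 81 labelled assignments
print(count_distinct_partitions(4, 3))  # 14 distinct partitions
```

With 4 objects and 3 subsets, 81 labelled assignments describe only 14 different partitions; a branch-and-cut search that ignores the symmetry may explore many equivalent copies of the same solution.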

Using rule extraction to improve the comprehensibility of predictive models.

Author: Baesens Bart
Huysmans Johan
Vanthienen Jan

Whereas newer machine learning techniques, like artificial neural networks and support vector machines, have shown superior performance in various benchmarking studies, the application of these techniques remains largely restricted to research environments. A more widespread adoption of these techniques is hindered by their lack of explanation capability, which is required in some application areas, like medical diagnosis or credit scoring. To overcome this restriction, various algorithms have been proposed to extract a meaningful description of the underlying 'black box' models. These algorithms have a dual goal: to mimic the behavior of the black box as closely as possible while ensuring that the extracted description is maximally comprehensible. In this research report, we first develop a formal definition of rule extraction and comment on the inherent trade-off between accuracy and comprehensibility. Afterwards, we develop a taxonomy by which rule extraction algorithms can be classified and discuss some criteria by which these algorithms can be evaluated. Finally, an in-depth review of the most important algorithms is given. This report is concluded by pointing out some general shortcomings of existing techniques and opportunities for future research. Keywords: Models; Model; Algorithms; Criteria; Opportunities; Research; Learning; Neural networks; Networks; Performance; Benchmarking; Studies; Area; Credit; Credit scoring; Behavior; Time

Research Papers in Economics
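
The goal of mimicking the black box, as described in the abstract above, is commonly quantified as the share of inputs on which the extracted rules and the black box agree. The sketch below assumes that measure (often called fidelity); the function `fidelity`, the toy `black_box` model and the single-condition `rule` are all hypothetical and are not taken from the report.

```python
def fidelity(black_box, rules, inputs):
    """Fraction of inputs on which the extracted rules predict
    the same label as the black-box model."""
    agree = sum(1 for x in inputs if black_box(x) == rules(x))
    return agree / len(inputs)


# Hypothetical black box: a weighted score over two features.
def black_box(x):
    return int(0.6 * x[0] + 0.4 * x[1] > 0.5)


# Hypothetical extracted description: one easily readable rule.
def rule(x):
    return int(x[0] > 0.5)


inputs = [(0.9, 0.1), (0.2, 0.9), (0.6, 0.6), (0.1, 0.1), (0.4, 0.9)]
print(fidelity(black_box, rule, inputs))  # 0.8
```

The single rule is far easier to read than the weighted score but disagrees with it on one of the five inputs, which is the kind of trade-off between faithfulness and comprehensibility the abstract points to.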

Pruning Attributes From Data Cubes with Diamond Dicing

Author: Kaser Owen
Lemire Daniel
Webb Hazel
Publication venue: ACM International Conference Proceeding Series
Publication date: 01/06/2008

Data stored in a data warehouse are inherently multidimensional, but most data-pruning techniques (such as iceberg and top-k queries) are unidimensional. However, analysts need to issue multidimensional queries. For example, an analyst may need to select not just the most profitable stores or, separately, the most profitable products, but simultaneous sets of stores and products fulfilling some profitability constraints. To fill this need, we propose a new operator, the diamond dice. Because of the interaction between dimensions, the computation of diamonds is challenging. We present the first diamond-dicing experiments on large data sets. Experiments show that we can compute diamond cubes over fact tables containing 100 million facts in less than 35 minutes using a standard PC.

R-libre
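
The interaction between dimensions mentioned in the abstract above can be illustrated with a small fixed-point pruning loop. This is a sketch under stated assumptions, not the authors' algorithm: it assumes two dimensions (stores and products), non-negative profits, and constraints expressed as minimum totals per store and per product. The function name `diamond_dice` and its parameters are hypothetical.

```python
def diamond_dice(facts, min_store_total, min_product_total):
    """facts maps (store, product) -> profit (assumed non-negative).
    Repeatedly drop facts whose store or product total falls below
    its threshold until every remaining store and product qualifies."""
    facts = dict(facts)
    while True:
        store_total, product_total = {}, {}
        for (store, prod), profit in facts.items():
            store_total[store] = store_total.get(store, 0) + profit
            product_total[prod] = product_total.get(prod, 0) + profit
        kept = {
            (store, prod): profit
            for (store, prod), profit in facts.items()
            if store_total[store] >= min_store_total
            and product_total[prod] >= min_product_total
        }
        if len(kept) == len(facts):  # nothing removed: fixed point
            return kept
        facts = kept


facts = {
    ("A", "x"): 5, ("A", "y"): 5,
    ("B", "x"): 5, ("B", "y"): 1,
    ("C", "y"): 4,
}
print(diamond_dice(facts, min_store_total=5, min_product_total=10))
# {('A', 'x'): 5, ('B', 'x'): 5}
```

In the example, removing store C lowers the total of product y below its threshold, and removing y in turn lowers the totals of stores A and B. A single pass over each dimension in isolation would miss this cascade, which is why the loop repeats until nothing changes.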

A Novel Method for the Absolute Pose Problem with Pairwise Constraints

Author: Chen Guang
Knoll Alois
Li Xuechen
Liu Yinlong
Song Zhijian
Wang Manning
Publication venue: 'MDPI AG'
Publication date: 28/03/2019

Absolute pose estimation is a fundamental problem in computer vision, and it is a typical parameter estimation problem, meaning that efforts to solve it will always suffer from outlier-contaminated data. Conventionally, for a fixed dimensionality d and a number of measurements N, a robust estimation problem cannot be solved faster than O(N^d). Furthermore, it is almost impossible to remove d from the exponent of the runtime of a globally optimal algorithm. However, absolute pose estimation is a geometric parameter estimation problem, and thus has special constraints. In this paper, we consider pairwise constraints and propose a globally optimal algorithm for solving the absolute pose estimation problem. The proposed algorithm has linear complexity in the number of correspondences at a given outlier ratio. Concretely, we first decouple the rotation and translation subproblems by utilizing the pairwise constraints, and then we solve the rotation subproblem using a branch-and-bound algorithm. Lastly, we estimate the translation based on the known rotation by using another branch-and-bound algorithm. The advantages of our method are demonstrated via thorough testing on both synthetic and real-world data. Comment: 10 pages, 7 figure

arXiv.org e-Print Archive

Multidisciplinary Digital Publishing Institute