
    Oversampling for Imbalanced Learning Based on K-Means and SMOTE

    Learning from class-imbalanced data continues to be a common and challenging problem in supervised learning as standard classification algorithms are designed to handle balanced class distributions. While different strategies exist to tackle this problem, methods which generate artificial data to achieve a balanced class distribution are more versatile than modifications to the classification algorithm. Such techniques, called oversamplers, modify the training data, allowing any classifier to be used with class-imbalanced datasets. Many algorithms have been proposed for this task, but most are complex and tend to generate unnecessary noise. This work presents a simple and effective oversampling method based on k-means clustering and SMOTE oversampling, which avoids the generation of noise and effectively overcomes imbalances between and within classes. Empirical results of extensive experiments with 71 datasets show that training data oversampled with the proposed method improves classification results. Moreover, k-means SMOTE consistently outperforms other popular oversampling methods. An implementation is made available in the Python programming language. Comment: 19 pages, 8 figures
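
To make the idea of combining k-means with SMOTE concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: the function names, the cluster filter (keep clusters where minority points outnumber the rest) and the even split of new samples across clusters are all simplifying assumptions made for illustration.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on tuples; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)

    def assign():
        return [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]

    for _ in range(iters):
        labels = assign()
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # an empty cluster keeps its old center
                centers[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return assign()


def smote_in_cluster(minority, n_new, k_neighbors, rng):
    """SMOTE step: interpolate between a minority point and a near minority neighbour."""
    synthetic = []
    for _ in range(n_new):
        a = rng.choice(minority)
        others = [p for p in minority if p is not a]
        neighbours = sorted(others, key=lambda p: math.dist(a, p))[:k_neighbors]
        b = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from a to b
        synthetic.append(tuple(x + gap * (y - x) for x, y in zip(a, b)))
    return synthetic


def kmeans_smote_sketch(X, y, minority_label, n_new, k=2, k_neighbors=3, seed=0):
    """Cluster all data, keep minority-dominated clusters, oversample inside them."""
    rng = random.Random(seed)
    labels = kmeans(X, k, seed=seed)
    eligible = []
    for c in range(k):
        idx = [i for i, l in enumerate(labels) if l == c]
        mino = [X[i] for i in idx if y[i] == minority_label]
        # Assumed filter: at least 2 minority points and a minority-dominated cluster.
        if len(mino) >= 2 and len(mino) > len(idx) - len(mino):
            eligible.append(mino)
    if not eligible:
        return []
    synthetic = []
    base, extra = divmod(n_new, len(eligible))  # assumed: even split across clusters
    for j, mino in enumerate(eligible):
        synthetic += smote_in_cluster(mino, base + (1 if j < extra else 0), k_neighbors, rng)
    return synthetic


X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10), (10, 11), (11, 10)]
y = [0, 0, 0, 0, 0, 1, 1, 1]
new_points = kmeans_smote_sketch(X, y, minority_label=1, n_new=5)
```

Because every synthetic point is a convex combination of two minority points from the same cluster, none of the new samples can land inside the majority region, which is the noise-avoidance property the abstract refers to.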

    A Faster Parameterized Algorithm for Treedepth

    The width measure \emph{treedepth}, also known as vertex ranking, centered coloring and elimination tree height, is a well-established notion which has recently seen a resurgence of interest. We present an algorithm which, given as input an $n$-vertex graph, a tree decomposition of the graph of width $w$, and an integer $t$, decides Treedepth, i.e. whether the treedepth of the graph is at most $t$, in time $2^{O(wt)} \cdot n$. If necessary, a witness structure for the treedepth can be constructed in the same running time. In conjunction with previous results we provide a simple algorithm and a fast algorithm which decide treedepth in time $2^{2^{O(t)}} \cdot n$ and $2^{O(t^2)} \cdot n$, respectively, which do not require a tree decomposition as part of their input. The former answers an open question posed by Ossona de Mendez and Nesetril as to whether deciding Treedepth admits an algorithm with a linear running time (for every fixed $t$) that does not rely on Courcelle's Theorem or other heavy machinery. For chordal graphs we can prove a running time of $2^{O(t \log t)} \cdot n$ for the same algorithm. Comment: An extended abstract was published in ICALP 2014, Track
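
For readers unfamiliar with the parameter, the following sketch computes treedepth directly from its recursive definition: a single vertex has treedepth 1, a disconnected graph takes the maximum over its components, and a connected graph takes one plus the minimum over all single-vertex deletions. This is a brute-force illustration that is exponential in the number of vertices, not the parameterized algorithm of the paper; the function name and input format are assumptions.

```python
from functools import lru_cache


def treedepth(vertices, edges):
    """Brute-force treedepth via the recursive definition (small graphs only)."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    def components(vs):
        """Connected components of the subgraph induced by the vertex set vs."""
        remaining = set(vs)
        comps = []
        while remaining:
            start = remaining.pop()
            comp, stack = {start}, [start]
            while stack:
                u = stack.pop()
                for w in adj[u] & remaining:
                    remaining.remove(w)
                    comp.add(w)
                    stack.append(w)
            comps.append(frozenset(comp))
        return comps

    @lru_cache(maxsize=None)
    def td(vs):
        if not vs:
            return 0
        comps = components(vs)
        if len(comps) > 1:
            # Disconnected: the deepest component decides.
            return max(td(c) for c in comps)
        # Connected: pick the best root, delete it, recurse on what is left.
        return 1 + min(td(vs - {v}) for v in vs)

    return td(frozenset(vertices))


path7 = treedepth(range(7), [(i, i + 1) for i in range(6)])
k4 = treedepth(range(4), [(i, j) for i in range(4) for j in range(i + 1, 4)])
star = treedepth(range(5), [(0, i) for i in range(1, 5)])
```

On a path with 7 vertices the best root is the middle vertex, which leaves two 3-vertex paths; a clique forces every vertex onto a single root-to-leaf chain; a star is resolved by rooting at its center.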

    Fast Biclustering by Dual Parameterization

    We study two clustering problems, Starforest Editing, the problem of adding and deleting edges to obtain a disjoint union of stars, and the generalization Bicluster Editing. We show that, in addition to being NP-hard, none of the problems can be solved in subexponential time unless the exponential time hypothesis fails. Misra, Panolan, and Saurabh (MFCS 2013) argue that introducing a bound on the number of connected components in the solution should not make the problem easier: In particular, they argue that the subexponential time algorithm for editing to a fixed number of clusters (p-Cluster Editing) by Fomin et al. (J. Comput. Syst. Sci., 80(7) 2014) is an exception rather than the rule. Here, p is a secondary parameter, bounding the number of components in the solution. However, upon bounding the number of stars or bicliques in the solution, we obtain algorithms which run in time $2^{5 \sqrt{pk}} + O(n+m)$ for p-Starforest Editing and $2^{O(p \sqrt{k} \log(pk))} + O(n+m)$ for p-Bicluster Editing. We obtain a similar result for the more general case of t-Partite p-Cluster Editing. This is subexponential in k for a fixed number of clusters, since p is then considered a constant. Our results even out the number of multivariate subexponential time algorithms and give reasons to believe that this area warrants further study. Comment: Accepted for presentation at IPEC 201

    Multivalued robust tracking control of fully actuated Lagrange systems: Continuous and discrete–time algorithms

    In this paper the robust trajectory tracking problem of a class of nonlinear systems described by the Euler-Lagrange equations of motion is studied. We start by considering a plant subject to an unknown external perturbation and to uncertainties in its parameters. A class of passivity-based multivalued control laws is then proposed, and the well-posedness and stability of the closed loop are established in the continuous-time setting. The discrete-time versions of the plant and the controller are then studied, and well-posedness and stability results are obtained using the so-called implicit discretization approach introduced in [1, 2]. Numerical simulations are presented and demonstrate the effectiveness of the proposed discrete-time controller.

    Set-valued sliding-mode control of uncertain linear systems: continuous and discrete-time analysis

    In this paper we study the closed-loop dynamics of linear time-invariant systems with feedback control laws that are described by set-valued maximal monotone maps. The class of systems considered in this work is subject to both unknown exogenous disturbances and parameter uncertainty. It is shown how the design of conventional sliding mode controllers can be achieved using maximal monotone operators (which include the set-valued signum function). Two cases are analyzed: continuous-time and discrete-time controllers. In both cases, well-posedness and stability results are presented. In discrete time we show that the implicit scheme proposed for the selection of control actions is well defined and that it almost suppresses the chattering effect, even with uncertainty in the system.
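
The effect of an implicit selection of the control action can be illustrated on a toy problem. The sketch below assumes a scalar integrator $x_{k+1} = x_k + h(u_k + d_k)$ with a bounded disturbance $|d_k| < \gamma$; this plant, the parameter values and the function names are illustrative assumptions and are not taken from the paper. The explicit controller uses $u_k = -\gamma\,\mathrm{sign}(x_k)$. The implicit controller solves $u_k \in -\gamma\,\mathrm{Sgn}(x_k + h u_k)$ for the nominal (disturbance-free) next state, which for this scalar case reduces to clipping $-x_k/h$ to $[-\gamma, \gamma]$.

```python
import math


def simulate(implicit, x0=1.0, h=0.01, gamma=1.0, steps=400):
    """Simulate the assumed scalar integrator under an explicit or implicit controller."""
    x = x0
    xs, us = [], []
    for k in range(steps):
        if implicit:
            # Solution of u in -gamma * Sgn(x + h*u): saturate -x/h at +/- gamma.
            u = -max(-gamma, min(gamma, x / h))
        else:
            # Explicit Euler: sign evaluated at the current state.
            u = -gamma * ((x > 0) - (x < 0))
        d = 0.5 * math.sin(0.05 * k)  # unknown disturbance, |d| <= 0.5 < gamma
        x = x + h * (u + d)
        xs.append(x)
        us.append(u)
    return xs, us


def total_variation(signal):
    """Sum of absolute step-to-step changes; large values indicate chattering."""
    return sum(abs(b - a) for a, b in zip(signal, signal[1:]))


xs_imp, us_imp = simulate(implicit=True)
xs_exp, us_exp = simulate(implicit=False)
tv_imp = total_variation(us_imp[-100:])
tv_exp = total_variation(us_exp[-100:])
```

In this toy setting, once the implicit controller brings the state within $h\gamma$ of the origin, the next state equals $h\,d_k$ and the input tracks the negative of the previous disturbance sample, so the control signal varies smoothly. The explicit controller keeps switching between $+\gamma$ and $-\gamma$, which shows up as a much larger total variation of the input.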