9 research outputs found

    Large dataset complexity reduction for classification: An optimization perspective

    Doctor of Philosophy
    Computational complexity in data mining is usually attributed to algorithms, but it lies largely with the data. Different algorithms may exist to solve the same problem, and the simplest is not always the best. At the same time, data of astronomical proportions has become common, boosted by automation, and the fuller the data, the better the resolution of the concept it projects. Paradoxically, it is computing power that is lacking: a fast algorithm may be runnable on the data, but not the optimal one, and even then modeling is heavily constrained, since it involves the serial application of many algorithms. The only other way to relieve the computational load is to make the data lighter. Any representative subset has to preserve the essence of the data and, ideally, suit any algorithm, and the reduction should minimize the approximation error while trading precision for performance. Data mining is a wide field; we concentrate on classification. In the literature review we present a variety of methods, emphasizing the effort of the past decade. The two major objects of reduction are instances and attributes; the data can also be recast into a more economical format. We address sampling, noise reduction, class domain binarization, feature ranking, feature subset selection, feature extraction, and discretization of continuous features. Achievements are tremendous, but so are the possibilities. We improve an existing technique of data cleansing and suggest a way of data condensing as its extension; we also touch on noise reduction. Instance similarity, excepting the class mix, prompts a technique of feature selection. Additionally, we consider multivariate discretization, enabling a compact data representation without changing the data size. We compare the proposed methods with alternative techniques, which we either introduce, implement, or use where available.
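    As a hedged illustration of the kinds of reduction this abstract surveys (not the thesis's own methods), the sketch below combines uniform instance sampling, a crude variance-based feature ranking, and equal-width discretization in plain NumPy. All function names, the variance criterion, and the synthetic data are invented for the example.

```python
# Illustrative sketch only: generic stand-ins for instance selection,
# feature ranking, and discretization, not the thesis's techniques.
import numpy as np

def sample_instances(X, y, fraction=0.1, seed=0):
    """Keep a uniform random fraction of the instances."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    idx = rng.choice(n, size=max(1, int(fraction * n)), replace=False)
    return X[idx], y[idx]

def rank_features(X, top_k=5):
    """Rank features by variance (a crude relevance proxy); keep top_k."""
    order = np.argsort(X.var(axis=0))[::-1]
    return order[:top_k]

def discretize(X, bins=8):
    """Equal-width discretization of each continuous feature."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    width = np.where(hi > lo, (hi - lo) / bins, 1.0)
    return np.clip(((X - lo) / width).astype(int), 0, bins - 1)

rng = np.random.default_rng(42)
X = rng.normal(size=(10_000, 20))      # synthetic stand-in for a large dataset
y = (X[:, 0] + X[:, 1] > 0).astype(int)

Xs, ys = sample_instances(X, y, fraction=0.05)
keep = rank_features(Xs, top_k=5)
X_reduced = discretize(Xs[:, keep])    # smaller, coarser, cheaper to mine
print(X_reduced.shape)                 # (500, 5)
```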

    Quantum computing and HPC techniques for solving microrheology and dimensionality reduction problems

    Doctoral thesis in the public examination period. Doctorate in Computer Science (RD99/11).

    Hyperbolic smoothing in nonsmooth optimization and applications

    Nonsmooth nonconvex optimization problems arise in many applications, including economics, business, and data mining, where the objective functions are not necessarily differentiable or convex. Many algorithms have been proposed over the past three decades to solve such problems, yet developing efficient algorithms for this class remains a challenging task. The subgradient method is one of the simplest methods developed for these problems; its convergence has been proved only for convex objective functions. It involves no subproblems, neither for finding search directions nor for computing step lengths, which are fixed ahead of time. Bundle methods and their various modifications are among the most efficient methods for solving nonsmooth optimization problems. They solve a quadratic programming subproblem to find search directions, and the size of this subproblem may grow significantly with the number of variables, which makes bundle-type methods unsuitable for large-scale nonsmooth optimization. Their implementation, which requires quadratic programming solvers, is also not as easy as that of subgradient methods. It is therefore beneficial to develop algorithms for nonsmooth nonconvex optimization that are easy to implement and more efficient than subgradient methods.

    In this thesis, we develop two new algorithms for solving nonsmooth nonconvex optimization problems based on the hyperbolic smoothing technique and apply them to the pumping cost minimization problem in water distribution. The first algorithm is designed for solving finite minimax problems. To apply hyperbolic smoothing, we reformulate the objective function of the minimax problem and study the relationship between the original and reformulated problems, as well as the main properties of the hyperbolic smoothing function. Based on these results, an algorithm for solving the finite minimax problem is proposed and implemented in GAMS. We present preliminary results of numerical experiments with well-known nonsmooth optimization test problems, and we compare the proposed algorithm with an algorithm that uses the exponential smoothing function as well as with an algorithm based on a nonlinear programming reformulation of the finite minimax problem.

    The second algorithm demonstrates how smooth optimization methods can be applied to general nonsmooth (nonconvex) optimization problems. We compute subgradients from a neighborhood of the current point and define a system of linear inequalities using these subgradients. Search directions are computed by solving this system, which reduces to minimizing a convex piecewise linear function over the unit ball. The hyperbolic smoothing function is then applied to approximate this minimization problem by a sequence of smooth problems, which are solved by smooth optimization methods. This approach allows one to apply powerful smooth optimization algorithms to nonsmooth problems and extends smoothing techniques to general nonsmooth nonconvex optimization. The convergence of the algorithm based on this approach is studied.
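    A minimal sketch of the smoothing device both algorithms rely on, assuming the standard hyperbolic smoothing of the absolute value: since max(a, b) = (a + b + |a - b|)/2, replacing |u| with sqrt(u^2 + tau^2) yields a smooth over-approximation of a finite max, and driving tau toward zero recovers the original minimax objective. The Python below is an illustration under these assumptions, not the thesis's formulation or its GAMS code; the test functions are invented.

```python
# Hedged sketch: hyperbolic smoothing of a finite minimax problem.
# max(a, b) = (a + b + |a - b|) / 2, with |u| ~ sqrt(u**2 + tau**2).
import numpy as np
from scipy.optimize import minimize

def smooth_max(values, tau):
    """Smooth max of scalars via pairwise hyperbolic smoothing."""
    m = values[0]
    for v in values[1:]:
        m = 0.5 * (m + v + np.sqrt((m - v) ** 2 + tau ** 2))
    return m

# Example minimax problem: minimize max(f1, f2, f3) over x in R^2.
fs = [
    lambda x: (x[0] - 1.0) ** 2 + x[1] ** 2,
    lambda x: x[0] ** 2 + (x[1] - 1.0) ** 2,
    lambda x: x[0] ** 2 + x[1] ** 2 + 0.5,
]

x = np.zeros(2)
for tau in [1.0, 0.1, 0.01, 0.001]:        # drive the smoothing parameter down
    obj = lambda x, t=tau: smooth_max([f(x) for f in fs], t)
    x = minimize(obj, x, method="BFGS").x  # warm-start from previous solution
print(x, max(f(x) for f in fs))
```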
    The proposed algorithm was implemented in Fortran 95. Preliminary results of numerical experiments are reported, and the proposed algorithm is compared with five other nonsmooth optimization algorithms. We also implement the algorithm in GAMS and compare it with GAMS solvers using the results of numerical experiments.
    Doctor of Philosophy
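    The second algorithm's search-direction subproblem described above, minimizing the piecewise linear function d -> max_i <g_i, d> over the unit ball for subgradients g_i collected near the current point, can be smoothed the same way. The sketch below is an assumption-laden illustration, not the thesis's Fortran 95 implementation; it solves the smoothed subproblem with SciPy's SLSQP under the constraint ||d||^2 <= 1, with invented gradients standing in for sampled subgradients.

```python
# Hedged sketch of the search-direction subproblem: minimize the
# hyperbolically smoothed max_i <g_i, d> over the unit ball ||d|| <= 1.
import numpy as np
from scipy.optimize import minimize

def search_direction(G, tau=1e-3):
    """G: (m, n) array of (sub)gradients sampled near the current point."""
    m, n = G.shape

    def smoothed_max(d):
        vals = G @ d                       # the m linear pieces <g_i, d>
        s = vals[0]
        for v in vals[1:]:                 # pairwise hyperbolic smoothing
            s = 0.5 * (s + v + np.sqrt((s - v) ** 2 + tau ** 2))
        return s

    cons = {"type": "ineq", "fun": lambda d: 1.0 - d @ d}  # ||d||^2 <= 1
    res = minimize(smoothed_max, np.zeros(n), method="SLSQP",
                   constraints=[cons])
    return res.x, res.fun

# Two conflicting gradients: a descent direction must oppose both.
G = np.array([[1.0, 0.0], [0.0, 1.0]])
d, val = search_direction(G)
print(d, val)   # d near (-1, -1)/sqrt(2); val < 0 signals descent
```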