98 research outputs found

    Minimizing the error of linear separators on linearly inseparable data

    Given linearly inseparable sets R of red points and B of blue points, we consider several measures of how far they are from being separable. Intuitively, given a potential separator ("classifier"), we measure its quality ("error") according to how much work it would take to move the misclassified points across the classifier to yield separated sets. We consider several measures of work and provide algorithms to find linear classifiers that minimize the error under these different measures. Ministerio de Educación y Ciencia MTM2008-05866-C03-0
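As a rough illustration of the "work" idea in this abstract, the sketch below evaluates one candidate line against three possible error measures: the number of misclassified points, the sum of their Euclidean distances to the line, and the largest such distance. The abstract does not say which measures the paper uses, so these three, the function name `separator_errors`, and the sign convention (red on the positive side) are assumptions made only for this example.

```python
import math

def separator_errors(red, blue, w, b):
    """Score the line w[0]*x + w[1]*y + b = 0 as a red/blue separator.

    Assumed convention (not from the abstract): red points belong on the
    positive side, blue points on the negative side. Points exactly on
    the line are not counted as misclassified.
    Returns (count, total_distance, max_distance) over misclassified points.
    """
    norm = math.hypot(w[0], w[1])
    if norm == 0:
        raise ValueError("w must be a non-zero normal vector")
    distances = []
    for points, side in ((red, 1.0), (blue, -1.0)):
        for x, y in points:
            # Signed distance, positive when the point is on its correct side.
            signed = side * (w[0] * x + w[1] * y + b) / norm
            if signed < 0:
                # Distance the point would have to travel to reach the line.
                distances.append(-signed)
    return len(distances), sum(distances), max(distances, default=0.0)

red = [(1, 0), (2, 0), (-1, 0)]
blue = [(-2, 0), (-3, 0), (0.5, 0)]
count, total, worst = separator_errors(red, blue, w=(1, 0), b=0)
```

Here the vertical line x = 0 misclassifies one red and one blue point. Minimizing any of these quantities over all lines is the hard part and is not attempted in this sketch.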

    Classification algorithms on the cell processor

    The rapid advancement in the capacity and reliability of data storage technology has allowed for the retention of a virtually limitless quantity and detail of digital information. Massive information databases are becoming more and more widespread among governmental, educational, scientific, and commercial organizations. By segregating this data into carefully defined input (e.g., images) and output (e.g., classification labels) sets, a classification algorithm can be used to develop an internal expert model of the data by employing a specialized training algorithm. A properly trained classifier is capable of predicting the output for future input data from the same input domain that it was trained on. Two popular classifiers are Neural Networks and Support Vector Machines. Both, as with most accurate classifiers, require massive computational resources to carry out the training step and can take months to complete when dealing with extremely large data sets. In most cases, using a larger training set improves the final accuracy of the trained classifier. However, access to the kinds of computational resources required to do so is expensive and out of reach of private or underfunded institutions. The Cell Broadband Engine (CBE), developed by Sony, Toshiba, and IBM, has recently been introduced into the market. Its least expensive current iteration is available in the Sony PlayStation 3® computer entertainment system. The CBE is a novel multi-core architecture which features many hardware enhancements designed to accelerate the processing of massive amounts of data. These characteristics and the cheap and widespread availability of this technology make the Cell a prime candidate for the task of training classifiers. In this work, the feasibility of using the Cell processor to train Neural Networks and Support Vector Machines was explored. In the Neural Network family of classifiers, the fully connected Multilayer Perceptron and Convolution Network were implemented. In the Support Vector Machine family, a Working Set technique known as the Gradient Projection-based Decomposition Technique, as well as the Cascade SVM, were implemented.

    Separating bichromatic point sets in the plane by restricted orientation convex hulls

    The version of record is available online at: http://dx.doi.org/10.1007/s10898-022-01238-9 We explore the separability of point sets in the plane by a restricted-orientation convex hull, which is an orientation-dependent, possibly disconnected, and non-convex enclosing shape that generalizes the convex hull. Let R and B be two disjoint sets of red and blue points in the plane, and O be a set of k ≥ 2 lines passing through the origin. We study the problem of computing the set of orientations of the lines of O for which the O-convex hull of R contains no points of B. For k = 2 orthogonal lines we have the rectilinear convex hull. In optimal O(n log n) time and O(n) space, n = |R| + |B|, we compute the set of rotation angles such that, after simultaneously rotating the lines of O around the origin in the same direction, the rectilinear convex hull of R contains no points of B. We generalize this result to the case where O is formed by k ≥ 2 lines with arbitrary orientations. In the counter-clockwise circular order of the lines of O, let α_i be the angle required to clockwise rotate the i-th line so it coincides with its successor. We solve the problem in this case in O(1/T · N log N) time and O(1/T · N) space, where T = min{α_1, …, α_k} and N = max{k, |R| + |B|}. We finally consider the case in which O is formed by k = 2 lines, one of the lines is fixed, and the second line rotates by an angle that goes from 0 to π. We show that this last case can also be solved in optimal O(n log n) time and O(n) space, where n = |R| + |B|. Carlos Alegría: Research supported by MIUR Proj. "AHeAD" no 20174LF3T8. David Orden: Research supported by Project PID2019-104129GB-I00 / AEI / 10.13039/501100011033 of the Spanish Ministry of Science and Innovation. Carlos Seara: Research supported by Project PID2019-104129GB-I00 / AEI / 10.13039/501100011033 of the Spanish Ministry of Science and Innovation. Jorge Urrutia: Research supported in part by SEP-CONACYT. This project has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska–Curie Grant Agreement No 734922. Peer Reviewed. Postprint (published version)
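To make the rectilinear case (k = 2 orthogonal lines) concrete, here is a brute-force sketch that tests a single orientation. It relies on the standard characterization that a point lies in the rectilinear convex hull of a finite set R exactly when each of the four closed axis-parallel quadrants with apex at that point contains a point of R. This is not the O(n log n) algorithm from the abstract: it checks one angle in O(|R|·|B|) time, and the function names are hypothetical.

```python
import math

def rotate_frame(p, theta):
    """Coordinates of p in axes rotated counter-clockwise by theta."""
    c, s = math.cos(theta), math.sin(theta)
    x, y = p
    return (c * x + s * y, -s * x + c * y)

def in_rectilinear_hull(p, red):
    """True if every closed quadrant with apex p contains a red point."""
    px, py = p
    ne = any(x >= px and y >= py for x, y in red)
    nw = any(x <= px and y >= py for x, y in red)
    sw = any(x <= px and y <= py for x, y in red)
    se = any(x >= px and y <= py for x, y in red)
    return ne and nw and sw and se

def hull_is_blue_free(red, blue, theta=0.0):
    """True if the rectilinear hull of red, for axes at angle theta,
    contains no blue point (brute force, single orientation only)."""
    red_t = [rotate_frame(r, theta) for r in red]
    return not any(in_rectilinear_hull(rotate_frame(b, theta), red_t)
                   for b in blue)

# Four red points on a diamond and one blue point inside their convex hull.
red = [(0, 2), (2, 0), (0, -2), (-2, 0)]
blue = [(0.5, 0.5)]
free_at_0 = hull_is_blue_free(red, blue, theta=0.0)
free_at_45 = hull_is_blue_free(red, blue, theta=math.pi / 4)
```

In this example the blue point is outside the rectilinear hull when the axes are unrotated but inside it when they are rotated by π/4, which shows why the answer to the problem is a set of angles rather than a single yes or no.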

    Efficient piecewise linear classifiers and applications

    Supervised learning has become an essential part of data mining for industry, military, science and academia. Classification, a type of supervised learning, allows a machine to learn from data and then predict certain behaviours, variables or outcomes. Classification can be used to solve many problems, including the detection of malignant cancers, identifying potentially bad creditors and even enabling autonomy in robots. The ability to collect and store large amounts of data has increased significantly over the past few decades. However, the ability of classification techniques to deal with large scale data has not kept pace. Many data transformation and reduction schemes have been tried with mixed success. This problem is further exacerbated when dealing with real time classification in embedded systems. The real time classifier must classify using only limited processing, memory and power resources. Piecewise linear boundaries are known to provide efficient real time classifiers. They have low memory requirements, require little processing effort, are parameterless and classify in real time. Piecewise linear functions are used to approximate non-linear decision boundaries between pattern classes. Finding these piecewise linear boundaries is a difficult optimization problem that can require a long training time. Multiple optimization approaches have been used for real time classification, but can lead to suboptimal piecewise linear boundaries. This thesis develops three real time piecewise linear classifiers that deal with large scale data. Each classifier uses a single optimization algorithm in conjunction with an incremental approach that reduces the number of points as the decision boundaries are built. Two of the classifiers further reduce complexity by augmenting the incremental approach with additional schemes. One scheme uses hyperboxes to identify points inside the so-called "indeterminate" regions. The other uses a polyhedral conic set to identify data points lying on or close to the boundary. All other points are excluded from the process of building the decision boundaries. The three classifiers are applied to real time data classification problems and the results of numerical experiments on real world data sets are reported. These results demonstrate that the new classifiers require a reasonable training time and their test set accuracy is consistently good on most data sets compared with current state of the art classifiers. Doctor of Philosophy
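The claim that piecewise linear boundaries are cheap to evaluate can be illustrated with one common representation, a max-min of affine functions. The abstract does not state which representation the thesis uses, so the max-min form, the function names and the hand-picked weights below are all assumptions for illustration. No training is shown, only classification with an already given boundary.

```python
def max_min_score(x, groups):
    """Evaluate f(x) = max over groups of min over (w, b) of w.x + b.

    `groups` is a list of groups; each group is a list of (w, b) pairs,
    where w is a weight tuple with the same length as x.
    """
    def affine(w, b):
        return sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(min(affine(w, b) for w, b in group) for group in groups)

def classify(x, groups):
    """Return +1 if the piecewise linear score is positive, else -1."""
    return 1 if max_min_score(x, groups) > 0 else -1

# Hypothetical boundary: positive inside either of two corner regions,
# {x0 > 1 and x1 > 1} or {x0 < -1 and x1 < -1}.
groups = [
    [((1, 0), -1), ((0, 1), -1)],
    [((-1, 0), -1), ((0, -1), -1)],
]
labels = [classify(p, groups) for p in [(2, 2), (0, 0), (-2, -2), (2, -2)]]
```

Classifying a point costs one dot product per hyperplane and stores only the weights, which is consistent with the low memory and processing requirements mentioned in the abstract.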