    Parallel coordinate descent for the Adaboost problem

    We design a randomised parallel version of AdaBoost based on previous studies of parallel coordinate descent. The algorithm exploits the fact that the logarithm of the exponential loss has a coordinate-wise Lipschitz continuous gradient in order to define the step lengths. We provide a proof of convergence for this randomised AdaBoost algorithm and a theoretical parallelisation speedup factor. Finally, we provide numerical examples on learning problems of various sizes which show that the algorithm is competitive with existing approaches, especially for large-scale problems. Comment: 7 pages, 3 figures; extended version of the paper presented at ICMLA'1
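
    To make the idea concrete, here is a minimal serial sketch of randomised coordinate descent on the logarithm of the exponential loss, in the spirit of the abstract; it is not the authors' parallel algorithm, and the function names and the unit step length are illustrative assumptions.

```python
# Minimal sketch: randomised coordinate descent on log(exponential loss).
# Assumes weak-learner outputs bounded in [-1, 1]; names are hypothetical.
import numpy as np

def coord_descent_adaboost(H, y, n_iters=1000, rng=None):
    """H: (n_samples, n_weak) weak-learner outputs in [-1, 1];
    y: labels in {-1, +1}. Returns the coefficient vector alpha."""
    rng = np.random.default_rng(rng)
    n, m = H.shape
    alpha = np.zeros(m)
    margins = np.zeros(n)                  # margins m_i = y_i * (H @ alpha)_i
    for _ in range(n_iters):
        j = rng.integers(m)                # sample one coordinate uniformly
        w = np.exp(-margins)
        w /= w.sum()                       # AdaBoost-style sample weights
        grad_j = -np.dot(w, y * H[:, j])   # d/d alpha_j of log sum exp(-m_i)
        # With |h_j| <= 1 the coordinate-wise Lipschitz constant is at most 1,
        # so a unit step length is a safe (assumed) choice here.
        step = -grad_j
        alpha[j] += step
        margins += step * y * H[:, j]
    return alpha
```

    The paper's parallel variant updates several randomly chosen coordinates at once, with step lengths shrunk according to the coordinate-wise Lipschitz analysis.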

    Robust Framework to Combine Diverse Classifiers Assigning Distributed Confidence to Individual Classifiers at Class Level

    We present a classification framework that combines multiple heterogeneous classifiers in the presence of class label noise. An extension of m-Mediods based modelling is presented that generates models of the various classes whilst identifying and filtering noisy training data. This noise-free data is then used to learn models for other classifiers such as GMM and SVM. A weight learning method is introduced that learns per-class weights for the different classifiers to construct an ensemble; for this purpose, we apply a genetic algorithm to search for the weight vector on which the classifier ensemble is expected to give the best accuracy. The proposed approach is evaluated on a variety of real-life datasets and compared with standard ensemble techniques such as AdaBoost, Bagging, and Random Subspace Methods. Experimental results show the superiority of the proposed ensemble method over its competitors, especially in the presence of class label noise and imbalanced classes.
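
    The fusion step described above reduces to scoring each class with classifier-and-class-specific weights. A hedged sketch follows; the weight matrix W, which the paper finds with a genetic algorithm, is simply passed in here, and all names are illustrative.

```python
# Class-level weighted fusion of heterogeneous classifiers (sketch).
import numpy as np

def ensemble_predict(prob_list, W):
    """prob_list: list of (n_samples, n_classes) probability arrays, one per
    classifier. W: (n_classifiers, n_classes) non-negative weights.
    Returns the predicted class index per sample."""
    n_samples, n_classes = prob_list[0].shape
    scores = np.zeros((n_samples, n_classes))
    for k, P in enumerate(prob_list):
        scores += W[k] * P          # scale each class column by its weight
    return scores.argmax(axis=1)
```

    In the paper's setting, the genetic algorithm would score a candidate W by the ensemble's accuracy on validation data and evolve the population toward the best-performing weight vector.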

    A Near Real-Time, Highly Scalable, Parallel and Distributed Adaptive Object Detection and Re-Training Framework Based on the Adaboost Algorithm

    Object detection, such as face detection using supervised learning, often requires extensive training, which results in high execution times. If the trained system needs re-training in order to accommodate a missed detection, waiting several hours or days before the system is ready may be unacceptable in practical implementations. This dissertation presents a generalized object detection framework whereby the system can efficiently adapt to misclassified data and be re-trained within a few minutes. Our methodology is based on the popular AdaBoost algorithm for object detection. AdaBoost works by iteratively selecting the best among weak classifiers and then combining several weak classifiers to obtain a stronger classifier. Even though AdaBoost has proven to be very effective, its learning execution time can be high depending upon the application; in face detection, for example, learning can take several days. We present two techniques that reduce the learning execution time of the AdaBoost algorithm. Our first technique is a highly parallel and distributed AdaBoost algorithm that exploits the multiple cores in a CPU via lightweight threads and uses multiple machines in a web service similar to a map-reduce architecture to achieve high scalability, resulting in a training execution time of a few minutes rather than several days. Our second technique is a methodology for creating an optimal training subset to further reduce the training execution time. We obtain this subset through a novel score-keeping of the weight distribution within the AdaBoost algorithm, removing the images that have a minimal effect on the overall trained classifier. Finally, we incorporate our parallel and distributed AdaBoost algorithm, along with the optimized training subset, into a generalized object detection framework that efficiently adapts and makes corrections when it encounters misclassified data. We demonstrate the usefulness of our adaptive framework with detailed testing on face and car detection and explain how the framework applies to developing any other object detection task.
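
    The per-round parallelism described above amounts to scoring all candidate weak classifiers concurrently and keeping the best. The sketch below uses a thread pool as a stand-in for the dissertation's multi-core / multi-machine setup; the standard AdaBoost update is used, and all names are assumptions.

```python
# AdaBoost with parallel weak-learner selection (sketch).
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def weighted_error(h_out, y, w):
    return np.dot(w, h_out != y)        # weighted 0/1 error of one weak learner

def best_weak_learner(H, y, w, pool):
    """H: (n_weak, n_samples) precomputed weak predictions in {-1, +1}."""
    errs = list(pool.map(lambda h: weighted_error(h, y, w), H))
    j = int(np.argmin(errs))
    return j, errs[j]

def adaboost_parallel(H, y, rounds=50):
    n = y.size
    w = np.full(n, 1.0 / n)             # uniform initial sample weights
    alphas, chosen = [], []
    with ThreadPoolExecutor() as pool:
        for _ in range(rounds):
            j, err = best_weak_learner(H, y, w, pool)
            err = min(max(err, 1e-12), 1 - 1e-12)
            a = 0.5 * np.log((1 - err) / err)
            w *= np.exp(-a * y * H[j])  # standard AdaBoost reweighting
            w /= w.sum()
            alphas.append(a); chosen.append(j)
    return chosen, alphas
```

    The subset-selection idea would plug in here by tracking, across rounds, how little each sample's weight contributes and dropping the least influential images before re-training.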

    Boosting in the PAC Learning Model

    A review of the idea of Boosting in the PAC learning model is presented. A review of the first practical Boosting method, adaptive boosting (AdaBoost), is also provided, giving details of the theoretical guarantees on error convergence and exploring the important concept of margin.
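
    The error-convergence guarantee the review covers rests on the standard AdaBoost update and training-error bound, reproduced below in common textbook notation (this summary is ours, not drawn from the reviewed text).

```latex
% Round t of AdaBoost, with weak hypothesis h_t and weighted error eps_t:
\alpha_t = \tfrac{1}{2}\ln\frac{1-\varepsilon_t}{\varepsilon_t}, \qquad
D_{t+1}(i) = \frac{D_t(i)\,\exp\bigl(-\alpha_t\, y_i\, h_t(x_i)\bigr)}{Z_t},
% which yields the classical bound on the training error of the final
% classifier H(x) = sign(\sum_t alpha_t h_t(x)):
\Pr_i\bigl[H(x_i) \ne y_i\bigr] \;\le\; \prod_t 2\sqrt{\varepsilon_t(1-\varepsilon_t)}.
```

    Whenever each weak learner beats random guessing by a fixed edge, the product on the right decays exponentially in the number of rounds, which is the PAC-style guarantee the review elaborates.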

    On the Intersection of Communication and Machine Learning

    The intersection of communication and machine learning is attracting increasing interest from both communities. On the one hand, the development of modern communication systems brings large amounts of data and high performance requirements, which challenge the classic analytical-derivation-based study philosophy and encourage researchers to explore data-driven methods, such as machine learning, to solve problems of high complexity and large scale. On the other hand, the use of distributed machine learning makes communication cost one of the basic considerations in the design of machine learning algorithms and systems. In this thesis, we first explore the application of machine learning to one of the classic problems in wireless networks, resource allocation, for heterogeneous millimeter wave networks in highly dynamic environments; we address practical concerns by providing an efficient online and distributed framework. In the second part, sampling-based communication-efficient distributed learning algorithms are proposed; we exploit the trade-off between local computation and total communication cost and propose algorithms with good theoretical bounds. In more detail, this thesis makes the following contributions. We introduce a reinforcement learning framework to solve resource allocation problems in heterogeneous millimeter wave networks: the large state/action space is decomposed according to the topology of the network and solved by an efficient distributed message passing algorithm, and inference is further sped up by an online updating process. We propose a distributed coreset-based boosting framework: an efficient coreset construction algorithm is built on the prior knowledge provided by clustering, the coreset is integrated with boosting with an improved convergence rate, and the framework is extended to the distributed setting, where the communication cost is reduced by the coreset's good approximation. Finally, we propose a selective sampling framework to construct a subset of samples that effectively represents the model space; based on a prior distribution over the model space, or on a large number of samples from it, we derive a computationally efficient method to construct such a subset by minimizing the error of classifying a classifier.
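
    As a loose illustration of the clustering-guided coreset idea above: cluster the data, sample points per cluster, and carry importance weights so that the small set approximates the full one. The sampling rule here (uniform within clusters, weight equal to cluster size over samples taken) is an assumption for illustration, not the thesis' construction.

```python
# Clustering-guided weighted coreset (illustrative sketch).
import numpy as np
from sklearn.cluster import KMeans

def clustered_coreset(X, n_clusters=10, per_cluster=20, rng=None):
    rng = np.random.default_rng(rng)
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X)
    idx, wts = [], []
    for c in range(n_clusters):
        members = np.flatnonzero(labels == c)
        take = min(per_cluster, members.size)
        pick = rng.choice(members, size=take, replace=False)
        idx.extend(pick)
        wts.extend([members.size / take] * take)   # unbiased reweighting
    return np.asarray(idx), np.asarray(wts)
```

    In the distributed setting, each worker would build such a weighted subset locally and ship only the subset, which is where the communication saving comes from.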

    Approximation and Relaxation Approaches for Parallel and Distributed Machine Learning

    Large-scale machine learning requires tradeoffs. Commonly, this tradeoff has led practitioners to choose simpler, less powerful models, e.g. linear models, in order to process more training examples in a limited time. In this work, we introduce parallelism to the training of non-linear models by leveraging a different tradeoff: approximation. We demonstrate various techniques by which non-linear models can be made amenable to larger data sets and significantly more training parallelism by strategically introducing approximation into certain optimization steps. For gradient boosted regression tree ensembles, we replace precise selection of tree splits with coarse-grained, approximate split selection, yielding both faster sequential training and a significant increase in parallelism, particularly in the distributed setting. For metric learning with nearest neighbor classification, rather than explicitly training a neighborhood structure, we leverage the implicit neighborhood structure induced by task-specific random forest classifiers, yielding a highly parallel method for metric learning. For support vector machines, we follow existing work to learn a reduced basis set with extremely high parallelism, particularly on GPUs, via existing linear algebra libraries. We believe these optimization tradeoffs are widely applicable wherever machine learning is put into practice in large-scale settings. By carefully introducing approximation, we also introduce significantly higher parallelism and consequently can process more training examples for more iterations than competing exact methods. While seemingly learning the model with less precision, this tradeoff often yields noticeably higher accuracy under a restricted training time budget.
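
    A compact sketch of the coarse-grained split selection mentioned above: bucket each feature into a fixed histogram and choose the best bin boundary instead of scanning every unique value. Squared-error reduction as the split criterion and all names are assumptions, not the dissertation's exact procedure.

```python
# Histogram-based approximate split finding for one feature (sketch).
import numpy as np

def best_approx_split(x, residuals, n_bins=32):
    """x: one feature column; residuals: targets/gradients to fit.
    Returns (threshold, gain) for the best histogram split."""
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    bins = np.searchsorted(edges, x)          # coarse bin index per sample
    cnt = np.bincount(bins, minlength=n_bins).astype(float)
    sm = np.bincount(bins, weights=residuals, minlength=n_bins)
    tot_cnt, tot_sum = cnt.sum(), sm.sum()
    best_gain, best_thr = -np.inf, None
    lc = ls = 0.0
    for b in range(n_bins - 1):               # candidate split after each bin
        lc += cnt[b]; ls += sm[b]
        rc, rs = tot_cnt - lc, tot_sum - ls
        if lc == 0 or rc == 0:
            continue
        gain = ls**2 / lc + rs**2 / rc        # SSE reduction, up to a constant
        if gain > best_gain:
            best_gain, best_thr = gain, edges[b]
    return best_thr, best_gain
```

    Because each worker only needs per-bin counts and sums, histograms from data partitions can be merged cheaply, which is what makes this approximation attractive in the distributed setting.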

    Knowledge management overview of feature selection problem in high-dimensional financial data: Cooperative co-evolution and Map Reduce perspectives

    The term big data characterizes the massive amounts of data generated by advanced technologies in different domains using the 4Vs (volume, velocity, variety, and veracity) to indicate the amount of data that can only be processed via computationally intensive analysis, the speed of its creation, the different types of data, and their accuracy. High-dimensional financial data, such as time-series and space-time data, contain a large number of features (variables) while having a small number of samples, and are used to measure various real-time business situations for financial organizations. Such datasets are normally noisy, complex correlations may exist between their features, and many domains, including finance, lack the analytical tools to mine the data for knowledge discovery because of the high dimensionality. Feature selection is an optimization problem: find a minimal subset of relevant features that maximizes classification accuracy and reduces computation. Traditional statistical feature selection approaches are not adequate to deal with the curse of dimensionality associated with big data. Cooperative co-evolution, a meta-heuristic algorithm taking a divide-and-conquer approach, decomposes high-dimensional problems into smaller sub-problems. Further, MapReduce, a programming model, offers a ready-to-use distributed, scalable, and fault-tolerant infrastructure for parallelizing the developed algorithm. This article presents a knowledge management overview of evolutionary feature selection approaches, state-of-the-art cooperative co-evolution and MapReduce-based feature selection techniques, and future research directions.
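
    To show the divide-and-conquer step concretely: the feature space is split into groups, each group evolves a bitmask over its own features, and a candidate is scored by combining it with the current best masks of the other groups. Fitness via k-NN cross-validation accuracy is an illustrative assumption, and all names are hypothetical.

```python
# Cooperative co-evolution decomposition for feature selection (sketch).
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def decompose(n_features, n_groups):
    """Split feature indices into contiguous sub-problems."""
    return np.array_split(np.arange(n_features), n_groups)

def fitness(candidate, group_id, best_masks, groups, X, y):
    """Score one group's boolean mask together with collaborators' best masks."""
    mask = np.zeros(X.shape[1], dtype=bool)
    for g, feats in enumerate(groups):
        sub = candidate if g == group_id else best_masks[g]
        mask[feats] = sub
    if not mask.any():
        return 0.0
    clf = KNeighborsClassifier(n_neighbors=3)
    return cross_val_score(clf, X[:, mask], y, cv=3).mean()
```

    A full run would evolve one population per group against this fitness and periodically refresh best_masks; MapReduce fits in naturally by farming the independent fitness evaluations out to workers.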

    Automatic Image Annotation Based on Particle Swarm Optimization and Support Vector Clustering

    With the progress of network technology, there are more and more digital images on the internet, but most are not semantically annotated, which makes them difficult to retrieve and use. In this paper, a new algorithm is proposed to automatically annotate images based on particle swarm optimization (PSO) and support vector clustering (SVC). The algorithm has two stages: first, the PSO algorithm is used to optimize the SVC; second, the trained SVC is used to annotate images automatically. In the experiments, three datasets are used to evaluate the algorithm, and the results show its effectiveness.
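
    A bare-bones PSO loop of the kind stage one describes is sketched below: particles search over SVC hyper-parameters (e.g. kernel width and soft-margin constant), scored by a user-supplied clustering quality function. The objective, bounds, and coefficient values are placeholders, not the paper's setup.

```python
# Generic particle swarm optimisation over a box-bounded search space (sketch).
import numpy as np

def pso(objective, bounds, n_particles=20, n_iters=50,
        w=0.7, c1=1.5, c2=1.5, rng=None):
    """objective: callable mapping a parameter vector to a score (maximised).
    bounds: (d, 2) array of [low, high] per dimension."""
    rng = np.random.default_rng(rng)
    bounds = np.asarray(bounds, dtype=float)
    lo, hi = bounds[:, 0], bounds[:, 1]
    d = lo.size
    x = rng.uniform(lo, hi, (n_particles, d))      # positions
    v = np.zeros_like(x)                           # velocities
    pbest = x.copy()
    pbest_val = np.array([objective(p) for p in x])
    g = pbest[pbest_val.argmax()].copy()           # global best
    for _ in range(n_iters):
        r1, r2 = rng.random((2, n_particles, d))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        vals = np.array([objective(p) for p in x])
        improved = vals > pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        g = pbest[pbest_val.argmax()].copy()
    return g, pbest_val.max()
```

    For the paper's stage two, the hyper-parameters returned here would be used to fit the SVC once more on the full data before annotating new images.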