
    Conditional Gradient Methods

    The purpose of this survey is to serve both as a gentle introduction to and a coherent overview of state-of-the-art Frank--Wolfe algorithms, also called conditional gradient algorithms, for function minimization. These algorithms are especially useful in convex optimization when linear optimization is cheaper than projections. The selection of material has been guided by the principle of highlighting crucial ideas as well as presenting new approaches that we believe might become important in the future, with ample citations even of older works that were imperative in the development of newer methods. Yet our selection is sometimes biased, need not reflect the consensus of the research community, and has certainly missed recent important contributions. After all, the research area of Frank--Wolfe is very active, making it a moving target. We apologize sincerely in advance for any such distortions, and we fully acknowledge: we stand on the shoulders of giants. Comment: 238 pages with many figures. The FrankWolfe.jl Julia package (https://github.com/ZIB-IOL/FrankWolfe.jl) provides state-of-the-art implementations of many Frank--Wolfe methods.
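
    A minimal sketch (not taken from the survey) of the basic Frank--Wolfe iteration on the probability simplex, where the linear minimization oracle reduces to picking a single vertex; the quadratic objective, the 2/(t+2) step size, and the dimensions are illustrative assumptions.

```python
# Hedged sketch of the basic Frank-Wolfe (conditional gradient) iteration on the
# probability simplex, where the linear minimization oracle is just an argmin over
# coordinates -- far cheaper than a projection. The objective, the 2/(t+2) step
# size, and the dimensions are illustrative assumptions.
import numpy as np

def frank_wolfe_simplex(grad, x0, n_iters=100):
    x = x0.copy()
    for t in range(n_iters):
        g = grad(x)
        i = np.argmin(g)                   # LMO over the simplex: best vertex e_i
        v = np.zeros_like(x)
        v[i] = 1.0
        gamma = 2.0 / (t + 2.0)            # classic open-loop step size
        x = (1.0 - gamma) * x + gamma * v  # convex combination stays in the simplex
    return x

# Example: minimize ||x - b||^2 over the simplex
b = np.array([0.1, 0.7, 0.2])
x_star = frank_wolfe_simplex(lambda x: 2.0 * (x - b), np.ones(3) / 3)
```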

    A Novel Frank-Wolfe Algorithm. Analysis and Applications to Large-Scale SVM Training

    Recently, there has been renewed interest in the machine learning community in variants of a sparse greedy approximation procedure for concave optimization known as the Frank-Wolfe (FW) method. In particular, this procedure has been successfully applied to train large-scale instances of non-linear Support Vector Machines (SVMs). Specializing FW to SVM training has yielded efficient algorithms as well as important theoretical results, including convergence analyses of training algorithms and new characterizations of model sparsity. In this paper, we present and analyze a novel variant of the FW method based on a new way to perform away steps, a classic strategy used to accelerate the convergence of the basic FW procedure. Our formulation and analysis focus on a general concave maximization problem on the simplex. However, the specialization of our algorithm to quadratic forms is strongly related to some classic methods in computational geometry, namely the Gilbert and MDM algorithms. On the theoretical side, we demonstrate that the method matches the guarantees, in terms of convergence rate and number of iterations, obtained by using classic away steps. In particular, the method enjoys a linear rate of convergence, a result that has recently been proved for MDM on quadratic forms. On the practical side, we provide experiments on several classification datasets and evaluate the results using statistical tests. Experiments show that our method is faster than the FW method with classic away steps, and works well even in cases in which classic away steps slow down the algorithm. Furthermore, these improvements are obtained without sacrificing the predictive accuracy of the obtained SVM model. Comment: REVISED VERSION (October 2013) -- Title and abstract have been revised. Section 5 was added. Some proofs have been summarized (full-length proofs are available in the previous version).
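
    For orientation, a hedged sketch of the classic away-step strategy that the paper's variant builds on (not the paper's new rule), written for minimization on the unit simplex; the objective, the fixed 2/(t+2) step size (a line search would normally be used), and the dimensions are illustrative assumptions.

```python
# Hedged sketch of the classic away-step strategy on the unit simplex -- NOT the
# paper's new away-step rule. Written for minimization (the paper's concave
# maximization is equivalent up to a sign); the objective and the fixed step size
# are illustrative assumptions.
import numpy as np

def away_step_fw_simplex(grad, x0, n_iters=200):
    x = x0.copy()
    for t in range(n_iters):
        g = grad(x)
        s = np.argmin(g)                          # FW vertex: steepest "toward" direction
        active = np.where(x > 0)[0]
        a = active[np.argmax(g[active])]          # away vertex: worst active coordinate
        d_fw = -x.copy(); d_fw[s] += 1.0          # FW direction  e_s - x
        d_aw = x.copy();  d_aw[a] -= 1.0          # away direction x - e_a
        if g @ d_fw <= g @ d_aw:                  # pick the more promising direction
            d, gamma_max = d_fw, 1.0
        else:
            d = d_aw                              # away step: move mass off vertex a
            gamma_max = x[a] / (1.0 - x[a]) if x[a] < 1.0 else 0.0
        x = x + min(2.0 / (t + 2.0), gamma_max) * d
    return x

# Example: minimize ||x - b||^2 over the simplex
b = np.array([0.1, 0.7, 0.2])
x_star = away_step_fw_simplex(lambda x: 2.0 * (x - b), np.ones(3) / 3)
```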

    Simplex Clustering via sBeta with Applications to Online Adjustment of Black-Box Predictions

    We explore clustering the softmax predictions of deep neural networks and introduce a novel probabilistic clustering method, referred to as k-sBetas. In the general context of clustering discrete distributions, existing methods have focused on exploring distortion measures tailored to simplex data, such as the KL divergence, as alternatives to the standard Euclidean distance. We provide a general maximum a posteriori (MAP) perspective on clustering distributions, which emphasizes that the statistical models underlying the existing distortion-based methods may not be descriptive enough. Instead, we optimize a mixed-variable objective measuring the conformity of the data within each cluster to the introduced sBeta density function, whose parameters are constrained and estimated jointly with binary assignment variables. Our versatile formulation approximates a variety of parametric densities for modeling simplex data and enables control over the cluster-balance bias. This yields highly competitive performance for unsupervised adjustment of black-box model predictions in a variety of scenarios. Our code and comparisons with existing simplex-clustering approaches, along with our introduced softmax-prediction benchmarks, are publicly available: https://github.com/fchiaroni/Clustering_Softmax_Predictions
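
    For context, a hedged sketch of the distortion-based baseline the abstract contrasts with -- a k-means-style clustering of softmax vectors under the KL divergence -- not the paper's k-sBetas objective; the number of clusters, the smoothing epsilon, and the iteration count are illustrative assumptions.

```python
# Hedged sketch of a distortion-based baseline: k-means-style clustering of softmax
# (simplex) vectors under the KL divergence instead of the Euclidean distance.
# This is NOT the paper's k-sBetas objective; K, eps, and n_iters are assumptions.
import numpy as np

def kl_kmeans_simplex(P, K=3, n_iters=50, eps=1e-12, seed=0):
    """P: (n, d) array of probability vectors (rows sum to 1)."""
    rng = np.random.default_rng(seed)
    centers = P[rng.choice(len(P), K, replace=False)]
    for _ in range(n_iters):
        # KL(p || c) for every point/center pair
        logratio = np.log(P[:, None, :] + eps) - np.log(centers[None, :, :] + eps)
        D = np.sum(P[:, None, :] * logratio, axis=2)          # (n, K) distortions
        labels = D.argmin(axis=1)
        # Update: the KL barycenter of a cluster is its arithmetic mean (still on the simplex)
        for k in range(K):
            if np.any(labels == k):
                centers[k] = P[labels == k].mean(axis=0)
    return labels, centers

# Example: cluster softmax outputs of a hypothetical 4-class classifier
rng = np.random.default_rng(0)
logits = rng.normal(size=(100, 4))
P = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
labels, centers = kl_kmeans_simplex(P, K=3)
```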

    Learning vector quantization for proximity data

    Hofmann D. Learning vector quantization for proximity data. Bielefeld: Universität Bielefeld; 2016. Prototype-based classifiers such as learning vector quantization (LVQ) often display intuitive and flexible classification and learning rules. However, classical techniques are restricted to vectorial data only, and hence are not suited for more complex data structures. Therefore, a few extensions of diverse LVQ variants to more general data, characterized by pairwise similarities or dissimilarities only, have recently been proposed in the literature. In this contribution, we propose a novel extension of LVQ to similarity data which is based on the kernelization of an underlying probabilistic model: kernel robust soft LVQ (KRSLVQ). Relying on the notion of a pseudo-Euclidean embedding of proximity data, we put this specific approach as well as existing alternatives into a general framework which characterizes the fundamental possibilities for extending LVQ towards proximity data: the main characteristics are given by the choice of the cost function, the interface to the data in terms of similarities or dissimilarities, and the way in which optimization takes place. In particular, the latter strategy highlights the difference between popular kernel approaches and so-called relational approaches. While KRSLVQ and its alternatives lead to state-of-the-art results, these extensions have two drawbacks compared to their vectorial counterparts: (i) a quadratic training complexity is encountered due to the dependency of the methods on the full proximity matrix; (ii) prototypes are no longer given by vectors but are represented in terms of an implicit linear combination of data, i.e., interpretability of the prototypes is lost. We investigate different techniques to deal with these challenges: We consider a speed-up of training by means of low-rank approximations of the Gram matrix via its Nyström approximation. In benchmarks, this strategy is successful if the considered data are intrinsically low-dimensional. We propose a quick check to efficiently test this property prior to training. We extend KRSLVQ by sparse approximations of the prototypes: instead of the full coefficient vectors, a few exemplars which represent the prototypes can be directly inspected by practitioners in the same way as data. We compare different paradigms on which to base a sparse approximation: sparsity priors while training, geometric approaches including orthogonal matching pursuit and core techniques, and heuristic approximations based on the coefficients or proximities. We demonstrate the performance of these LVQ techniques on benchmark data, reaching state-of-the-art results. We discuss the behavior of the methods with respect to performance and interpretability in terms of quality, sparsity, and representativity, and we propose different measures for quantitatively evaluating the performance of the approaches. Our findings have been presented in international publications, including three journal articles [6, 9, 2], four conference papers [8, 5, 7, 1], and two workshop contributions [4, 3]. References:
    [1] A. Gisbrecht, D. Hofmann, and B. Hammer. Discriminative dimensionality reduction mappings. Advances in Intelligent Data Analysis, 7619: 126–138, 2012.
    [2] B. Hammer, D. Hofmann, F.-M. Schleif, and X. Zhu. Learning vector quantization for (dis-)similarities. Neurocomputing, 131: 43–51, 2014.
    [3] D. Hofmann. Sparse approximations for kernel robust soft LVQ. Mittweida Workshop on Computational Intelligence, 2013.
    [4] D. Hofmann, A. Gisbrecht, and B. Hammer. Discriminative probabilistic prototype based models in kernel space. New Challenges in Neural Computation, TR Machine Learning Reports, 2012.
    [5] D. Hofmann, A. Gisbrecht, and B. Hammer. Efficient approximations of kernel robust soft LVQ. Workshop on Self-Organizing Maps, 198: 183–192, 2012.
    [6] D. Hofmann, A. Gisbrecht, and B. Hammer. Efficient approximations of robust soft learning vector quantization for non-vectorial data. Neurocomputing, 147: 96–106, 2015.
    [7] D. Hofmann and B. Hammer. Kernel robust soft learning vector quantization. Artificial Neural Networks in Pattern Recognition, 7477: 14–23, 2012.
    [8] D. Hofmann and B. Hammer. Sparse approximations for kernel learning vector quantization. European Symposium on Artificial Neural Networks, 549–554, 2013.
    [9] D. Hofmann, F.-M. Schleif, B. Paaßen, and B. Hammer. Learning interpretable kernelized prototype-based models. Neurocomputing, 141: 84–96, 2014.
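
    A minimal sketch of the Nyström low-rank approximation of the Gram matrix mentioned above as a training speed-up; the uniform landmark sampling, the rank m, the RBF kernel, and its bandwidth are illustrative assumptions rather than the thesis's exact setup.

```python
# Hedged sketch of the Nyström low-rank approximation of a Gram (kernel) matrix.
# Landmark selection (uniform sampling), the rank m, and the pseudo-inverse used
# for the core block are illustrative assumptions.
import numpy as np

def nystrom(kernel, X, m=50, seed=0):
    """Return factors (C, W_pinv) so that K is approximated by C @ W_pinv @ C.T."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=min(m, len(X)), replace=False)  # landmark points
    C = kernel(X, X[idx])            # (n, m) cross-kernel block
    W = kernel(X[idx], X[idx])       # (m, m) landmark kernel block
    return C, np.linalg.pinv(W)

# Example with an RBF kernel (illustrative bandwidth)
def rbf(A, B, gamma=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

X = np.random.default_rng(1).normal(size=(500, 10))
C, W_pinv = nystrom(rbf, X, m=50)
K_approx = C @ W_pinv @ C.T          # rank-m approximation of the full 500x500 Gram matrix
```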

    End-to-End Multiview Gesture Recognition for Autonomous Car Parking System

    The use of hand gestures can be the most intuitive human-machine interaction medium. Early approaches to hand gesture recognition used device-based methods. These methods rely on mechanical or optical sensors attached to a glove or markers, which hinders natural human-machine communication. Vision-based methods, on the other hand, are not restrictive and allow for more spontaneous communication without the need for an intermediary between human and machine. Therefore, vision-based gesture recognition has been a popular area of research for the past thirty years. Hand gesture recognition finds its application in many areas, particularly the automotive industry, where advanced automotive human-machine interface (HMI) designers use gesture recognition to improve driver and vehicle safety. However, technology advances go beyond active/passive safety and into convenience and comfort. In this context, one of America’s big three automakers has partnered with the Centre of Pattern Analysis and Machine Intelligence (CPAMI) at the University of Waterloo to investigate expanding their product segment through machine learning, providing increased driver convenience and comfort with the particular application of hand gesture recognition for autonomous car parking. In this thesis, we leverage state-of-the-art deep learning and optimization techniques to develop a vision-based multiview dynamic hand gesture recognizer for a self-parking system. We propose a 3DCNN gesture model architecture that we train on a publicly available hand gesture database. We apply transfer learning methods to fine-tune the pre-trained gesture model on custom-made data, which significantly improved the proposed system's performance in a real-world environment. We adapt the architecture of the end-to-end solution to expand the state-of-the-art video classifier from a single-image input (fed by a monocular camera) to a multiview 360-degree feed offered by a six-camera module. Finally, we optimize the proposed solution to work on a resource-limited embedded platform (Nvidia Jetson TX2) used by automakers for vehicle-based features, without sacrificing the accuracy, robustness, and real-time functionality of the system.
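
    A hedged sketch of the transfer-learning step described above: freezing a pretrained 3D CNN video backbone and fine-tuning only a new classification head on a small gesture dataset. The torchvision r3d_18 backbone, the eight gesture classes, and the freezing scheme are illustrative assumptions, not the thesis's actual model.

```python
# Hedged sketch: fine-tune only the classification head of a pretrained 3D CNN video
# classifier on a small custom gesture dataset. Backbone choice, class count, and
# freezing scheme are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision.models.video import r3d_18, R3D_18_Weights

num_gesture_classes = 8                              # assumption: size of the custom gesture set

model = r3d_18(weights=R3D_18_Weights.DEFAULT)       # 3D ResNet-18 pretrained on Kinetics-400
for p in model.parameters():                         # freeze the pretrained backbone
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, num_gesture_classes)  # new trainable head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a dummy clip batch: (batch, channels, frames, H, W)
clips = torch.randn(2, 3, 16, 112, 112)
labels = torch.randint(0, num_gesture_classes, (2,))
loss = criterion(model(clips), labels)
loss.backward()
optimizer.step()
```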

    User-Centric Active Learning for Outlier Detection

    Outlier detection searches for unusual, rare observations in large, often high-dimensional data sets. One of the fundamental challenges of outlier detection is that "unusual" typically depends on the perception of a user, the recipient of the detection result. This makes it difficult to find a formal definition of "unusual" that matches user expectations. One way to deal with this issue is active learning, i.e., methods that ask users to provide auxiliary information, such as class label annotations, in order to return algorithmic results that are more in line with the user input. Active learning is well-suited for outlier detection, and many respective methods have been proposed in recent years. However, existing methods build upon strong assumptions. One example is the assumption that users can always provide accurate feedback, regardless of how algorithmic results are presented to them -- an assumption which is unlikely to hold when data is high-dimensional. It is an open question to what extent existing assumptions stand in the way of realizing active learning in practice. In this thesis, we study this question from different perspectives with a differentiated, user-centric view on active learning. In the beginning, we structure and unify the research area of active learning for outlier detection. Specifically, we present a rigorous specification of the learning setup, structure the basic building blocks, and propose novel evaluation standards. Throughout our work, this structure has turned out to be essential for selecting a suitable active learning method and for assessing novel contributions in this field. We then present two algorithmic contributions to make active learning for outlier detection user-centric. First, we bring together two research areas that have so far been looked at independently: outlier detection in subspaces and active learning. Subspace outlier detection methods improve outlier detection quality in high-dimensional data and make detection results easier to interpret. Our approach combines them with active learning such that one can balance detection quality against annotation effort. Second, we address one of the fundamental difficulties in adapting active learning to specific applications: selecting good hyperparameter values. Existing methods to estimate hyperparameter values are heuristics, and it is unclear in which settings they work well. In this thesis, we therefore propose the first principled method to estimate hyperparameter values. Our approach relies on active learning to estimate hyperparameter values and returns a quality estimate of the values selected. In the last part of the thesis, we look at validating active learning for outlier detection in practice. There, we have identified several technical and conceptual challenges which we have experienced firsthand in our research. We structure and document them, and finally derive a roadmap towards validating active learning for outlier detection with user studies.
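
    A minimal sketch of a generic active-learning loop for outlier detection of the kind discussed above (not a specific method from the thesis): an unsupervised detector scores all points, the user labels the most ambiguous ones, and the decision threshold is adjusted accordingly. The Isolation Forest detector, the query budget, and the threshold update rule are illustrative assumptions.

```python
# Hedged sketch of a generic active-learning loop for outlier detection: score points
# with an unsupervised detector, ask the user about the point closest to the current
# threshold, and nudge the threshold to respect the feedback. Detector, budget, and
# update rule are illustrative assumptions.
import numpy as np
from sklearn.ensemble import IsolationForest

def active_outlier_loop(X, ask_user, budget=10):
    scores = -IsolationForest(random_state=0).fit(X).score_samples(X)  # higher = more outlying
    threshold = np.quantile(scores, 0.95)        # initial guess: top 5% are outliers
    unlabeled = set(range(len(X)))
    for _ in range(budget):
        # query the most ambiguous unlabeled point: score closest to the threshold
        i = min(unlabeled, key=lambda j: abs(scores[j] - threshold))
        unlabeled.discard(i)
        if ask_user(i):                          # user says point i is an outlier
            threshold = min(threshold, scores[i])
        else:                                    # user says point i is normal
            threshold = max(threshold, scores[i] + 1e-9)
    return scores >= threshold                   # final outlier predictions

# Example with a simulated user who flags points far from the origin as outliers
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
preds = active_outlier_loop(X, ask_user=lambda i: bool(np.linalg.norm(X[i]) > 3.0))
```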

    The ALICE TPC, a large 3-dimensional tracking device with fast readout for ultra-high multiplicity events

    The design, construction, and commissioning of the ALICE Time-Projection Chamber (TPC) are described. It is the main device for pattern recognition, tracking, and identification of charged particles in the ALICE experiment at the CERN LHC. The TPC is cylindrical in shape with a volume close to 90 m^3 and is operated in a 0.5 T solenoidal magnetic field parallel to its axis. In this paper we describe in detail the design considerations for this detector for operation in the extreme multiplicity environment of central Pb--Pb collisions at LHC energy. The implementation of the resulting requirements into hardware (field cage, read-out chambers, electronics), infrastructure (gas and cooling system, laser-calibration system), and software led to many technical innovations, which are described along with a presentation of all the major components of the detector as currently realized. We also report on the performance achieved after completion of the first round of stand-alone calibration runs and demonstrate results close to those specified in the TPC Technical Design Report. Comment: 55 pages, 82 figures.

    Determining the Characteristics that Predict Science Achievement with the Classification and Regression Tree (CART) Method: The Case of TIMSS 2015 Turkey

    The aim of the current study was to determine the student, teacher, and school characteristics that predict the science achievement of eighth-grade students in Turkey. The study used the TIMSS 2015 data, and the study group comprised a total of 6079 students and 220 teachers from 218 different schools. The data collection tools were the eighth-grade science achievement test used in TIMSS 2015 and the scales administered to students and teachers reflecting student, teacher, and school characteristics. Since the data have a multi-level structure in which students are nested within schools, the model was analyzed using the RE-EM algorithm, which enables multi-level data structures to be analyzed with the classification and regression tree (CART) method. The predicted variable of the model was the students' science achievement scores, and the predictor variables were the seventeen student, teacher, and school characteristics expressed in the scales. According to the results, five of the seventeen predictor variables predicted the students' science achievement: students' confidence in science, student bullying, teaching limited by student needs, school discipline problems, and school emphasis on academic success. Students with greater confidence in science, higher reported exposure to bullying, a stronger school emphasis on academic success, more reported school discipline problems, and whose teachers reported more teaching limited by student needs were observed to be more successful.
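
    A hedged sketch of the RE-EM idea used in the analysis: alternating between fitting a CART regression tree with the current random effects removed and re-estimating a per-school random intercept from the residuals. The simple mean-residual update (instead of a full mixed-model estimate), the tree depth, and the variable names are illustrative assumptions.

```python
# Hedged sketch of an RE-EM-style random-effects regression tree: alternate between
# (1) fitting a CART tree on responses with the current random effects removed and
# (2) re-estimating a per-school random intercept from the residuals. The mean-residual
# update is a simplification of the mixed-model estimate used by RE-EM.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def reem_tree(X, y, school_id, n_iter=20, tol=1e-4, max_depth=4):
    """X: (n, p) predictors, y: (n,) achievement scores, school_id: (n,) integer array."""
    schools = np.unique(school_id)
    b = {s: 0.0 for s in schools}            # random intercept per school
    tree = DecisionTreeRegressor(max_depth=max_depth)
    prev_var = np.inf
    for _ in range(n_iter):
        y_adj = y - np.array([b[s] for s in school_id])   # remove current random effects
        tree.fit(X, y_adj)                                # (1) fixed-effects CART tree
        resid = y - tree.predict(X)
        b = {s: resid[school_id == s].mean() for s in schools}  # (2) update intercepts
        if abs(resid.var() - prev_var) < tol:             # stop when residual variance settles
            break
        prev_var = resid.var()
    return tree, b

# Tiny synthetic example: 200 students nested in 10 schools, 3 predictors
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
school = rng.integers(0, 10, size=200)
y = 20 * X[:, 0] + 2.0 * school + rng.normal(size=200)
tree, effects = reem_tree(X, y, school)
```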

    Energy Efficiency in Buildings: Both New and Rehabilitated

    Buildings are one of the main sources of greenhouse gas emissions in the world. Europe alone is responsible for more than 30% of emissions, or about 900 million tons of CO2 per year. Heating and air conditioning are the main cause of greenhouse gas emissions in buildings. Most buildings currently in use were built with poor energy efficiency criteria or, depending on the country and the date of construction, none at all. Therefore, even as construction regulations become stricter, the real challenge nowadays is the energy rehabilitation of existing buildings. It is currently a priority to reduce (or, ideally, eliminate) the waste of energy in buildings and, at the same time, to supply the necessary energy through renewable sources. The first can be achieved by improving the architectural design, construction methods, and materials used, as well as the efficiency of the facilities and systems; the second can be achieved through the integration of renewable energy (wind, solar, geothermal, etc.) into buildings. In any case, regardless of whether the energy used is renewable, efficiency must always be taken into account. The most profitable and cleanest energy is that which is not consumed.

    Progress and summary of reinforcement learning on energy management of MPS-EV

    The high emissions and low energy efficiency of internal combustion engines (ICEs) have become unacceptable under environmental regulations and the energy crisis. As a promising alternative, multi-power-source electric vehicles (MPS-EVs) introduce different clean energy systems to improve powertrain efficiency. The energy management strategy (EMS) is a critical technology for MPS-EVs to maximize efficiency, fuel economy, and range. Reinforcement learning (RL) has become an effective methodology for developing EMSs. RL has received continuous attention and research, but there is still a lack of systematic analysis of the design elements of RL-based EMSs. To this end, this paper presents an in-depth analysis of current research on RL-based EMSs (RL-EMS) and summarizes their design elements. The paper first summarizes previous applications of RL in EMS from five aspects: algorithm, perception scheme, decision scheme, reward function, and innovative training method. The contribution of advanced algorithms to the training effect is shown, the perception and control schemes in the literature are analyzed in detail, different reward function settings are classified, and innovative training methods and their roles are elaborated. Then, by comparing the development routes of RL and RL-EMS, the paper identifies the gap between advanced RL solutions and existing RL-EMS. Finally, the paper suggests potential development directions for implementing advanced artificial intelligence (AI) solutions in EMS.
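
    A hedged sketch of one design element the paper classifies: a reward function for an RL-based EMS that trades off instantaneous fuel consumption against keeping the battery state of charge near a target. The weights, SoC target, and signal names are illustrative assumptions, not values from the paper.

```python
# Hedged sketch of an EMS reward function: penalize instantaneous fuel use (economy)
# and deviation of battery state of charge from a target (charge sustaining).
# Weights, target, and signal names are illustrative assumptions.
def ems_reward(fuel_rate_gps, soc, soc_target=0.6, w_fuel=1.0, w_soc=50.0):
    """Reward for one control step.

    fuel_rate_gps: instantaneous fuel consumption in g/s (lower is better)
    soc:           current battery state of charge in [0, 1]
    """
    fuel_penalty = w_fuel * fuel_rate_gps           # fuel-economy term
    soc_penalty = w_soc * (soc - soc_target) ** 2   # charge-sustaining term
    return -(fuel_penalty + soc_penalty)

# Example: a step burning 1.2 g/s with SoC at 0.55 gives -(1.2 + 50 * 0.0025) = -1.325
r = ems_reward(1.2, 0.55)
```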