738 research outputs found

    A Non-Deterministic Strategy for Searching Optimal Number of Trees Hyperparameter in Random Forest

    In this paper, we present a non-deterministic strategy for searching for the optimal number-of-trees hyperparameter in Random Forest (RF). Hyperparameter tuning in Machine Learning (ML) algorithms is essential: it optimizes the predictability of an ML algorithm and/or improves the utilization of computing resources. However, hyperparameter tuning is a complex and time-consuming optimization task. We set up experiments with the goals of maximizing predictability, minimizing the number of trees, and minimizing execution time. Compared to the deterministic search algorithm, the non-deterministic search algorithm recorded an average accuracy of approximately 98%, an average improvement of 44.64% in the number of trees, a mean improvement ratio of 213.25 in average execution time, and an average improvement of 94% in the number of iterations. Moreover, evaluations using jackknife estimation show stable and reliable results across several experiment runs of the non-deterministic strategy. The non-deterministic approach to hyperparameter search shows significant accuracy and better utilization of computing resources (i.e., CPU time and memory). This approach can be adopted widely in hyperparameter tuning and in conserving computing resources, as in green computing.
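The abstract above does not describe the search procedure itself, so the following is only a minimal sketch of the general contrast it draws: a deterministic ascending scan over the number of trees versus a randomized search with a fixed evaluation budget. The scoring function `toy_score`, the target of 0.95, the candidate range, and the budget of 30 are all hypothetical stand-ins, not values or methods from the paper.

```python
import math
import random


def toy_score(n_trees):
    """Hypothetical stand-in for validation accuracy: rises quickly, then plateaus."""
    return 1.0 - math.exp(-n_trees / 40.0)


def exhaustive_search(candidates, score, target):
    """Deterministic baseline: scan candidates in order until the target is met."""
    evaluations = 0
    for n in candidates:
        evaluations += 1
        if score(n) >= target:
            return n, evaluations
    return None, evaluations


def random_search(candidates, score, target, budget, rng):
    """Non-deterministic sketch: evaluate `budget` random candidates and keep
    the smallest one that meets the target (None if no sample meets it)."""
    best = None
    evaluations = 0
    for n in rng.sample(list(candidates), budget):
        evaluations += 1
        if score(n) >= target and (best is None or n < best):
            best = n
    return best, evaluations


candidates = range(1, 501)  # assumed search range for the number of trees
det_n, det_evals = exhaustive_search(candidates, toy_score, target=0.95)
rnd_n, rnd_evals = random_search(
    candidates, toy_score, target=0.95, budget=30, rng=random.Random(0)
)
print("deterministic:", det_n, "trees after", det_evals, "evaluations")
print("randomized:   ", rnd_n, "trees after", rnd_evals, "evaluations")
```

In this toy setting the randomized search spends far fewer evaluations but may settle on more trees than the deterministic scan finds, which illustrates the kind of trade-off between iterations and tree count that such a comparison has to measure.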

    SHADHO: Massively Scalable Hardware-Aware Distributed Hyperparameter Optimization

    Computer vision is experiencing an AI renaissance, in which machine learning models are expediting important breakthroughs in academic research and commercial applications. Effectively training these models, however, is not trivial due in part to hyperparameters: user-configured values that control a model's ability to learn from data. Existing hyperparameter optimization methods are highly parallel but make no effort to balance the search across heterogeneous hardware or to prioritize searching high-impact spaces. In this paper, we introduce a framework for massively Scalable Hardware-Aware Distributed Hyperparameter Optimization (SHADHO). Our framework calculates the relative complexity of each search space and monitors performance on the learning task over all trials. These metrics are then used as heuristics to assign hyperparameters to distributed workers based on their hardware. We first demonstrate that our framework achieves double the throughput of a standard distributed hyperparameter optimization framework by optimizing SVM for MNIST using 150 distributed workers. We then conduct model search with SHADHO over the course of one week using 74 GPUs across two compute clusters to optimize U-Net for a cell segmentation task, discovering 515 models that achieve a lower validation loss than standard U-Net. Comment: 10 pages, 6 figures.

    A Data Mining Methodology for Vehicle Crashworthiness Design

    This study develops a systematic design methodology based on data mining theory for decision-making in the development of crashworthy vehicles. The new data mining methodology allows the exploration of a large crash simulation dataset to discover the underlying relationships among vehicle crash responses and design variables at multiple levels and to derive design rules based on the whole-vehicle safety requirements to make decisions about component-level and subcomponent-level design. The method can resolve a major issue with existing design approaches related to vehicle crashworthiness: that is, limited abilities to explore information from large datasets, which may hamper decision-making in the design processes. At the component level, two structural design approaches were implemented for detailed component design with the data mining method: namely, a dimension-based approach and a node-based approach to handle structures with regular and irregular shapes, respectively. These two approaches were used to design a thin-walled vehicular structure, the S-shaped beam, against crash loading. A large number of design alternatives were created, and their responses under loading were evaluated by finite element simulations. The design variables and computed responses formed a large design dataset. This dataset was then mined to build a decision tree. Based on the decision tree, the interrelationships among the design parameters were revealed, and design rules were generated to produce a set of good designs. After the data mining, the critical design parameters were identified and the design space was reduced, which can simplify the design process. To partially replace the expensive finite element simulations, a surrogate model was used to model the relationships between design variables and response. Four machine learning algorithms, which can be used for surrogate model development, were compared. 
Based on the results, Gaussian process regression was determined to be the most suitable technique in the present scenario, and an optimization process was developed to tune the algorithm’s hyperparameters, which govern the model structure and training process. To account for engineering uncertainty in the data mining method, a new decision tree for uncertain data was proposed based on the joint probability in uncertain spaces, and it was again applied to the design of the S-beam structure. The findings show that the new decision tree can produce effective decision-making rules for engineering design under uncertainty. To evaluate the new approaches developed in this work, a comprehensive case study was conducted by designing a vehicle system against frontal crash. A publicly available vehicle model was simplified and validated. Using the newly developed approaches, new component designs in this vehicle were generated and integrated back into the vehicle model so their crash behavior could be simulated. Based on the simulation results, one can conclude that the designs produced with the new method can outperform the original design in terms of mass, intrusion, and peak acceleration. Therefore, the performance of the new design methodology has been confirmed. The current study demonstrates that the new data mining method can be used in vehicle crashworthiness design, and it has the potential to be applied to other complex engineering systems with a large amount of design data.

    Agnostic Bayes

    Honor roll of the Faculty of Graduate and Postdoctoral Studies, 2014-2015. Machine learning is the science of learning from examples. Algorithms based on this approach are now ubiquitous. While there has been significant progress, this field presents important challenges. Namely, simply selecting the function that best fits the observed data was shown to have no statistical guarantee on the examples that have not yet been observed. There are a few learning theories that suggest how to address this problem. Among these, we present the Bayesian modeling of machine learning and the PAC-Bayesian approach to machine learning in a unified view to highlight important similarities. The outcome of this analysis suggests that model averaging is one of the key elements to obtain a good generalization performance. Specifically, one should perform predictions based on the outcome of every model instead of simply the one that best fits the observed data. Unfortunately, this approach comes with a high computational cost, and finding good approximations is the subject of active research. In this thesis, we present an innovative approach that can be applied with a low computational cost on a wide range of machine learning setups. In order to achieve this, we apply Bayes’ theory in a different way than what is conventionally done for machine learning. Specifically, instead of searching for the true model at the origin of the observed data, we search for the best model according to a given metric. While the difference seems subtle, in this approach, we do not assume that the true model belongs to the set of explored models. Hence, we say that we are agnostic. An extensive experimental setup shows a significant generalization performance gain when using this model averaging approach during the cross-validation phase. Moreover, this simple algorithm does not add a significant computational cost to the conventional search of hyperparameters. Finally, this probabilistic tool can also be used as a statistical significance test to evaluate the quality of learning algorithms on multiple datasets.

    Automatic Gradient Boosting
