    Nonparametric Machine Learning and Efficient Computation with Bayesian Additive Regression Trees: The BART R Package

    In this article, we introduce the BART R package; BART is an acronym for Bayesian additive regression trees. BART is a Bayesian nonparametric, machine-learning, ensemble predictive modeling method for continuous, binary, categorical and time-to-event outcomes. Furthermore, BART is a tree-based, black-box method that fits the outcome to an arbitrary random function, f, of the covariates. The BART technique is relatively computationally efficient compared to its competitors, but large sample sizes can still be demanding. Therefore, the BART package includes efficient, state-of-the-art implementations for continuous, binary, categorical and time-to-event outcomes that can take advantage of modern off-the-shelf multi-threading hardware and software. The BART package is written in C++ for both programmer and execution efficiency. It takes advantage of multi-threading via forking, as provided by the parallel package, and via OpenMP where available and supported by the platform. The ensemble of binary trees produced by a BART fit can be stored and re-used later via the R predict function. In addition to being an R package, the installed BART routines can be called directly from C++. The BART package provides the tools for your BART toolbox.
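The core structural idea in the abstract above is that BART represents the fitted function f as a sum of many small regression trees. The following is a minimal, hand-built Python sketch of that sum-of-trees prediction step only; it is not the BART sampler or the package's actual API, and all names and values are illustrative.

```python
class Stump:
    """A depth-1 regression tree: split one feature at a threshold."""
    def __init__(self, feature, threshold, left_value, right_value):
        self.feature = feature
        self.threshold = threshold
        self.left_value = left_value
        self.right_value = right_value

    def predict(self, x):
        return self.left_value if x[self.feature] <= self.threshold else self.right_value

def ensemble_predict(trees, x):
    """BART-style prediction: f(x) is the sum of the trees' outputs."""
    return sum(t.predict(x) for t in trees)

# Two toy trees; their sum is the ensemble's prediction.
trees = [
    Stump(feature=0, threshold=0.5, left_value=1.0, right_value=3.0),
    Stump(feature=1, threshold=0.0, left_value=-0.5, right_value=0.5),
]

print(ensemble_predict(trees, [0.2, 1.0]))   # 1.0 + 0.5 = 1.5
print(ensemble_predict(trees, [0.9, -1.0]))  # 3.0 - 0.5 = 2.5
```

Because the fit is just a stored collection of trees, serialising the ensemble and re-applying it later (as the package does via the R predict function) is straightforward.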

    Recent advances in the theory and practice of logical analysis of data

    Logical Analysis of Data (LAD) is a data analysis methodology introduced by Peter L. Hammer in 1986. LAD distinguishes itself from other classification and machine learning methods in that it analyzes a significant subset of combinations of variables to describe the positive or negative nature of an observation, and uses combinatorial techniques to extract models defined in terms of patterns. In recent years, the methodology has advanced considerably through numerous theoretical developments and practical applications. In the present paper, we review the methodology and its recent advances, describe novel applications in engineering, finance, and health care, as well as algorithmic techniques for some stochastic optimization problems, and provide a comparative description of LAD with well-known classification methods.
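The "patterns" central to LAD are, for binary data, conjunctions of conditions that cover some positive observations and no negative ones. A minimal sketch of that core notion follows; the data and the dict-based pattern encoding are illustrative, not taken from the paper.

```python
def covers(pattern, observation):
    """pattern: dict {feature_index: required_value}; True if all conditions hold."""
    return all(observation[i] == v for i, v in pattern.items())

def is_positive_pattern(pattern, positives, negatives):
    """A LAD-style positive pattern covers at least one positive
    observation and no negative observation."""
    covers_some_positive = any(covers(pattern, p) for p in positives)
    covers_no_negative = not any(covers(pattern, n) for n in negatives)
    return covers_some_positive and covers_no_negative

positives = [(1, 0, 1), (1, 1, 1)]
negatives = [(0, 0, 1), (0, 1, 0)]

# "x0 = 1" separates this toy data; "x2 = 1" does not.
print(is_positive_pattern({0: 1}, positives, negatives))  # True
print(is_positive_pattern({2: 1}, positives, negatives))  # False: covers (0, 0, 1)
```

A LAD model then combines many such patterns into a classification rule; the combinatorial work lies in enumerating good patterns, which this sketch does not attempt.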

    Minimum hub cover problem: solution methods and applications

    The minimum hub cover is a new NP-hard optimization problem recently introduced to the literature in the context of graph query processing on graph databases. The problem arose from a new graph representation model designed to expedite graph queries: with this representation, a graph database can be represented by a small subset of graph vertices, and searching over only that subset decreases the response time of a query and increases the efficiency of graph query processing. We formalize the task of finding such a subgraph with the minimum number of vertices as an optimization problem, referred to as the minimum hub cover problem. We demonstrate that searching a query over the vertices in a minimum hub cover increases the efficiency of query processing and surpasses existing search methods. We then introduce several mathematical programming models: two binary integer programming formulations as well as a novel quadratic integer programming formulation. We use the linear programming relaxations of the binary integer programming models, while our relaxation of the quadratic integer programming model leads to a semidefinite programming formulation. We also present several rounding heuristics to obtain integral solutions after solving the proposed relaxations. Focusing on planar graphs, which have many applications in planar graph query processing, we devise fast heuristics with good solution quality for minimum hub cover, and we study an approximation algorithm with a performance guarantee for the minimum hub cover problem on planar graphs. We conduct several numerical studies to analyze the empirical performance of the solution methods proposed in this thesis.
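To make the object of study concrete: under the commonly used definition, a vertex set H covers an edge (u, v) if u is in H, v is in H, or some vertex in H is adjacent to both u and v. The greedy heuristic below is a simple illustrative baseline for that covering condition; it is not one of the formulations or heuristics proposed in the thesis, and the graph is a toy example.

```python
def covered_by(w, edge, adj):
    """H-vertex w covers edge (u, v) if w is an endpoint or adjacent to both."""
    u, v = edge
    return w in (u, v) or (u in adj[w] and v in adj[w])

def greedy_hub_cover(vertices, edges, adj):
    """Repeatedly add the vertex covering the most uncovered edges."""
    uncovered = set(edges)
    hub = set()
    while uncovered:
        best = max(vertices,
                   key=lambda w: sum(covered_by(w, e, adj) for e in uncovered))
        hub.add(best)
        uncovered = {e for e in uncovered if not covered_by(best, e, adj)}
    return hub

# A 4-cycle with one chord: vertex 0 is adjacent to every other vertex.
vertices = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1, 3}, 3: {0, 2}}

print(greedy_hub_cover(vertices, edges, adj))  # {0}: each edge touches 0 or joins two neighbours of 0
```

The thesis's LP/SDP relaxations plus rounding aim at the same feasibility condition but with optimality guarantees this greedy baseline lacks.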

    LWA 2013: Lernen, Wissen & Adaptivität; Workshop Proceedings, Bamberg, 7-9 October 2013

    LWA Workshop Proceedings: LWA stands for "Lernen, Wissen, Adaption" (Learning, Knowledge, Adaptation). It is the joint forum of four special interest groups of the German Computer Science Society (GI). Following the tradition of previous years, LWA provides a joint forum for experienced and young researchers, offering insights into recent trends, technologies and applications, and promoting interaction among the SIGs.

    High-performance evolutionary computation for scalable spatial optimization

    Spatial optimization (SO) is an important and prolific field of interdisciplinary research. Spatial optimization methods seek an optimal allocation or arrangement of spatial units under spatial constraints such as distance, adjacency, contiguity, and partition. As spatial granularity becomes finer and problem formulations incorporate increasingly complex compositions of spatial information, the performance of spatial optimization solvers becomes more imperative. My research focuses on scalable spatial optimization methods within the evolutionary algorithm (EA) framework. The computational scalability challenge in EA is addressed by developing a parallel EA library that eliminates costly global synchronization in massively parallel computing environments and scales to 131,072 processors. Classic EA operators are based on linear recombination and experience serious problems in traversing decision spaces with non-linear spatial configurations. I propose a spatially explicit EA framework that couples graph representations of spatial constraints with intelligently guided search heuristics, such as path relinking and ejection chains, to explore the SO decision space effectively. As a result, novel spatial recombination operators are developed that handle strong spatial constraints effectively and are generic enough to incorporate problem-specific spatial characteristics. This framework is employed to solve large political redistricting problems. Voting-district-level redistricting problems are solved and sampled to create billions of feasible districting plans that adhere to Supreme Court mandates, suitable for statistical analyses of redistricting phenomena such as gerrymandering.
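One of the spatial constraints named above, contiguity, has a simple graph formulation: the units assigned to a district must induce a connected subgraph of the adjacency graph. A minimal feasibility check via breadth-first search is sketched below; the grid graph and assignments are illustrative, not from the thesis.

```python
from collections import deque

def district_is_contiguous(units, adj):
    """units: set of spatial units in one district; adj: unit -> neighbour set."""
    if not units:
        return True
    start = next(iter(units))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v in units and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == units  # connected iff BFS inside the district reaches every unit

# A 2x2 grid of units: A-B on top, C-D below (A~B, A~C, B~D, C~D).
adj = {"A": {"B", "C"}, "B": {"A", "D"}, "C": {"A", "D"}, "D": {"B", "C"}}

print(district_is_contiguous({"A", "B"}, adj))  # True: A and B are adjacent
print(district_is_contiguous({"A", "D"}, adj))  # False: A and D connect only via B or C
```

Spatially explicit recombination operators must preserve exactly this kind of property, which is why linear crossover on unit-to-district assignment vectors tends to produce infeasible offspring.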

    Organisational Design & Mirroring in Construction

    The mirroring hypothesis posits an intrinsic connection between the architecture of a product and that of the organisation which produces it, a connection that can influence operational efficiency. The hypothesis is applicable to construction, wherein organisational design is concerned with establishing governance frameworks for the procurement of projects, and product design is that of buildings and engineering structures. This thesis investigates the hypothesis that design data architecture mirrors component architecture in a construction project. A general procedure has emerged for investigating the mirroring hypothesis, consisting of three steps: capturing the product architecture, capturing the organisational architecture, and comparing the two. The subject project is a completed building. Each architecture is captured by modelling functional dependency between components in the form of a node-link network structure. It was found that the subject project did not exhibit a high degree of mirroring, visible or otherwise, hence the hypothesis is concluded to be false in this case. One explanation is that two architectures within one have been identified in the model. This makes sense because design data is structured into packages associated with design disciplines, which are associated with sub-systems, which in turn correspond to the design team structure; the components model, on the other hand, was prepared principally on the basis of physical connectivity. The result implies for organisational design in construction that the design management role should either be carried out by the architect, for mirroring alignment, or, to mitigate misalignment, by a third party with a design rather than a construction background.
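Since both architectures are captured as node-link networks, the comparison step reduces to measuring how similar two dependency networks are. One illustrative metric (not necessarily the procedure used in the thesis) is the Jaccard similarity of the two edge sets, after normalising each undirected edge to a canonical order; the component names below are hypothetical.

```python
def normalise(edges):
    """Canonicalise undirected edges so (a, b) and (b, a) compare equal."""
    return {tuple(sorted(e)) for e in edges}

def jaccard(edges_a, edges_b):
    """Shared edges divided by all distinct edges; 1.0 means perfect mirroring."""
    a, b = normalise(edges_a), normalise(edges_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Toy component architecture vs. design-data architecture over the same elements.
component_edges = [("slab", "beam"), ("beam", "column"), ("column", "footing")]
design_edges = [("slab", "beam"), ("beam", "column"), ("slab", "column")]

print(jaccard(component_edges, design_edges))  # 0.5: 2 shared edges out of 4 distinct
```

A low score on such a metric is consistent with the thesis's finding: the design-data network, organised by discipline packages, need not line up with a components network built from physical connectivity.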

    An adaptive hybrid genetic-annealing approach for solving the map problem on belief networks

    Genetic algorithms (GAs) and simulated annealing (SA) are two important search methods that have been used successfully in solving difficult problems such as combinatorial optimization problems. Genetic algorithms are capable of wide exploration of the search space, while simulated annealing is capable of fine-tuning a good solution. Combining both techniques may achieve the benefits of both and improve the quality of the solutions obtained. Several attempts have been made to hybridize GAs and SA. One such attempt was to augment a standard GA with simulated annealing as a genetic operator. SA in that case acted as a directed or intelligent mutation operator, as opposed to the random, undirected mutation operator of GAs. Although this technique showed some advantages over GA used alone, one problem was finding fixed global annealing parameters that work for all solutions and all stages of the search process. Failing to find optimum annealing parameters affects the quality of the solution obtained and may degrade performance. In this research, we try to overcome this weakness by introducing an adaptive hybrid GA-SA algorithm, in which simulated annealing acts as a special case of mutation. The annealing operator used in this technique is adaptive in the sense that the annealing parameters are evolved and optimized according to the requirements of the search process. Adaptation is expected to help guide the search towards optimum solutions with minimum effort of parameter optimization. The algorithm is tested on an important NP-hard problem, the MAP (Maximum a Posteriori) assignment problem on BBNs (Bayesian Belief Networks). The algorithm is also augmented with problem-specific information used to design a new GA crossover operator.
    The results obtained from testing the algorithm on several BBN graphs with large numbers of nodes and different network structures indicate that the adaptive hybrid algorithm improves solution quality over both GA used alone and GA augmented with standard, non-adaptive simulated annealing. Its effect, however, is more profound for problems with large numbers of nodes, which are difficult for GA alone to solve.
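The structure described above, SA acting as a mutation operator whose annealing parameters ride along in the genome and evolve with it, can be sketched on a toy problem. The sketch below uses a OneMax objective rather than MAP inference on belief networks, and every parameter value is illustrative, so it shows the shape of the adaptive hybrid rather than the thesis's algorithm.

```python
import math
import random

random.seed(0)
N = 20  # bitstring length

def fitness(bits):
    return sum(bits)  # OneMax: maximise the number of ones

def sa_mutate(bits, temp, steps=10):
    """SA as a directed mutation: always accept uphill flips,
    accept downhill flips with probability exp(delta / T)."""
    bits = bits[:]
    for _ in range(steps):
        i = random.randrange(N)
        delta = 1 - 2 * bits[i]  # fitness change if bit i is flipped
        if delta > 0 or random.random() < math.exp(delta / max(temp, 1e-9)):
            bits[i] ^= 1
        temp *= 0.95  # cool within the operator
    return bits

def crossover(a, b):
    cut = random.randrange(1, N)
    return a[:cut] + b[cut:]

def evolve(pop_size=20, generations=30):
    # Genome = (bitstring, personal annealing temperature).
    pop = [([random.randint(0, 1) for _ in range(N)], random.uniform(0.1, 2.0))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda ind: fitness(ind[0]), reverse=True)
        survivors = pop[:pop_size // 2]  # elitist selection
        children = []
        while len(survivors) + len(children) < pop_size:
            (pa, ta), (pb, tb) = random.sample(survivors, 2)
            # The annealing parameter is inherited and perturbed: adaptation.
            child_temp = (ta + tb) / 2 * random.uniform(0.9, 1.1)
            children.append((sa_mutate(crossover(pa, pb), child_temp), child_temp))
        pop = survivors + children
    return max(pop, key=lambda ind: fitness(ind[0]))

best_bits, best_temp = evolve()
print(fitness(best_bits))  # at or near the optimum of 20
```

The key point mirrored from the abstract is that no fixed global temperature schedule is chosen in advance: individuals whose inherited temperatures suit the current stage of the search produce fitter offspring and so propagate those parameters.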