    On the scatter in the relation between stellar mass and halo mass: random or halo formation time dependent?

    The empirical HOD model of Wang et al. 2006 fits, by construction, both the stellar mass function and correlation function of galaxies in the local Universe. In contrast, the semi-analytical models of De Lucia & Blaizot 2007 (DLB07) and Guo et al. 2011 (Guo11), built on the same dark matter halo merger trees as the empirical model, still have difficulties in reproducing these observational data simultaneously. We compare the relations between the stellar mass of galaxies and their host halo mass in the three models, and find that they are different. When the relations are rescaled to have the same median values and the same scatter as in Wang et al., the rescaled DLB07 model can fit both the measured galaxy stellar mass function and the correlation function measured in different galaxy stellar mass bins. In contrast, the rescaled Guo11 model still over-predicts the clustering of low-mass galaxies. This indicates that the details of how galaxies populate the scatter in the stellar mass -- halo mass relation do play an important role in determining the correlation functions of galaxies. While the stellar mass of galaxies in the Wang et al. model depends only on halo mass and is randomly distributed within the scatter, galaxy stellar mass also depends on the halo formation time in semi-analytical models. At fixed infall mass, galaxies that lie above the median stellar mass -- halo mass relation reside in haloes that formed earlier, while galaxies that lie below the median relation reside in haloes that formed later. This effect is much stronger in Guo11 than in DLB07, which explains the over-clustering of low-mass galaxies in Guo11. Our results illustrate that the assumption of random scatter in the relation between stellar and halo mass as employed by current HOD and abundance matching models may be problematic if significant assembly bias exists in the real Universe. Comment: 10 pages, 6 figures, published in MNRAS
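    The distinction between random and formation-time-dependent scatter can be illustrated with a toy mock catalogue. This is a hedged sketch, not the paper's models: the halo masses, formation redshifts, median relation, and scatter amplitude below are all hypothetical stand-ins.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Toy halo population: log10 halo masses and formation redshifts (hypothetical).
    log_mhalo = rng.uniform(11.0, 13.0, size=1000)
    z_form = rng.uniform(0.5, 3.0, size=1000)

    def median_relation(log_mh):
        # Hypothetical median stellar mass -- halo mass relation (linear stand-in).
        return 0.5 * log_mh + 4.0

    sigma = 0.2  # dex of scatter about the median relation

    # HOD-style model: scatter is purely random, independent of halo history.
    mstar_random = median_relation(log_mhalo) + sigma * rng.standard_normal(1000)

    # SAM-style model: scatter tracks formation time, so earlier-forming
    # (higher z_form) haloes sit above the median relation.
    rank = (z_form - z_form.mean()) / z_form.std()
    mstar_assembly = median_relation(log_mhalo) + sigma * rank

    # Among early-forming haloes, only the assembly-biased model shows a
    # systematic positive offset from the median relation.
    early = z_form > np.median(z_form)
    offset_assembly = np.mean(mstar_assembly[early] - median_relation(log_mhalo[early]))
    offset_random = np.mean(mstar_random[early] - median_relation(log_mhalo[early]))
    print(offset_assembly, offset_random)
    ```

    In the first model the early-forming subsample scatters symmetrically about the median, while in the second it is offset above it; clustering statistics that depend on assembly bias will then differ between the two, as the abstract describes for Guo11 versus the empirical HOD.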

    Computing medians and means in Hadamard spaces

    The geometric median as well as the Frechet mean of points in an Hadamard space are important in both theory and applications. Surprisingly, no algorithms for their computation have hitherto been known. To address this issue, we use a split version of the proximal point algorithm for minimizing a sum of convex functions and prove that this algorithm produces a sequence converging to a minimizer of the objective function, which extends a recent result of D. Bertsekas (2001) to Hadamard spaces. The method is quite robust: not only does it yield algorithms for the median and the mean, but it also applies to various other optimization problems. We moreover show that another algorithm for computing the Frechet mean can be derived from the law of large numbers due to K.-T. Sturm (2002). In applications, computing medians and means is probably most needed in tree space, which is an instance of an Hadamard space, invented by Billera, Holmes, and Vogtmann (2001) as a tool for averaging phylogenetic trees. It turns out, however, that it can also be used to model numerous other tree-like structures. Since there now exists a polynomial-time algorithm for computing geodesics in tree space due to M. Owen and S. Provan (2011), we obtain efficient algorithms for computing medians and means, which can be directly used in practice. Comment: Corrected version. Accepted in SIAM Journal on Optimization
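    The law-of-large-numbers approach to the Frechet mean admits a very short sketch: starting from the first point, walk toward each successive point a fraction 1/(k+1) along the connecting geodesic. The code below is an illustrative toy, not the paper's algorithm; it uses Euclidean space, where geodesics are straight lines and the inductive mean reproduces the arithmetic mean.

    ```python
    import numpy as np

    def frechet_mean_inductive(points, geodesic):
        """Inductive (Sturm-style) mean: move from the current estimate toward
        each new point a fraction 1/(k+1) along the geodesic between them."""
        x = points[0]
        for k, p in enumerate(points[1:], start=1):
            x = geodesic(x, p, 1.0 / (k + 1))
        return x

    # In Euclidean space the geodesic from a to b is the straight segment,
    # so the inductive mean coincides with the arithmetic mean.
    euclid_geodesic = lambda a, b, t: (1 - t) * a + t * b

    pts = [np.array([0.0, 0.0]), np.array([3.0, 0.0]), np.array([0.0, 3.0])]
    print(frechet_mean_inductive(pts, euclid_geodesic))  # prints [1. 1.]
    ```

    In tree space the same loop applies unchanged once `geodesic` is replaced by the Owen--Provan polynomial-time geodesic computation the abstract mentions.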

    Pruning of genetic programming trees using permutation tests

    We present a novel approach based on statistical permutation tests for pruning redundant subtrees from genetic programming (GP) trees that allows us to explore the extent of effective redundancy. We observe that over a range of regression problems, median tree sizes are reduced by around 20%, largely independent of test function, and that while some large subtrees are removed, the median pruned subtree comprises just three nodes; most take the form of an exact algebraic simplification. Our statistically-based pruning technique has allowed us to explore the hypothesis that a given subtree can be replaced with a constant if this substitution results in no statistical change to the behavior of the parent tree, what we term approximate simplification. In the event, we infer that more than 95% of the accepted pruning proposals are the result of algebraic simplifications, which provides some practical insight into the scope of removing redundancies in GP trees.
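    The core statistical move, testing whether replacing a subtree with a constant changes the parent tree's behavior, can be sketched with a paired sign-flip permutation test. This is a minimal illustration under assumed inputs, not the paper's procedure; the subtree and parent outputs below are hypothetical stand-ins.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)

    def permutation_test(a, b, n_perm=2000):
        """Paired permutation test: randomly flip the sign of each paired
        difference and compare against the observed mean difference.
        Returns a p-value for 'a and b behave the same'."""
        diff = a - b
        observed = abs(diff.mean())
        signs = rng.choice([-1.0, 1.0], size=(n_perm, diff.size))
        null = np.abs((signs * diff).mean(axis=1))
        return (null >= observed).mean()

    # Toy scenario: a nearly constant subtree inside a parent expression.
    x = np.linspace(-1, 1, 200)
    subtree_out = 0.001 * x + 5.0            # subtree output on training inputs
    parent_orig = np.sin(x) + subtree_out    # parent tree as evolved
    parent_pruned = np.sin(x) + subtree_out.mean()  # subtree replaced by a constant

    p = permutation_test(parent_orig, parent_pruned)
    print(p)  # large p-value: no detectable change, so accept the pruning
    ```

    A small p-value would instead signal that the subtree contributes real behavior and the pruning proposal should be rejected.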

    Unreliable point facility location problems on networks

    In this paper we study facility location problems on graphs under the most common criteria, such as median, center, and centdian, but we incorporate into the objective function some reliability aspects. Assuming that facilities may become unavailable with a certain probability, the problem consists of locating facilities minimizing the overall or the maximum expected service cost in the long run, or a convex combination of the two. We show that the k-facility problem on general networks is NP-hard. Then, we provide efficient algorithms for these problems for the cases k = 1, 2, both on general networks and on trees. We also explain how our methodology extends to handle a more general class of unreliable point facility location problems related to the ordered median objective function. Ministerio de Ciencia y Tecnología; Junta de Andalucía
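    The expected-service-cost objective for two unreliable facilities can be made concrete with a brute-force sketch on a small graph. This is an illustrative toy, not the paper's efficient algorithm; the graph, failure probability, and the choice of penalty when both facilities are down are all assumptions made here.

    ```python
    from itertools import combinations

    import numpy as np

    # Small weighted graph (hypothetical); all-pairs distances via Floyd-Warshall.
    INF = float("inf")
    w = np.array([
        [0, 2, INF, 1],
        [2, 0, 3, INF],
        [INF, 3, 0, 1],
        [1, INF, 1, 0],
    ])
    n = len(w)
    d = w.copy()
    for k in range(n):
        for i in range(n):
            for j in range(n):
                d[i, j] = min(d[i, j], d[i, k] + d[k, j])

    p_fail = 0.1  # each facility independently unavailable with this probability

    def expected_median_cost(facilities):
        """Expected total service cost with two unreliable facilities: each
        client uses its nearest open facility; if both are down we charge the
        distance to the nearer one as a penalty (a modelling choice here)."""
        total = 0.0
        for v in range(n):
            near, far = sorted(d[v, f] for f in facilities)
            total += ((1 - p_fail) * near
                      + p_fail * (1 - p_fail) * far
                      + p_fail**2 * near)
        return total

    # Exhaustive search over all 2-facility placements (fine for a toy graph).
    best = min(combinations(range(n), 2), key=expected_median_cost)
    print(best, expected_median_cost(best))
    ```

    Note how unreliability changes the objective: the classical 2-median only counts the nearest facility, whereas here the second-nearest distance enters with weight p(1-p), so placements with good backup coverage are favored.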

    Sequential Design for Computer Experiments with a Flexible Bayesian Additive Model

    In computer experiments, a mathematical model implemented on a computer is used to represent complex physical phenomena. These models, known as computer simulators, enable experimental study of a virtual representation of the complex phenomena. Simulators can be thought of as complex functions that take many inputs and provide an output. Often these simulators are themselves expensive to compute, and may be approximated by "surrogate models" such as statistical regression models. In this paper we consider a new kind of surrogate model, a Bayesian ensemble of trees (Chipman et al. 2010), with the specific goal of learning enough about the simulator that a particular feature of the simulator can be estimated. We focus on identifying the simulator's global minimum. Utilizing the Bayesian version of the Expected Improvement criterion (Jones et al. 1998), we show that this ensemble is particularly effective when the simulator is ill-behaved, exhibiting nonstationarity or abrupt changes in the response. A number of illustrations of the approach are given, including a tidal power application. Comment: 21 pages
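    The Expected Improvement criterion of Jones et al. (1998) has a closed form when the surrogate's prediction at a candidate point is Gaussian. The sketch below shows that standard formula for minimization; it is a generic illustration, not the Bayesian-ensemble version developed in the paper, and the numbers in the usage line are made up.

    ```python
    import math

    def expected_improvement(mu, sigma, f_min):
        """Closed-form EI for minimization under a Gaussian predictive
        distribution N(mu, sigma^2), with incumbent best value f_min:
        EI = (f_min - mu) * Phi(z) + sigma * phi(z), z = (f_min - mu) / sigma."""
        if sigma <= 0:
            return max(f_min - mu, 0.0)
        z = (f_min - mu) / sigma
        Phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))   # standard normal CDF
        phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal PDF
        return (f_min - mu) * Phi + sigma * phi

    # A point predicted well below the incumbent has high EI; a confident
    # prediction above the incumbent has essentially none.
    print(expected_improvement(mu=0.0, sigma=1.0, f_min=1.0) >
          expected_improvement(mu=2.0, sigma=0.1, f_min=1.0))  # True
    ```

    In a sequential design loop, the next simulator run is placed wherever this criterion is largest, balancing predicted improvement (low `mu`) against uncertainty (high `sigma`); in the paper the Gaussian predictive quantities come from the Bayesian tree ensemble rather than a Gaussian process.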