
    Component Selection for the Metro Visualisation of the Self-Organising Map

    Self-Organising Maps have been used for a wide range of clustering applications. They are well-suited for various visualisation techniques to offer better insight into the clustered data sets. A particularly feasible visualisation is the plotting of single components of a data set and their distribution across the SOM. One central problem of the visualisation of Component Planes is that a single plot is needed for each component; this understandably leads to problems with higher-dimensional data. We therefore build on the Metro Visualisation for Self-Organising Maps which integrates the idea of Component Planes into one illustration. Higher-dimensional data sets still pose problems in terms of overloaded visualisations - component selection and aggregation techniques are highly desirable. We therefore propose and compare two methods, one for the aggregation of correlated components, one for the selection of the components most feasible for visualisation for a given clustering

    Predicting tree distributions in an East African biodiversity hotspot : model selection, data bias and envelope uncertainty

    The Eastern Arc Mountains (EAMs) of Tanzania and Kenya support some of the most ancient tropical rainforest on Earth. The forests are a global priority for biodiversity conservation and provide vital resources to the Tanzanian population. Here, we make a first attempt to predict the spatial distribution of 40 EAM tree species, using generalised additive models, plot data and environmental predictor maps at sub 1 km resolution. The results of three modelling experiments are presented, investigating predictions obtained by (1) two different procedures for the stepwise selection of predictors, (2) down-weighting absence data, and (3) incorporating an autocovariate term to describe fine-scale spatial aggregation. In response to recent concerns regarding the extrapolation of model predictions beyond the restricted environmental range of training data, we also demonstrate a novel graphical tool for quantifying envelope uncertainty in restricted range niche-based models (envelope uncertainty maps). We find that even for species with very few documented occurrences useful estimates of distribution can be achieved. Initiating selection with a null model is found to be useful for explanatory purposes, while beginning with a full predictor set can over-fit the data. We show that a simple multimodel average of these two best-model predictions yields a superior compromise between generality and precision (parsimony). Down-weighting absences shifts the balance of errors in favour of higher sensitivity, reducing the number of serious mistakes (i.e., falsely predicted absences); however, response functions are more complex, exacerbating uncertainty in larger models. Spatial autocovariates help describe fine-scale patterns of occurrence and significantly improve explained deviance, though if important environmental constraints are omitted then model stability and explanatory power can be compromised. 
    We conclude that the best modelling practice is contingent both on the intentions of the analyst (explanation or prediction) and on the quality of distribution data; generalised additive models have potential to provide valuable information for conservation in the EAMs, but methods must be carefully considered, particularly if occurrence data are scarce. Full results and details of all species models are supplied in an online Appendix. (C) 2008 Elsevier B.V. All rights reserved

    Weighted Heuristic Ensemble of Filters

    Feature selection has become increasingly important in data mining in recent years due to the rapid increase in the dimensionality of big data. However, the reliability and consistency of feature selection methods (filters) vary considerably across data sets, and no single filter performs consistently well under all conditions. Feature selection ensembles have therefore been investigated recently to provide more reliable and effective results than any individual filter, but all existing feature selection ensembles treat the feature selection methods equally regardless of their performance. In this paper, we present a novel framework that applies a weighted feature selection ensemble by proposing a systematic way of assigning different weights to the feature selection methods (filters). We also investigate how to determine the appropriate weight for each filter in an ensemble. Experiments on ten benchmark datasets show that, although theoretically and intuitively adding more weight to ‘good filters’ should lead to better results, in practice the outcome is very uncertain. The assumption held for some examples in our experiments. In other situations, however, filters that had been assumed to perform well performed badly, leading to even worse results. Adding weight to filters might therefore not achieve much in terms of accuracy, while increasing complexity and time consumption and clearly decreasing stability
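The idea of weighting filters inside an ensemble can be illustrated with a minimal sketch. It assumes a simple rank-based aggregation: each filter's raw scores are turned into normalised ranks, and the ranks are combined using per-filter weights. The function names, the weights, and the aggregation rule are hypothetical choices made for illustration; the abstract does not specify how the paper combines or weights its filters.

```python
def normalised_ranks(scores):
    """Map raw filter scores to ranks scaled to [0, 1] (1.0 = most relevant)."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])  # ascending by score
    ranks = [0.0] * n
    for position, feature in enumerate(order):
        ranks[feature] = position / (n - 1) if n > 1 else 1.0
    return ranks


def weighted_ensemble(filter_scores, weights, k):
    """Return the indices of the k features with the highest weighted rank.

    filter_scores: one list of per-feature scores per filter (higher = better).
    weights: one non-negative weight per filter (assumed, e.g. from validation).
    """
    total = sum(weights)
    n_features = len(filter_scores[0])
    combined = [0.0] * n_features
    for scores, weight in zip(filter_scores, weights):
        for feature, rank in enumerate(normalised_ranks(scores)):
            combined[feature] += (weight / total) * rank
    ranked = sorted(range(n_features), key=lambda i: combined[i], reverse=True)
    return ranked[:k]


# Two hypothetical filters that disagree about features 0 and 1.
scores = [[0.9, 0.1, 0.5], [0.2, 0.8, 0.6]]
top_when_first_trusted = weighted_ensemble(scores, weights=[3, 1], k=2)
top_when_second_trusted = weighted_ensemble(scores, weights=[1, 3], k=2)
```

With equal weights the sketch reduces to an unweighted rank ensemble, which is the baseline the abstract says existing ensembles use.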

    The Influence of Signaling Conspecific and Heterospecific Neighbors on Eavesdropper Pressure

    The study of tradeoffs between the attraction of mates and the attraction of eavesdropping predators and parasites has generally focused on a single species of prey, signaling in isolation. In nature, however, animals often signal from mixed-species aggregations, where interactions with heterospecific group members may be an important mechanism modulating tradeoffs between sexual and natural selection, and thus driving signal evolution. Although studies have shown that conspecific signalers can influence eavesdropper pressure on mating signals, the effects of signaling heterospecifics on eavesdropper pressure, and on the balance between natural and sexual selection, are likely to be different. Here, we review the role of neighboring signalers in mediating changes in eavesdropper pressure, and present a simple model that explores how selection imposed by eavesdropping enemies varies as a function of a signaling aggregation's species composition, the attractiveness of aggregation members to eavesdroppers, and the eavesdroppers' preferences for different member types. This approach can be used to model mixed-species signaling aggregations, as well as same-species aggregations, including those with non-signaling individuals, such as satellites or females. We discuss the implications of our model for the evolution of signal structure, signaling behavior, mixed-species aggregations, and community dynamics

    Hierarchical Knowledge-Gradient for Sequential Sampling

    We consider the problem of selecting the best of a finite but very large set of alternatives. Each alternative may be characterized by a multi-dimensional vector and has independent normal rewards. This problem arises in various settings such as (i) ranking and selection, (ii) simulation optimization where the unknown mean of each alternative is estimated with stochastic simulation output, and (iii) approximate dynamic programming where we need to estimate values based on Monte-Carlo simulation. We use a Bayesian probability model for the unknown reward of each alternative and follow a fully sequential sampling policy called the knowledge-gradient policy. This policy myopically optimizes the expected increment in the value of sampling information in each time period. Because the number of alternatives is large, we propose a hierarchical aggregation technique that uses the common features shared by alternatives to learn about many alternatives from even a single measurement, thus greatly reducing the measurement effort required. We demonstrate how this hierarchical knowledge-gradient policy can be applied to efficiently maximize a continuous function and prove that this policy finds a globally optimal alternative in the limit
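As a rough illustration of the knowledge-gradient idea, the sketch below computes the standard knowledge-gradient factor for independent normal beliefs with known measurement noise, then picks the alternative with the largest factor. This is the basic, non-hierarchical form only; the hierarchical aggregation described in the abstract is not reproduced here. The function names and the example numbers are assumptions made for illustration.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def kg_factors(mu, sigma2, noise_var):
    """Knowledge-gradient factor for each alternative.

    mu: current posterior means, sigma2: current posterior variances,
    noise_var: known variance of a single measurement (assumed shared).
    Requires at least two alternatives.
    """
    factors = []
    for x in range(len(mu)):
        if sigma2[x] <= 0.0:
            factors.append(0.0)  # nothing left to learn about x
            continue
        # Standard deviation of the change in the mean of x after one sample.
        sigma_tilde = sigma2[x] / math.sqrt(sigma2[x] + noise_var)
        best_other = max(m for i, m in enumerate(mu) if i != x)
        zeta = -abs(mu[x] - best_other) / sigma_tilde
        factors.append(sigma_tilde * (zeta * normal_cdf(zeta) + normal_pdf(zeta)))
    return factors


def kg_choice(mu, sigma2, noise_var):
    """Index of the alternative the myopic policy would measure next."""
    factors = kg_factors(mu, sigma2, noise_var)
    return max(range(len(factors)), key=lambda i: factors[i])


# Equal means: the policy prefers the alternative it is more uncertain about.
next_to_measure = kg_choice(mu=[0.0, 0.0], sigma2=[4.0, 1.0], noise_var=1.0)
```

Each factor is the expected one-step improvement in the best posterior mean, so measuring the argmax is the myopic rule the abstract describes.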