
    Multiscale Discriminant Saliency for Visual Attention

    Bottom-up saliency, an early stage of human visual attention, can be framed as a binary classification problem between center and surround classes. The discriminant power of a feature for this classification is measured as the mutual information between the feature and the two-class distribution. Because the estimated discrepancy between the two feature classes depends strongly on the scale considered, multiscale structure and discriminant power are integrated by employing discrete wavelet features and a hidden Markov tree (HMT). From the wavelet coefficients and HMT parameters, quad-tree-like label structures are constructed and used in maximum a posteriori (MAP) estimation of the hidden class variables at the corresponding dyadic sub-squares. A saliency value for each dyadic square at each scale level is then computed from the discriminant power principle and the MAP estimate. Finally, the saliency maps across scales are integrated into the final saliency map by an information-maximization rule. Both standard quantitative measures (NSS, LCC, AUC) and qualitative assessment are used to evaluate the proposed multiscale discriminant saliency method (MDIS) against the well-known information-based saliency method AIM on the Bruce database with eye-tracking data. Simulation results are presented and analyzed to verify the validity of MDIS and to point out its disadvantages as directions for further research. Comment: 16 pages, ICCSA 2013 - BIOCA session
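A minimal Python sketch of the center-surround discriminant idea, not the authors' implementation: the HMT-based quad-tree MAP labeling is replaced here by a fixed central window, and cross-scale integration by a simple maximum, with `pywt` and `scikit-learn` supplying the wavelet transform and the mutual-information estimate.

```python
import numpy as np
import pywt
from sklearn.feature_selection import mutual_info_classif

def discriminant_saliency(image, wavelet="db2", levels=3):
    """Score each wavelet scale by center/surround mutual information."""
    coeffs = pywt.wavedec2(image, wavelet, level=levels)
    saliency_per_scale = []
    for detail in coeffs[1:]:                       # (cH, cV, cD) per scale
        mag = np.sqrt(sum(d.astype(float) ** 2 for d in detail))
        h, w = mag.shape
        labels = np.zeros((h, w), dtype=int)        # 0 = surround class
        labels[h // 4:3 * h // 4, w // 4:3 * w // 4] = 1   # 1 = center window
        mi = mutual_info_classif(mag.reshape(-1, 1), labels.ravel(),
                                 discrete_features=False)
        saliency_per_scale.append(mi[0])
    return max(saliency_per_scale)                  # crude cross-scale maximum

img = np.random.rand(128, 128)                      # stand-in grayscale image
print(discriminant_saliency(img))
```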

    Integral projection models for species with complex demography

    Matrix projection models occupy a central role in population and conservation biology. Matrix models divide a population into discrete classes, even if the structuring trait exhibits continuous variation (e.g., body size). The integral projection model (IPM) avoids discrete classes and potential artifacts from arbitrary class divisions, facilitates parsimonious modeling based on smooth relationships between individual state and demographic performance, and can be implemented with standard matrix software. Here, we extend the IPM to species with complex demographic attributes, including dormant and active life stages, cross-classification by several attributes (e.g., size, age, and condition), and changes between discrete and continuous structure over the life cycle. We present a general model encompassing these cases, numerical methods, and theoretical results, including stable population growth and sensitivity/elasticity analysis for density-independent models, local stability analysis in density-dependent models, and optimal/evolutionarily stable strategy life-history analysis. Our presentation centers on an IPM for the thistle Onopordum illyricum based on a 6-year field study. Flowering and death probabilities are size and age dependent, and individuals also vary in a latent attribute affecting survival, but a predictively accurate IPM is completely parameterized by fitting a few regression equations. The online edition of the American Naturalist includes a zip archive of R scripts illustrating our suggested methods.
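The general IPM recipe, discretizing the kernel K(y, x) = survival × growth + fecundity on a size mesh (midpoint rule) and extracting the dominant eigenvalue, can be sketched in a few lines of Python; the vital-rate functions below are hypothetical placeholders, not the fitted Onopordum regressions.

```python
import numpy as np
from scipy.stats import norm

def survival(x):                  # hypothetical size-dependent survival
    return 1.0 / (1.0 + np.exp(-(x - 2.0)))

def growth(y, x):                 # hypothetical growth: y ~ Normal(0.9x + 0.5, 0.3)
    return norm.pdf(y, loc=0.9 * x + 0.5, scale=0.3)

def fecundity(y, x):              # hypothetical recruitment kernel
    return 0.2 * x * norm.pdf(y, loc=1.0, scale=0.2)

n_mesh, lower, upper = 200, 0.0, 6.0
h = (upper - lower) / n_mesh
mids = lower + h * (np.arange(n_mesh) + 0.5)        # midpoints of size classes

Y, X = np.meshgrid(mids, mids, indexing="ij")
K = (survival(X) * growth(Y, X) + fecundity(Y, X)) * h   # iteration matrix

lam = np.max(np.real(np.linalg.eigvals(K)))         # dominant eigenvalue
print(f"asymptotic population growth rate lambda = {lam:.3f}")
```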

    Unified functional network and nonlinear time series analysis for complex systems science: The pyunicorn package

    We introduce the \texttt{pyunicorn} (Pythonic unified complex network and recurrence analysis toolbox) open-source software package for applying and combining modern methods of data analysis and modeling from complex network theory and nonlinear time series analysis. \texttt{pyunicorn} is a fully object-oriented and easily parallelizable package written in Python. It allows for the construction of functional networks, such as climate networks in climatology or functional brain networks in neuroscience, that represent the structure of statistical interrelationships in large data sets of time series, and subsequently for investigating this structure using advanced methods of complex network theory such as measures and models for spatial networks, networks of interacting networks, node-weighted statistics, or network surrogates. Additionally, \texttt{pyunicorn} provides insights into the nonlinear dynamics of complex systems, as recorded in uni- and multivariate time series, from a non-traditional perspective by means of recurrence quantification analysis (RQA), recurrence networks, visibility graphs, and the construction of surrogate time series. The range of possible applications of the library is outlined, drawing on several examples mainly from the field of climatology. Comment: 28 pages, 17 figures
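A hedged usage sketch of the RQA interface; the class and parameter names below are recalled from the pyunicorn documentation and should be verified against the installed version.

```python
import numpy as np
from pyunicorn.timeseries import RecurrencePlot

# noisy periodic signal as a stand-in for observed data
x = np.sin(np.linspace(0, 20 * np.pi, 1000)) + 0.1 * np.random.randn(1000)

# recurrence plot on a delay embedding, thresholded at a fixed 5% recurrence rate
rp = RecurrencePlot(x, dim=3, tau=5, metric="supremum", recurrence_rate=0.05)
print("DET =", rp.determinism(l_min=2))   # fraction of points on diagonal lines
print("LAM =", rp.laminarity(v_min=2))    # fraction on vertical structures
```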

    Statistical Active Learning Algorithms for Noise Tolerance and Differential Privacy

    We describe a framework for designing efficient active learning algorithms that are tolerant to random classification noise and are differentially private. The framework is based on active learning algorithms that are statistical in the sense that they rely on estimates of expectations of functions of filtered random examples. It builds on the powerful statistical query framework of Kearns (1993). We show that any efficient active statistical learning algorithm can be automatically converted to an efficient active learning algorithm that is tolerant to random classification noise as well as other forms of "uncorrelated" noise. The complexity of the resulting algorithms has an information-theoretically optimal quadratic dependence on 1/(1-2η), where η is the noise rate. We show that commonly studied concept classes, including thresholds, rectangles, and linear separators, can be efficiently actively learned in our framework. Combined with our generic conversion, these results lead to the first computationally efficient algorithms for actively learning some of these concept classes in the presence of random classification noise that provide an exponential improvement in the dependence on the error ε over their passive counterparts. In addition, we show that our algorithms can be automatically converted to efficient active differentially private algorithms. This leads to the first differentially private active learning algorithms with exponential label savings over the passive case. Comment: Extended abstract appears in NIPS 2013
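The noise tolerance rests on a simple identity: for labels flipped independently with probability η, any statistical query E[f(x)·y] is attenuated by exactly (1 − 2η), so dividing the noisy estimate by that factor recovers the clean value, which is where the 1/(1 − 2η) dependence comes from. A toy Python illustration of the identity, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)
n, eta = 200_000, 0.2

x = rng.normal(size=n)
y = np.sign(x)                              # clean labels from a threshold
flips = rng.random(n) < eta
y_noisy = np.where(flips, -y, y)            # random classification noise

f = np.tanh(x)                              # any bounded query function
clean = np.mean(f * y)
corrected = np.mean(f * y_noisy) / (1 - 2 * eta)   # undo the attenuation
print(f"clean={clean:.4f}  corrected={corrected:.4f}")
```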

    Limit laws for random vectors with an extreme component

    Models based on assumptions of multivariate regular variation and hidden regular variation provide ways to describe a broad range of extremal dependence structures when marginal distributions are heavy tailed. Multivariate regular variation provides a rich description of extremal dependence in the case of asymptotic dependence, but fails to distinguish between exact independence and asymptotic independence. Hidden regular variation addresses this problem by requiring components of the random vector to be simultaneously large but on a smaller scale than the scale for the marginal distributions. In doing so, hidden regular variation typically restricts attention to that part of the probability space where all variables are simultaneously large. However, since under asymptotic independence the largest values do not occur in the same observation, the region where variables are simultaneously large may not be of primary interest. A different philosophy was offered in the paper of Heffernan and Tawn [J. R. Stat. Soc. Ser. B Stat. Methodol. 66 (2004) 497--546] which allows examination of distributional tails other than the joint tail. This approach used an asymptotic argument which conditions on one component of the random vector and finds the limiting conditional distribution of the remaining components as the conditioning variable becomes large. In this paper, we provide a thorough mathematical examination of the limiting arguments building on the orientation of Heffernan and Tawn [J. R. Stat. Soc. Ser. B Stat. Methodol. 66 (2004) 497--546]. We examine the conditions required for the assumptions made by the conditioning approach to hold, and highlight similarities and differences between the new and established methods. Comment: Published at http://dx.doi.org/10.1214/105051606000000835 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org)
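The core limit assumption of the conditioning approach can be stated compactly; a LaTeX sketch, with notation adapted from Heffernan and Tawn's formulation:

```latex
% Limit assumption of the conditional approach: after componentwise
% normalization, the remaining variables have a nondegenerate limit law G
% as the conditioning variable grows.
\[
  \lim_{t \to \infty}
  P\!\left( \frac{\mathbf{X}_{-1} - \mathbf{b}(t)}{\mathbf{a}(t)} \le \mathbf{z}
  \;\middle|\; X_1 = t \right) = G(\mathbf{z}),
\]
% with a(t), b(t) componentwise normalizing functions; in the Gumbel-margins
% parameterization of Heffernan and Tawn, b(t) = \alpha t and a(t) = t^{\beta}.
```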

    Optimization of marine sediment characterization via statistical analysis

    The task of geotechnical site characterization includes defining the layout of ground units and establishing their relevant engineering properties. This is an activity in which uncertainties of different natures (inherent, experimental, interpretive) are always present, and in which the amount and character of the available data are highly variable. Probabilistic methodologies are applied to assess and manage these uncertainties. A Bayesian perspective on probability, which roots probability in belief, is well suited to geotechnical characterization problems, as it has the flexibility to handle different kinds of uncertainty and datasets that vary widely in quality and quantity. This thesis addresses several topics in geotechnical site characterization from a probabilistic perspective, with emphasis on offshore investigation, the cone penetration test (CPTu), and Bayesian methodologies.
    The first topic is soil layer delineation based on CPT(u) data. The starting point is the recognition that layer delineation is problem-oriented and has a strong subjective component. We propose a novel CPTu record analysis methodology that aims to: a) elicit the heuristics that intervene in layer delineation, facilitating communication and coherence in interpretation; b) facilitate probabilistic characterization of the identified layers; and c) remain simple and intuitive to use. The method is based on sequential distribution fitting in conventionally accepted classification spaces (Soil Behavior Type charts). The proposed technique is applied at different sites, illustrating how it can be related to borehole observations, how it compares with alternative methodologies, and how it can be extended to create cross-site profiles.
    The second topic is strain-rate correction of dynamic CPTu data. Dynamic CPTu probes impact the seafloor and are very agile characterization tools; however, their records require transformation into equivalent quasi-static results that can be interpreted conventionally. Until now, the necessary corrections have been either too vague or dependent on acquiring paired dynamic and quasi-static CPTu records (i.e., acquired at the same location). A Bayesian methodology is applied to derive strain-rate coefficients in a more general setting, in which some quasi-static CPTu records are available in the study area but need not be paired with any converted dynamic CPTu. Application to a case study offshore Nice shows that the results match those obtained using paired tests. Furthermore, strain-rate correction coefficients and transformed quasi-static profiles are expressed in probabilistic terms.
    The third topic is the optimization of soil unit weight prediction from CPTu readings. A Bayesian mixture analysis is applied to a global database to identify hidden soil classes within it. The goal is to improve the accuracy of regressions between geotechnical parameters obtained by exploiting the database. The method is applied to predicting soil unit weight from CPTu data, a problem of intrinsic practical interest that is also representative of the difficulties faced by a larger class of geotechnical regression problems. The results show a decrease in systematic transformation uncertainty and improved accuracy of soil unit weight prediction from CPTu at new sites.
    In a final application, we present a probabilistic earthquake-induced landslide susceptibility map of the South-West Iberian margin. A simplified geotechnical pixel-based slope stability model is applied to the whole study area, within which the key stability model parameters are treated as random variables. Site characterization at the regional scale combines a global database with the available geotechnical data through a Bayesian scheme. The outputs (landslide susceptibility maps) are derived from a reliability-based procedure (Monte Carlo simulation), providing a robust landslide susceptibility prediction for the site as assessed by the Receiver Operating Characteristic (ROC) curve.
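A minimal Python sketch of the reliability-based susceptibility step, with hypothetical parameter values: a pseudo-static infinite-slope factor of safety is evaluated by Monte Carlo with the key geotechnical inputs treated as random variables, and the per-pixel failure probability is the susceptibility score.

```python
import numpy as np

rng = np.random.default_rng(42)
n_sims = 100_000

# random variables for one map pixel (hypothetical distributions)
phi = rng.normal(32.0, 3.0, n_sims)             # friction angle (deg)
c = rng.lognormal(np.log(5.0), 0.4, n_sims)     # cohesion (kPa)
gamma = rng.normal(18.0, 1.0, n_sims)           # unit weight (kN/m^3)
beta = np.radians(20.0)                         # pixel slope angle
z, k = 3.0, 0.15                                # failure depth (m), seismic coeff.

# pseudo-static infinite-slope factor of safety
tau = gamma * z * (np.sin(beta) * np.cos(beta) + k * np.cos(beta) ** 2)
sigma_n = gamma * z * (np.cos(beta) ** 2 - k * np.sin(beta) * np.cos(beta))
fs = (c + sigma_n * np.tan(np.radians(phi))) / tau

print("P(FS < 1) =", np.mean(fs < 1.0))         # per-pixel susceptibility
```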

    The use of data-mining for the automatic formation of tactics

    This paper discusses the use of data-mining for the automatic formation of tactics. It was presented at the Workshop on Computer-Supported Mathematical Theory Development held at IJCAR in 2004. The aim of this project is to evaluate the applicability of data-mining techniques to the automatic formation of tactics from large corpora of proofs. We data-mine information from large proof corpora to find commonly occurring patterns. These patterns are then evolved into tactics using genetic programming techniques.
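A toy Python sketch of the pattern-mining step over a hypothetical proof corpus: frequent contiguous sub-sequences of proof steps are counted, and the most common patterns become candidates for the tactic-evolution stage.

```python
from collections import Counter

# hypothetical corpus: each proof is a sequence of tactic/step names
proofs = [
    ["intro", "rewrite", "simplify", "apply"],
    ["intro", "rewrite", "simplify", "induction", "apply"],
    ["case_split", "rewrite", "simplify", "apply"],
]

def ngrams(seq, n):
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]

# count contiguous sub-sequences of lengths 2 and 3 across all proofs
counts = Counter(g for p in proofs for n in (2, 3) for g in ngrams(p, n))
for pattern, freq in counts.most_common(3):     # candidate tactic skeletons
    print(freq, "->", " ; ".join(pattern))
```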