180 research outputs found

    A New Estimator of Intrinsic Dimension Based on the Multipoint Morisita Index

    The size of datasets has been increasing rapidly, both in the number of variables and in the number of events. As a result, the empty space phenomenon and the curse of dimensionality complicate the extraction of useful information. In general, however, data lie on non-linear manifolds of much lower dimension than that of the spaces in which they are embedded. In many pattern recognition tasks, learning these manifolds is a key issue, and it requires knowledge of their true intrinsic dimension. This paper introduces a new estimator of intrinsic dimension based on the multipoint Morisita index. It is applied to both synthetic and real datasets of varying complexity, and comparisons with other existing estimators are carried out. The proposed estimator turns out to be fairly robust to sample size and noise, unaffected by edge effects, able to handle large datasets, and computationally efficient.

    Geocomputational approaches for the analysis of Next-Generation Sequencing (NGS) and multi-scale data in landscape genomics

    The application of geocomputation to the field of landscape genomics (Manel et al. 2010) makes it possible to carry out demanding computational tasks that have recently emerged with the advent of large Next-Generation Sequencing datasets. When investigating the genetic mechanisms of evolution in spatially distributed plants or animals, geocomputation also proves useful for processing many association models (gene x environment) in a multi-scale context.

    Spatial Data Mining Analytical Environment for Large Scale Geospatial Data

    Nowadays, many applications continuously generate large-scale geospatial data. Vehicle GPS tracking, aerial surveillance drones, LiDAR (Light Detection and Ranging), world-wide spatial networks, and high-resolution optical or Synthetic Aperture Radar imagery all generate huge amounts of geospatial data. However, as data collection increases, our ability to process this large-scale geospatial data in a flexible fashion is still limited. We propose a framework for processing and analyzing large-scale geospatial and environmental data using a "Big Data" infrastructure. Existing Big Data solutions do not include a specific mechanism to analyze large-scale geospatial data. In this work, we extend HBase with a spatial index (R-Tree) and HDFS to support geospatial data, and demonstrate its analytical use with some common geospatial data types and data mining technology provided by the R language. The resulting framework has a robust capability to analyze large-scale geospatial data using spatial data mining and makes its outputs available to end users.

    Complex scaling behavior in animal foraging patterns

    This dissertation attempts to answer questions from two different areas of biology, ecology and neuroscience, using physics-based techniques. In Section 2, the suitability of three competing random walk models for describing the emergent movement patterns of two species of primates is tested. The truncated power law (power law with exponential cut-off) is the most suitable random walk model for characterizing the emergent movement patterns of these primates. In Section 3, an agent-based model is used to simulate search behavior in different environments (landscapes) to investigate the impact of the resource landscape on the optimal foraging movement patterns of deterministic foragers. This model goes beyond previous work in that it includes parameters such as spatial memory and satiation, which have received little consideration to date in the field of movement ecology. When food is scarce in a tropical forest-like environment, with feeding trees distributed in a clumped fashion and tree sizes following a lognormal distribution, the optimal foraging pattern of a generalist that can consume various and abundant food types does reach the Lévy range, and hence shows evidence of Lévy-flight-like behavior (power law distribution with exponent between 1 and 3). Section 4 of the dissertation presents an investigation of phase transition behavior in a network of locally coupled self-sustained oscillators as the system passes through various bursting states. The results suggest that a phase transition does not occur for this locally coupled neuronal network. The data analysis in the dissertation adopts a model selection approach and relies on methods based on information theory and maximum likelihood.

    Switching Principal Component Analysis for Modeling Means and Covariance Changes Over Time

    Many psychological theories predict that cognitions, affect, action tendencies, and other variables change across time in mean level as well as in covariance structure. Often such changes are rather abrupt, because they are caused by sudden events. To capture such changes, one may repeatedly measure the variables under study for a single individual and examine whether the resulting multivariate time series contains a number of phases with different means and covariance structures. The latter task is challenging, however. First, in many cases, it is unknown how many phases there are and when new phases start. Second, a rather large number of variables is often involved, complicating the interpretation of the covariance pattern within each phase. To take up this challenge, we present switching principal component analysis (PCA). Switching PCA detects phases of consecutive observations or time points (in single-subject data) with similar means and/or covariance structures, and performs a PCA per phase to yield insight into its covariance structure. An algorithm for fitting switching PCA solutions and a model selection procedure are presented and evaluated in a simulation study. Finally, we analyze empirical data on cardiorespiratory recordings.

    Inferences about the conservation utility of using unmanned aerial vehicles to conduct rapid assessments for basking freshwater turtles

    Unmanned aerial vehicles (UAVs), an emerging technology, show promise in ecological research. In this study, I compare UAVs to a traditional sampling method, observation using spotting scopes. UAVs have yet to be used successfully for sampling freshwater turtles; however, they have been used with mixed success for monitoring mammals and birds. Herein, I propose that the conservation utility of UAVs be formally assessed in the field before they are used to make adaptive conservation and management decisions. Using a mixed methods approach, I quantitatively and qualitatively evaluate UAVs against a proven field method as a means of improving our basic understanding of presence-absence. Being able to successfully use UAVs for ecological surveying would provide an easy, efficient, and less invasive way to study basking turtles.

    HETEROSCEDASTIC DISCRIMINANT ANALYSIS COMBINED WITH FEATURE SELECTION FOR CREDIT SCORING

    Credit granting is a fundamental problem and one of the most complex tasks that every credit institution faces. Credit scoring databases are typically large and characterized by redundant and irrelevant features. An effective classification model gives managers an objective aid in place of intuitive experience. This study proposes an approach for building a credit scoring model based on the combination of the heteroscedastic extension (Loog, Duin, 2002) of classical Fisher Linear Discriminant Analysis (Fisher, 1936, Krzyśko, 1990) and a feature selection algorithm that retains sufficient information for classification purposes. We have tested five feature subset selection algorithms: two filters and three wrappers. To evaluate the accuracy of the proposed credit scoring model and to compare it with existing approaches, we have used the German credit data set from the study (Chen, Li, 2010). The results of our study suggest that the proposed hybrid approach is an effective and promising method for building credit scoring models.

    Dissecting the assembly process of benthic communities from neotropical streams

    § The conservation and rehabilitation of ecosystem structure and functioning require a deep knowledge of the causes and consequences of biodiversity. This knowledge is still scarce in Neotropical regions, and the assembly of Neotropical communities, particularly in riverine ecosystems, remains to be dissected. § In this thesis, I use the metacommunity framework to dissect the relative influences of dispersal (in ecological and evolutionary timeframes), selection (driven by abiotic factors) and ecological drift on the diversity, distribution and assembly of freshwater benthic communities. § The study was carried out at 26 to 32 pristine stream segments within an area of about 40,000 km² in the Colombian Orinoco. The sampling sites span an elevation gradient from 300 to 3400 m a.s.l. and include a heterogeneous set of ecoregions and landscapes. § Using a pattern-matching approach that links patterns to possible mechanisms, I provide evidence that dispersal, selection and drift are directly involved in the assembly of freshwater benthic communities. § My findings indicate that one or more events of dispersal limitation in an evolutionary timeframe (i.e. allopatric isolation) shaped distinct pools of taxa in the Orinoco basin. The extent of these pools partially matches the distribution of the ecoregions, suggesting that the events that molded the riverscapes and the vegetation structure similarly affected the diversity and distribution of benthic species in riverine ecosystems. § Within each ecoregion, dispersal, selection and drift interact to constrain the structure and dynamics of communities and metacommunities among and within streams. Depending on the community (e.g. diatoms or insects) and the taxa belonging to each pool of species, the role of one of these processes may prevail over the others. § These findings have implications for both basic and applied research (e.g. biomonitoring) in the disciplines of metacommunity and freshwater ecology, as well as of conservation and biogeography.
    Linking functional diversity patterns of algae and invertebrates to scale-dependent constrains of rivers from the Orinoco basin
    Thesis for a double degree under the cotutelle agreement between the Universidad de Girona and the Universidad Nacional de Colombia. Doctorad