Statistical interaction modeling of bovine herd behaviors
While there has been interest in modeling the group behavior of herds or flocks, much of this work has focused on simulating collective spatial motion patterns without accounting for individuality in the herd; it instead assumes a homogenized role for all members or sub-groups of the herd. Animal behavior experts have noted that domestic animals exhibit behaviors that are indicative of social hierarchy: leader/follower type behaviors are present, as are dominance and subordination, aggression and rank order, and specific social affiliations may also exist. Both wild and domestic cattle are social species, and group behaviors are likely to be influenced by the expression of specific social interactions. In this paper, Global Positioning System coordinate fixes gathered from a herd of beef cows tracked in open fields over several days at a time are utilized to learn a model that focuses on the interactions within the herd as well as its overall movement. Using these data in this way explores the validity of existing group behavior models against actual herding behaviors. Domain knowledge, location geography, and human observations are utilized to explain the causes of deviations from this idealized behavior.
Data Imputation through the Identification of Local Anomalies
We introduce a comprehensive statistical framework in a model-free
setting for a complete treatment of localized data corruptions due to severe
noise sources, e.g., an occluder in the case of a visual recording. Within this
framework, we propose i) a novel algorithm to efficiently separate, i.e.,
detect and localize, possible corruptions from a given suspicious data instance
and ii) a Maximum A Posteriori (MAP) estimator to impute the corrupted data. As
a generalization of the Euclidean distance, we also propose a novel distance
measure, which is based on the ranked deviations among the data attributes and
is empirically shown to be superior in separating the corruptions. Our algorithm
first splits the suspicious instance into parts through a binary partitioning
tree in the space of data attributes and iteratively tests those parts to
detect local anomalies using the nominal statistics extracted from an
uncorrupted (clean) reference data set. Once each part is labeled as anomalous
vs normal, the corresponding binary patterns over this tree that characterize
corruptions are identified and the affected attributes are imputed. Under a
certain conditional independence structure assumed for the binary patterns, we
analytically show that the false alarm rate of the introduced algorithm in
detecting the corruptions is independent of the data and can be directly set
without any parameter tuning. The proposed framework is tested over several
well-known machine learning data sets with synthetically generated corruptions
and is experimentally shown to produce remarkable improvements for
classification purposes, with strong corruption separation capabilities. Our
experiments also indicate that the proposed algorithms outperform the typical
approaches and are robust to varying training phase conditions.
Hyperspectral Unmixing Overview: Geometrical, Statistical, and Sparse Regression-Based Approaches
Imaging spectrometers measure electromagnetic energy scattered in their
instantaneous field of view in hundreds or thousands of spectral channels with
higher spectral resolution than multispectral cameras. Imaging spectrometers
are therefore often referred to as hyperspectral cameras (HSCs). Higher
spectral resolution enables material identification via spectroscopic analysis,
which facilitates countless applications that require identifying materials in
scenarios unsuitable for classical spectroscopic analysis. Due to low spatial
resolution of HSCs, microscopic material mixing, and multiple scattering,
spectra measured by HSCs are mixtures of spectra of materials in a scene. Thus,
accurate estimation requires unmixing. Pixels are assumed to be mixtures of a
few materials, called endmembers. Unmixing involves estimating all or some of:
the number of endmembers, their spectral signatures, and their abundances at
each pixel. Unmixing is a challenging, ill-posed inverse problem because of
model inaccuracies, observation noise, environmental conditions, endmember
variability, and data set size. Researchers have devised and investigated many
models searching for robust, stable, tractable, and accurate unmixing
algorithms. This paper presents an overview of unmixing methods from the time
of Keshava and Mustard's unmixing tutorial [1] to the present. Mixing models
are first discussed. Signal-subspace, geometrical, statistical, sparsity-based,
and spatial-contextual unmixing algorithms are described. Mathematical problems
and potential solutions are described. Algorithm characteristics are
illustrated experimentally.
Comment: This work has been accepted for publication in IEEE Journal of
Selected Topics in Applied Earth Observations and Remote Sensing
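As an illustration of the linear mixing model mentioned in the abstract above, the following minimal sketch estimates per-pixel abundances when the endmember signatures are already known. It is not taken from the paper: the function name `unmix_least_squares`, the synthetic spectra, and the use of plain unconstrained least squares are assumptions made for this example, and the nonnegativity and sum-to-one constraints that practical unmixing methods usually enforce are deliberately left out.

```python
import numpy as np


def unmix_least_squares(pixels, endmembers):
    """Estimate abundances under a linear mixing model y = E^T a.

    pixels:     (n_pixels, n_bands) observed spectra
    endmembers: (n_endmembers, n_bands) known endmember signatures
    Returns an (n_pixels, n_endmembers) array of abundance estimates.

    Hypothetical helper for illustration only: it ignores the
    nonnegativity and sum-to-one constraints on abundances.
    """
    # Solve for all pixels at once in the least-squares sense.
    abundances, *_ = np.linalg.lstsq(endmembers.T, pixels.T, rcond=None)
    return abundances.T


# Synthetic, noise-free scene: 3 endmembers, 50 bands, 10 pixels.
rng = np.random.default_rng(0)
endmembers = rng.uniform(0.0, 1.0, size=(3, 50))
true_abundances = rng.dirichlet(np.ones(3), size=10)  # rows sum to one
pixels = true_abundances @ endmembers

estimated = unmix_least_squares(pixels, endmembers)
```

On noise-free data with linearly independent endmembers, the estimate matches the true abundances up to floating-point error. With noise or unknown endmembers, this unconstrained estimate is no longer adequate.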
Exploring the utility of a GIScience approach to modeling invasive species: A case study of Ailanthus altissima
This thesis investigated the potential for integration of remotely sensed and GIS data into an agent-based modeling environment in order to model seed dispersal and subsequent establishment of windborne seeds. In order to explore the applicability of agent-based modeling to predicting seed dispersal, a case study was carried out using the representative example species Ailanthus altissima, an invasive tree found throughout North America's temperate regions. Seed movement was modeled in two stages, primary and secondary dispersal; primary dispersal was calibrated using existing field data, while secondary dispersal was calibrated only qualitatively. Establishment potential was accounted for probabilistically, based on landuse type. Environmental controls on seed movement and establishment were accounted for with several remotely sensed datasets. The general model characteristics and structure are representative of a potential class of predictive models that incorporate raster data and vector-based seed movement. Agent-based modeling provides a link between raster and vector data and processing methods, and is therefore a potential tool for projects involving both raster and vector data types as well as vector processing. Because seed dispersal and establishment modeling benefits from incorporating both of these data types, it was found that the agent-based approach provided an appropriate framework for modeling the phenomenon, while further research is necessary to fully parameterize and field-validate the model.
Populations in statistical genetic modelling and inference
What is a population? This review considers how a population may be defined
in terms of understanding the structure of the underlying genetics of the
individuals involved. The main approach is to consider statistically
identifiable groups of randomly mating individuals, which is well defined in
theory for any type of (sexual) organism. We discuss generative models using
drift, admixture and spatial structure, and the ancestral recombination graph.
These are contrasted with statistical models for inference, principal component
analysis and other 'non-parametric' methods. The relationships between these
approaches are explored with both simulated and real-data examples. The
state-of-the-art practical software tools are discussed and contrasted. We
conclude that populations are a useful theoretical construct that can be well
defined in theory and often approximately exist in practice.
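To make the principal component analysis mentioned in the abstract above concrete, here is a minimal sketch on simulated genotypes. Nothing in it comes from the review itself: the function name `genotype_pca`, the two synthetic populations, and their allele frequencies (0.1 and 0.9 at every SNP) are assumptions chosen so that the structure is easy to see.

```python
import numpy as np


def genotype_pca(genotypes, n_components=2):
    """Project individuals onto the leading principal components.

    genotypes: (n_individuals, n_snps) array of allele counts (0, 1 or 2).
    Returns an (n_individuals, n_components) array of PC scores.
    Hypothetical helper for illustration only.
    """
    # Center each SNP so the components describe variation around the mean.
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


# Two simulated populations with strongly differing allele frequencies.
rng = np.random.default_rng(1)
n_per_pop, n_snps = 20, 200
pop_a = rng.binomial(2, np.full(n_snps, 0.1), size=(n_per_pop, n_snps))
pop_b = rng.binomial(2, np.full(n_snps, 0.9), size=(n_per_pop, n_snps))
genotypes = np.vstack([pop_a, pop_b]).astype(float)

scores = genotype_pca(genotypes)
pc1 = scores[:, 0]
```

With frequencies this far apart, the first component separates the two groups by sign. Real populations are typically far less differentiated, so the separation there is a matter of degree.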
Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?
The organization and mining of malaria genomic and post-genomic data is
highly motivated by the necessity to predict and characterize new biological
targets and new drugs. Biological targets are sought in a biological space
designed from the genomic data from Plasmodium falciparum, but also using the
millions of genomic data from other species. Drug candidates are sought in a
chemical space containing the millions of small molecules stored in public and
private chemolibraries. Data management should therefore be as reliable and
versatile as possible. In this context, we examined five aspects of the
organization and mining of malaria genomic and post-genomic data: 1) the
comparison of protein sequences including compositionally atypical malaria
sequences, 2) the high throughput reconstruction of molecular phylogenies, 3)
the representation of biological processes particularly metabolic pathways, 4)
the versatile methods to integrate genomic data, biological representations and
functional profiling obtained from X-omic experiments after drug treatments and
5) the determination and prediction of protein structures and their molecular
docking with drug candidate structures. Progress toward a grid-enabled
chemogenomic knowledge space is discussed.
Comment: 43 pages, 4 figures, to appear in Malaria Journal