Performance comparison of point and spatial access methods
In the past few years a large number of multidimensional point access methods, also called
multiattribute index structures, has been suggested, all of them claiming good performance. Since no
performance comparison of these structures under arbitrary (strongly correlated nonuniform, short
"ugly") data distributions and under various types of queries has been performed, database
researchers and designers were hesitant to use any of these new point access methods. As shown in
a recent paper, such point access methods are not only important in traditional database applications.
In new applications such as CAD/CIM and geographic or environmental information systems, access
methods for spatial objects are needed. As recently shown such access methods are based on point
access methods in terms of functionality and performance. Our performance comparison naturally
consists of two parts. In part I we will compare multidimensional point access methods, whereas in
part II spatial access methods for rectangles will be compared. In part I we present a survey and
classification of existing point access methods. Then we carefully select the following four methods
for implementation and performance comparison under seven different data files (distributions) and
various types of queries: the 2-level grid file, the BANG file, the hB-tree and a new scheme, called
the BUDDY hash tree. We were surprised to find a clear winner: the BUDDY hash tree. It exhibits
at least 20% better average performance than its competitors and is
robust under ugly data and queries. In part II we compare spatial access methods for rectangles.
After presenting a survey and classification of existing spatial access methods we carefully selected
the following four methods for implementation and performance comparison under six different data
files (distributions) and various types of queries: the R-tree, the BANG file, PLOP hashing and the
BUDDY hash tree. The result presented two winners: the BANG file and the BUDDY hash tree.
This comparison is a first step towards a standardized testbed or benchmark. We offer our data and
query files to each designer of a new point or spatial access method such that he can run his
implementation in our testbed.
The combination of spatial access methods and computational geometry in geographic database systems
Query processing of spatial objects: Complexity versus Redundancy
The management of complex spatial objects in applications, such as geography and cartography,
imposes stringent new requirements on spatial database systems, in particular on efficient
query processing. As shown before, the performance of spatial query processing can be improved
by decomposing complex spatial objects into simple components. Up to now, only decomposition
techniques generating a linear number of very simple components, e.g. triangles or trapezoids, have
been considered. In this paper, we will investigate the natural trade-off between the complexity of
the components and the redundancy, i.e. the number of components, with respect to its effect on
efficient query processing. In particular, we present two new decomposition methods generating
a better balance between the complexity and the number of components than previously known
techniques. We compare these new decomposition methods to the traditional undecomposed representation
as well as to the well-known decomposition into convex polygons with respect to their
performance in spatial query processing. This comparison points out that for a wide range of query
selectivity the new decomposition techniques clearly outperform both the undecomposed representation
and the convex decomposition method. More important than the absolute gain in performance
by a factor of up to an order of magnitude is the robust performance of our new decomposition
techniques over the whole range of query selectivity.
Plane-Sweep Algorithms for Intersecting Geometric Figures
Coordinated Science Laboratory was formerly known as Control Systems Laboratory. Joint Services Electronics Program / N00014-79-C-0424. National Science Foundation / MCS 78-1364.
Generalized Analysis of Molecular Variance
Many studies in the fields of genetic epidemiology and applied population genetics are predicated on, or require, an assessment of the genetic background diversity of the individuals chosen for study. A number of strategies have been developed for assessing genetic background diversity. These strategies typically focus on genotype data collected on the individuals in the study, based on a panel of DNA markers. However, many of these strategies are either rooted in cluster analysis techniques, and hence suffer from problems inherent to the assignment of the biological and statistical meaning to resulting clusters, or have formulations that do not permit easy and intuitive extensions. We describe a very general approach to the problem of assessing genetic background diversity that extends the analysis of molecular variance (AMOVA) strategy introduced by Excoffier and colleagues some time ago. As in the original AMOVA strategy, the proposed approach, termed generalized AMOVA (GAMOVA), requires a genetic similarity matrix constructed from the allelic profiles of individuals under study and/or allele frequency summaries of the populations from which the individuals have been sampled. The proposed strategy can be used to either estimate the fraction of genetic variation explained by grouping factors such as country of origin, race, or ethnicity, or to quantify the strength of the relationship of the observed genetic background variation to quantitative measures collected on the subjects, such as blood pressure levels or anthropometric measures. Since the formulation of our test statistic is rooted in multivariate linear models, sets of variables can be related to genetic background in multiple regression-like contexts. GAMOVA can also be used to complement graphical representations of genetic diversity such as tree diagrams (dendrograms) or heatmaps. 
We examine features, advantages, and power of the proposed procedure and showcase its flexibility by using it to analyze a wide variety of published data sets, including data from the Human Genome Diversity Project, classical anthropometry data collected by Howells, and the International HapMap Project.
Genetic analysis of hybridization and introgression between wild mongoose and brown lemurs.
BACKGROUND: Hybrid zones generally represent areas of secondary contact after speciation. The nature of the interaction between genes of individuals in a hybrid zone is of interest in the study of evolutionary processes. In this study, data from nuclear microsatellites and mitochondrial DNA sequences were used to genetically characterize hybridization between wild mongoose lemurs (Eulemur mongoz) and brown lemurs (E. fulvus) at Anjamena in west Madagascar. RESULTS: Two segments of mtDNA have been sequenced and 12 microsatellite loci screened in 162 brown lemurs and mongoose lemurs. Among the mongoose lemur population at Anjamena, we identified two F1 hybrids (one also having the mtDNA haplotype of E. fulvus) and six other individuals with putative introgressed alleles in their genotype. Principal component analysis groups both hybrids as intermediate between E. mongoz and E. fulvus and admixture analyses revealed an admixed genotype for both animals. Paternity testing proved one F1 hybrid to be fertile. Of the eight brown lemurs genotyped, all have either putative introgressed microsatellite alleles and/or the mtDNA haplotype of E. mongoz. CONCLUSION: Introgression is bidirectional for the two species, with an indication that it is more frequent in brown lemurs than in mongoose lemurs. We conclude that this hybridization occurs because mongoose lemurs have expanded their range relatively recently. Introgressive hybridization may play an important role in the unique lemur radiation, as has already been shown in other rapidly evolving animals.
RIGHTS: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license which is similar to the 'Creative Commons Attribution Licence'. In brief, you may: copy, distribute, and display the work; make derivative works; or make commercial use of the work, under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are.
Region Extraction and Verification for Spatial and Spatio-temporal Databases
Newer spatial technologies, such as spatio-temporal databases, geo-sensor networks, and other remote sensing methods, require mechanisms to efficiently process spatial data and identify (and in some cases fix) data items that do not conform to rigorously defined spatial data type definitions. In this paper, we propose an O(n lg n) time complexity algorithm that examines a spatial configuration, eliminates any portions of the configuration that violate the definition of spatial regions, and constructs a valid region out of the remaining configuration.
A new common functional coding variant at the DDC gene change renal enzyme activity and modify renal dopamine function.
The intra-renal dopamine (DA) system is highly expressed in the proximal tubule and contributes to Na+ and blood pressure homeostasis, as well as to the development of nephropathy. In the kidney, the enzyme DOPA Decarboxylase (DDC) converts L-DOPA originating from the circulation into DA. We used a twin/family study design, followed by polymorphism association analysis at the DDC locus, to elucidate heritable influences on renal DA production. Dense single nucleotide polymorphism (SNP) genotyping across the DDC locus on chromosome 7p12 was analyzed by re-sequencing guided by trait-associated genetic markers to discover the responsible genetic variation. We also characterized kinetics of the expressed DDC mutant enzyme. Systematic polymorphism screening across the 15-Exon DDC locus revealed a single coding variant in Exon-14 that was associated with DA excretion and multiple other renal traits, indicating pleiotropy. When expressed and characterized in eukaryotic cells, the 462Gln variant displayed lower Vmax (maximal rate of product formation by an enzyme; 21.3 versus 44.9 nmol/min/mg) and lower Km (substrate concentration at which half-maximal product formation is achieved by an enzyme; 36.2 versus 46.8 μM) than the wild-type (Arg462) allele. The highly heritable DA excretion trait is substantially influenced by a previously uncharacterized common coding variant (Arg462Gln) at the DDC gene that affects multiple renal tubular and glomerular traits, and predicts accelerated functional decline in chronic kidney disease.
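To make the reported kinetic constants concrete, the following minimal sketch evaluates the standard Michaelis-Menten rate law, v = Vmax [S] / (Km + [S]), for both alleles. It assumes the enzyme follows simple Michaelis-Menten kinetics, which the abstract's definitions of Vmax and Km suggest but do not state; the function names and the substrate concentrations are hypothetical and chosen only for illustration.

```python
def michaelis_menten_rate(vmax, km, substrate):
    """Rate v = Vmax * [S] / (Km + [S]); assumes simple Michaelis-Menten kinetics."""
    return vmax * substrate / (km + substrate)


# Kinetic constants as reported in the abstract (Vmax in nmol/min/mg, Km in uM).
VARIANTS = {
    "Arg462 (wild-type)": {"vmax": 44.9, "km": 46.8},
    "462Gln": {"vmax": 21.3, "km": 36.2},
}


def rates_at(substrate_um):
    """Return the predicted rate of each allele at one substrate concentration (uM)."""
    return {
        name: michaelis_menten_rate(p["vmax"], p["km"], substrate_um)
        for name, p in VARIANTS.items()
    }


if __name__ == "__main__":
    # Hypothetical substrate concentrations, not taken from the paper.
    for s in (10.0, 50.0, 500.0):
        for name, rate in rates_at(s).items():
            print(f"[S] = {s:6.1f} uM  {name:20s} v = {rate:6.2f} nmol/min/mg")
```

Under this assumption the 462Gln variant is slower at every substrate concentration: its lower Km only partly offsets its roughly halved Vmax.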
Moving Beyond Noninformative Priors: Why and How to Choose Weakly Informative Priors in Bayesian Analyses
Throughout the last two decades, Bayesian statistical methods have proliferated throughout ecology and evolution. Numerous previous references established both philosophical and computational guidelines for implementing Bayesian methods. However, protocols for incorporating prior information, the defining characteristic of Bayesian philosophy, are nearly nonexistent in the ecological literature. Here, I hope to encourage the use of weakly informative priors in ecology and evolution by providing a ‘consumer’s guide’ to weakly informative priors. The first section outlines three reasons why ecologists should abandon noninformative priors: 1) common flat priors are not always noninformative, 2) noninformative priors provide the same result as simpler frequentist methods, and 3) noninformative priors suffer from the same high type I and type M error rates as frequentist methods. The second section provides a guide for implementing informative priors, wherein I detail convenient ‘reference’ prior distributions for common statistical models (i.e. regression, ANOVA, hierarchical models). I then use simulations to visually demonstrate how informative priors influence posterior parameter estimates. With the guidelines provided here, I hope to encourage the use of weakly informative priors for Bayesian analyses in ecology. Ecologists can and should debate the appropriate form of prior information, but should consider weakly informative priors as the new ‘default’ prior for any Bayesian model.
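The influence of a prior on a posterior estimate can be illustrated with the textbook conjugate update for a normal mean with known observation standard deviation. This is a minimal sketch and not the author's simulations: the function name, the sample values, and the choice of Normal(0, 1) as the "weakly informative" prior and Normal(0, 1000) as the near-flat prior are all assumptions made for illustration.

```python
import math


def normal_posterior(sample_mean, n, sigma, prior_mean, prior_sd):
    """Posterior mean and sd of a normal mean with known sigma and a normal prior.

    Precisions (inverse variances) add, and the posterior mean is the
    precision-weighted average of the prior mean and the sample mean.
    """
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / sigma ** 2
    post_precision = prior_precision + data_precision
    post_mean = (prior_precision * prior_mean
                 + data_precision * sample_mean) / post_precision
    return post_mean, math.sqrt(1.0 / post_precision)


if __name__ == "__main__":
    # Hypothetical small data set: 5 observations, mean 2.0, known sd 2.0.
    data = dict(sample_mean=2.0, n=5, sigma=2.0)
    flat = normal_posterior(**data, prior_mean=0.0, prior_sd=1000.0)
    weak = normal_posterior(**data, prior_mean=0.0, prior_sd=1.0)
    print("near-flat prior:         mean=%.3f sd=%.3f" % flat)
    print("weakly informative prior: mean=%.3f sd=%.3f" % weak)
```

With the near-flat prior the posterior mean essentially reproduces the sample mean, which matches the abstract's point that noninformative priors give the same result as frequentist estimates. The tighter prior pulls the estimate toward the prior mean and narrows the posterior.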