51 research outputs found

    Adaptive main-memory indexing for high-performance point-polygon joins

    Connected mobility applications rely heavily on geospatial joins that associate point data, such as locations of Uber cars, to static polygonal regions, such as city neighborhoods. These joins typically involve expensive geometric computations, which makes it hard to provide an interactive user experience. In this paper, we propose an adaptive polygon index that leverages true hit filtering to avoid expensive geometric computations in most cases. In particular, our approach closely approximates polygons by combining quadtrees with true hit filtering, and stores these approximations in a query-efficient radix tree. Based on this index, we introduce two geospatial join algorithms: an approximate one that guarantees a user-defined precision, and an exact one that adapts to the expected point distribution. In summary, our technique outperforms existing CPU-based joins by up to two orders of magnitude and is competitive with state-of-the-art GPU implementations.

    Unsupervised protein embeddings outperform hand-crafted sequence and structure features at predicting molecular function

    This work was supported by Keygene N.V., a crop innovation company in the Netherlands, and by the Spanish MINECO/FEDER Project TEC201680141-P with the associated FPI grant BES-2017-079792. The authors thank Dr. Elvin Isufi and Chirag Raman for their valuable comments and feedback. Motivation: Protein function prediction is a difficult bioinformatics problem. Many recent methods use deep neural networks to learn complex sequence representations and predict function from these. Deep supervised models require a lot of labeled training data, which are not available for this task. However, a very large number of protein sequences without functional labels is available. Results: We applied an existing deep sequence model that had been pretrained in an unsupervised setting to the supervised task of protein molecular function prediction. We found that this complex feature representation is effective for this task, outperforming hand-crafted features such as one-hot encoding of amino acids, k-mer counts, secondary structure and backbone angles. Also, it partly negates the need for complex prediction models, as a two-layer perceptron was enough to achieve competitive performance in the third Critical Assessment of Functional Annotation benchmark. We also show that combining this sequence representation with protein 3D structure information does not lead to performance improvement, hinting that 3D structure is also potentially learned during the unsupervised pretraining.

    The Mantle Transition Zone Beneath West Antarctica: Seismic Evidence for Hydration and Thermal Upwellings

    Although prior work suggests that a mantle plume is associated with Cenozoic rifting and volcanism in West Antarctica, the existence of a plume remains conjectural. Here we use P wave receiver functions (PRFs) from the Antarctic POLENET array to estimate mantle transition zone thickness, which is sensitive to temperature perturbations, throughout previously unstudied parts of West Antarctica. We obtain over 8000 high-quality PRFs using an iterative, time domain deconvolution method filtered with a Gaussian width of 0.5 and 1.0, corresponding to frequencies less than ∼0.24 and ∼0.48 Hz, respectively. Single-station and common conversion point stacks, migrated to depth using the AK135 velocity model, indicate that mantle transition zone thickness throughout most of West Antarctica does not differ significantly from the global average, except in two locations: one small region exhibits a vertically thinned (210 ± 15 km) transition zone beneath the Ruppert Coast of Marie Byrd Land, and another laterally broader region shows slight vertical thinning (225 ± 25 km) beneath the Bentley Subglacial Trench. We also observe the 520 discontinuity and a prominent negative peak above the mantle transition zone throughout much of West Antarctica. These results suggest that the mantle transition zone may be hotter than average in two places, possibly due to upwelling from the lower mantle, but not broadly across West Antarctica. Furthermore, we propose that the transition zone may be hydrated due to >100 million years of subduction beneath the region during the early Mesozoic.

    Grambank reveals the importance of genealogical constraints on linguistic diversity and highlights the impact of language loss

    While global patterns of human genetic diversity are increasingly well characterized, the diversity of human languages remains less systematically described. Here we outline the Grambank database. With over 400,000 data points and 2,400 languages, Grambank is the largest comparative grammatical database available. The comprehensiveness of Grambank allows us to quantify the relative effects of genealogical inheritance and geographic proximity on the structural diversity of the world's languages, evaluate constraints on linguistic diversity, and identify the world's most unusual languages. An analysis of the consequences of language loss reveals that the reduction in diversity will be strikingly uneven across the major linguistic regions of the world. Without sustained efforts to document and revitalize endangered languages, our linguistic window into human history, cognition and culture will be seriously fragmented.

    Benchmarking learned indexes


    Neural Relational Inference for Interacting Systems

    Interacting systems are prevalent in nature, from dynamical systems in physics to complex societal dynamics. The interplay of components can give rise to complex behavior, which can often be explained using a simple model of the system’s constituent parts. In this work, we introduce the neural relational inference (NRI) model: an unsupervised model that learns to infer interactions while simultaneously learning the dynamics purely from observational data. Our model takes the form of a variational auto-encoder, in which the latent code represents the underlying interaction graph and the reconstruction is based on graph neural networks. In experiments on simulated physical systems, we show that our NRI model can accurately recover ground-truth interactions in an unsupervised manner. We further demonstrate that we can find an interpretable structure and predict complex dynamics in real motion capture and sports tracking data.