
    Fast Association Tests for Genes with FAST

    Gene-based tests of association can increase the power of a genome-wide association study by aggregating multiple independent effects across a gene or locus into a single stronger signal. Recent gene-based tests have distinct approaches to selecting which variants to aggregate within a locus, modeling the effects of linkage disequilibrium, representing fractional allele counts from imputation, and managing permutation tests for p-values. Implementing these tests in a single, efficient framework has great practical value. Fast ASsociation Tests (Fast) addresses this need by implementing leading gene-based association tests together with conventional SNP-based univariate tests and providing a consolidated, easily interpreted report. Fast scales readily to genome-wide SNP data with millions of SNPs and tens of thousands of individuals, provides implementations that are orders of magnitude faster than original literature reports, and provides a unified framework for performing several gene-based association tests concurrently and efficiently on the same data. Availability: https://bitbucket.org/baderlab/fast/downloads/FAST.tar.gz, with documentation at https://bitbucket.org/baderlab/fast/wiki/Hom

    Benchmark Comparison of Cloud Analytics Methods Applied to Earth Observations

    Earth Observation data are a vital resource for studying long term changes, but the large data volumes can be challenging to analyze. Time series analysis in particular is hampered by the typical thin-time-slice file organization. We examine several potential solutions inspired in large part by the data-parallel methods that have arisen with cloud computing. These solutions include various combinations of data re-organization, spatial indexing, distributed storage and pre-computation that we term "Analytics Optimized Data Stores" (AODS). We find that even simple solutions (such as a data cube) produce more than an order of magnitude improvement; the best provide two to three orders of magnitude improvement. The most performant solutions have tradeoffs in terms of generality or storage footprint, but may nonetheless be useful components in data analytics frameworks where performance is critical

    Meta-analysis of GWAS of over 16,000 individuals with autism spectrum disorder highlights a novel locus at 10q24.32 and a significant overlap with schizophrenia

    Background Over the past decade genome-wide association studies (GWAS) have been applied to aid in the understanding of the biology of traits. The success of this approach is governed by the underlying effect sizes carried by the true risk variants and the corresponding statistical power to observe such effects given the study design and sample size under investigation. Previous ASD GWAS have identified genome-wide significant (GWS) risk loci; however, these studies had only low statistical power to identify GWS loci at the lower effect sizes (odds ratio (OR) <1.15). Methods We conducted a large-scale coordinated international collaboration to combine independent genotyping data to improve the statistical power and aid in robust discovery of GWS loci. This study uses genome-wide genotyping data from a discovery sample (7387 ASD cases and 8567 controls) followed by meta-analysis of summary statistics from two replication sets (7783 ASD cases and 11359 controls; and 1369 ASD cases and 137308 controls). Results We observe a GWS locus at 10q24.32 that overlaps several genes including PITX3, which encodes a transcription factor identified as playing a role in neuronal differentiation, and CUEDC2, previously reported to be associated with social skills in an independent population cohort. We also observe overlap with regions previously implicated in schizophrenia, which was further supported by a strong genetic correlation between these disorders (Rg = 0.23; P = 9 × 10⁻⁶). We further combined these Psychiatric Genomics Consortium (PGC) ASD GWAS data with the recent PGC schizophrenia GWAS to identify additional regions which may be important in a common neurodevelopmental phenotype and identified 12 novel GWS loci. These include loci previously implicated in ASD such as FOXP1 at 3p13, ATP2B2 at 3p25.3, and a ‘neurodevelopmental hub’ on chromosome 8p11.23. 
Conclusions This study is an important step in the ongoing endeavour to identify the loci which underpin the common variant signal in ASD. In addition to novel GWS loci, we have identified a significant genetic correlation with schizophrenia and association of ASD with several neurodevelopmental-related genes such as EXT1, ASTN2, MACROD2, and HDAC4

    Microbial production of long-chain n-alkanes: Implication for interpreting sedimentary leaf wax signals

    Relative distributions as well as compound-specific carbon and hydrogen isotope ratios of long-chain C-25 to C-33 n-alkanes in sediments provide important paleoclimate and paleoenvironmental information. These compounds in aquatic sediments are generally attributed to leaf waxes produced by higher plants. However, whether microbes, such as fungi and bacteria, can make a significant contribution to sedimentary long-chain n-alkanes is uncertain, with only scattered reports in the early 1960s to 1970s that microbes can produce long-chain n-alkanes. Given the rapidly expanding importance of leaf waxes in paleoclimate and paleoenvironmental studies, the impact of microbial contribution to long-chain n-alkanes in sediments must be fully addressed. In this study, we performed laboratory incubation of peat-land soils under both anaerobic and aerobic conditions in the absence of light with deuterium-enriched water over 1.5 years and analyzed compound-specific hydrogen isotopic ratios of n-alkanes. Under aerobic conditions, we find n-alkanes of different chain length display variable degrees of hydrogen isotopic enrichments, with short-chain (C-18-C-21) n-alkanes showing the greatest enrichment, followed by long-chain "leaf wax" (C-27-C-31) n-alkanes, and minimal or no enrichment for mid-chain (C-22-C-25) n-alkanes. In contrast, only the shorter chain (C-18 and C-19) n-alkanes display appreciable isotopic enrichment under anaerobic conditions. The degrees of isotopic enrichment for individual n-alkanes allow for a quantitative assessment of microbial contributions to n-alkanes. Overall our results show the microbial contribution to long-chain n-alkanes can reach up to 0.1% per year in aerobic conditions. For shorter chain n-alkanes, up to 2.5% per year could be produced by microbes in aerobic and anaerobic conditions respectively. 
Our results indicate that prolonged exposure to aerobic conditions can lead to substantial accumulation of microbially derived long-chain n-alkanes in sediments while original n-alkanes of leaf wax origin are degraded; hence caution must be exercised when interpreting sedimentary records of long-chain n-alkanes, including chain length distributions and isotopic ratios. (c) 2017 Elsevier Ltd. All rights reserved

    IDEA: Interactive DoublE Attentions from Label Embedding for Text Classification

    Current text classification methods typically encode the text merely into an embedding before a naive or complicated classifier, which ignores the suggestive information contained in the label text. As a matter of fact, humans classify documents primarily based on the semantic meaning of the subcategories. We propose a novel model structure via siamese BERT and interactive double attentions named IDEA (Interactive DoublE Attentions) to capture the information exchange of text and label names. Interactive double attentions enable the model to exploit the inter-class and intra-class information from coarse to fine, which involves distinguishing among all labels and matching the semantic subclasses of ground truth labels. Our proposed method significantly outperforms the state-of-the-art methods that use label texts, with more stable results. Comment: Accepted by ICTAI202

    The Impact of Gamification Design on the Success of Health and Fitness Apps

    Gamification has been increasingly employed in health-related apps in recent years. However, the effect of gamification design on the success of health and fitness apps remains unknown and has not been investigated before. This study attempts to identify what gamification elements are frequently used in the design of health and fitness apps and to empirically quantify their effects on app downloads and user ratings of these apps. We construct a rich dataset that includes information about the daily downloads, ratings and gamification design elements of 2,462 health and fitness apps on the Apple App Store. Our sample contains 924 paid apps and 1,538 free apps. This study contributes to both the gamification and mobile app literatures and provides important implications for app developers who intend to adopt gamification in mobile app design

    A comprehensive census of microbial diversity in hot springs of Tengchong, Yunnan Province China using 16S rRNA gene pyrosequencing

    The Rehai and Ruidian geothermal fields, located in Tengchong County, Yunnan Province, China, host a variety of geochemically distinct hot springs. In this study, we report a comprehensive, cultivation-independent census of microbial communities in 37 samples collected from these geothermal fields, encompassing sites ranging in temperature from 55.1 to 93.6°C, in pH from 2.5 to 9.4, and in mineralogy from silicates in Rehai to carbonates in Ruidian. Richness was low in all samples, with 21–123 species-level OTUs detected. The bacterial phylum Aquificae or archaeal phylum Crenarchaeota were dominant in Rehai samples, yet the dominant taxa within those phyla depended on temperature, pH, and geochemistry. Rehai springs with low pH (2.5–2.6), high temperature (85.1–89.1°C), and high sulfur contents favored the crenarchaeal order Sulfolobales, whereas those with low pH (2.6–4.8) and cooler temperature (55.1–64.5°C) favored the Aquificae genus Hydrogenobaculum. Rehai springs with neutral-alkaline pH (7.2–9.4) and high temperature (>80°C) with high concentrations of silica and salt ions (Na, K, and Cl) favored the Aquificae genus Hydrogenobacter and crenarchaeal orders Desulfurococcales and Thermoproteales. Desulfurococcales and Thermoproteales became predominant in springs with pH much higher than the optimum and even the maximum pH known for these orders. Ruidian water samples harbored a single Aquificae genus Hydrogenobacter, whereas microbial communities in Ruidian sediment samples were more diverse at the phylum level and distinctly different from those in Rehai and Ruidian water samples, with a higher abundance of uncultivated lineages, close relatives of the ammonia-oxidizing archaeon “Candidatus Nitrosocaldus yellowstonii”, and candidate division O1aA90 and OP1. These differences between Ruidian sediments and Rehai samples were likely caused by temperature, pH, and sediment mineralogy. 
The results of this study significantly expand the current understanding of the microbiology in Tengchong hot springs and provide a basis for comparison with other geothermal systems around the world