
    A fast algorithm for detecting gene-gene interactions in genome-wide association studies

    With the recent advent of high-throughput genotyping techniques, genetic data for genome-wide association studies (GWAS) have become increasingly available, which entails the development of efficient and effective statistical approaches. Although many such approaches have been developed and used to identify single-nucleotide polymorphisms (SNPs) that are associated with complex traits or diseases, few are able to detect gene-gene interactions among different SNPs. Genetic interactions, also known as epistasis, have been recognized to play a pivotal role in contributing to the genetic variation of phenotypic traits. However, because of an extremely large number of SNP-SNP combinations in GWAS, the model dimensionality can quickly become so overwhelming that no prevailing variable selection methods are capable of handling this problem. In this paper, we present a statistical framework for characterizing main genetic effects and epistatic interactions in a GWAS study. Specifically, we first propose a two-stage sure independence screening (TS-SIS) procedure and generate a pool of candidate SNPs and interactions, which serve as predictors to explain and predict the phenotypes of a complex trait. We also propose a rates adjusted thresholding estimation (RATE) approach to determine the size of the reduced model selected by an independence screening. Regularization regression methods, such as LASSO or SCAD, are then applied to further identify important genetic effects. Simulation studies show that the TS-SIS procedure is computationally efficient and has an outstanding finite sample performance in selecting potential SNPs as well as gene-gene interactions. We apply the proposed framework to analyze an ultrahigh-dimensional GWAS data set from the Framingham Heart Study, and select 23 active SNPs and 24 active epistatic interactions for the body mass index variation. 
It shows the capability of our procedure to resolve the complexity of genetic control. Comment: Published at http://dx.doi.org/10.1214/14-AOAS771 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)
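    The screening idea in this abstract can be illustrated with a generic sketch. It is not the authors' TS-SIS or RATE procedure: it assumes a plain marginal-correlation screen for main effects, followed by a second screen over pairwise products of the retained SNPs. The function names (`abs_corr`, `screen`, `two_stage_screen`) and the fixed cut-offs `keep_main` and `keep_inter` are hypothetical.

```python
import math
from itertools import combinations


def abs_corr(x, y):
    """Absolute Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no marginal signal
    return abs(sxy) / math.sqrt(sxx * syy)


def screen(columns, y, keep):
    """Return the `keep` column names with the largest |corr| with y."""
    ranked = sorted(columns, key=lambda name: abs_corr(columns[name], y),
                    reverse=True)
    return ranked[:keep]


def two_stage_screen(snps, y, keep_main, keep_inter):
    """Stage 1 screens main effects; stage 2 screens pairwise products
    formed only among the SNPs retained in stage 1 (an assumption here)."""
    main = screen(snps, y, keep_main)
    products = {
        (a, b): [u * v for u, v in zip(snps[a], snps[b])]
        for a, b in combinations(main, 2)
    }
    inter = screen(products, y, keep_inter)
    return main, inter
```

    Restricting the second stage to pairs of retained SNPs keeps the number of candidate interactions at roughly `keep_main` squared rather than the square of the full SNP count. The reduced pool would then go to a penalized regression such as LASSO or SCAD, as the abstract describes.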

    A broad-coverage distributed connectionist model of visual word recognition

    In this study we describe a distributed connectionist model of morphological processing, covering a realistically sized sample of the English language. The purpose of this model is to explore how effects of discrete, hierarchically structured morphological paradigms can arise as a result of the statistical sub-regularities in the mapping between word forms and word meanings. We present a model that learns to produce at its output a realistic semantic representation of a word, on presentation of a distributed representation of its orthography. After training, in three experiments, we compare the outputs of the model with the lexical decision latencies for large sets of English nouns and verbs. We show that the model has developed detailed representations of morphological structure, giving rise to effects analogous to those observed in visual lexical decision experiments. In addition, we show how the association between word form and word meaning also gives rise to recently reported differences between regular and irregular verbs, even in their completely regular present-tense forms. We interpret these results as underlining the key importance for lexical processing of the statistical regularities in the mappings between form and meaning.

    Ionic behavior assessment of surface-active compounds from corn steep liquor by exchange resins

    Depending on their ionic nature, biosurfactants can be classified as nonionic, anionic, cationic, or amphoteric. The ionic behavior of biosurfactants is an important characteristic that dictates their use in industrial applications. In this work, a biosurfactant extract obtained from corn steep liquor was subjected to anionic or cationic resins in order to study its ionic behavior under different operational conditions using response surface methodology. The independent variables were the dilution of the biosurfactant solution, the amount of cationic or anionic resin, and the extraction time, whereas the dependent variable was the surface tension of the biosurfactant aqueous solution after contact with the anionic or cationic resin. The results showed that the biosurfactant extracted from corn steep liquor is amphoteric, since both resins were able to entrap it, making it particularly suited for use in personal care preparations for sensitive skin.

    The Omega Counter, a Frequency Counter Based on the Linear Regression

    This article introduces the Ω counter, a frequency counter (or a frequency-to-digital converter, in a different jargon) based on the Linear Regression (LR) algorithm on time stamps. We discuss the noise of the electronics. We derive the statistical properties of the Ω counter on a rigorous mathematical basis, including the weighted measure and the frequency response. We describe an implementation based on a SoC, under test in our laboratory, and we compare the Ω counter to the traditional Π and Λ counters. The LR exhibits optimum rejection of white phase noise, superior to that of the Π and Λ counters. White noise is the major practical problem of wideband digital electronics, both in the instrument internal circuits and in the fast processes which we may want to measure. The Ω counter finds a natural application in the measurement of the Parabolic Variance, described in the companion article arXiv:1506.00687 [physics.data-an]. Comment: 8 pages, 6 figures, 2 tables
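    The principle of estimating frequency by linear regression on time stamps can be sketched as follows. This is a minimal illustration, not the authors' SoC implementation: it assumes one time stamp per input cycle and takes the least-squares slope of cycle index against time as the frequency. The function names `omega_frequency` and `pi_frequency` are hypothetical, and `pi_frequency` is a simple endpoint estimate in the style of a Π counter, included for comparison.

```python
def pi_frequency(timestamps):
    """Endpoint estimate: elapsed cycles divided by elapsed time."""
    n = len(timestamps)
    if n < 2:
        raise ValueError("need at least two time stamps")
    return (n - 1) / (timestamps[-1] - timestamps[0])


def omega_frequency(timestamps):
    """Least-squares slope of cycle index k versus time stamp t_k.

    With t in seconds the slope is in cycles per second (Hz).
    """
    n = len(timestamps)
    if n < 2:
        raise ValueError("need at least two time stamps")
    mean_t = sum(timestamps) / n
    mean_k = (n - 1) / 2  # mean of the cycle indices 0, 1, ..., n-1
    num = sum((t - mean_t) * (k - mean_k) for k, t in enumerate(timestamps))
    den = sum((t - mean_t) ** 2 for t in timestamps)
    return num / den
```

    The endpoint estimate uses only the first and last time stamps, while the regression uses all of them, which is the intuition behind the improved rejection of white phase noise reported in the abstract.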

    Investment efficiency and audit fee from the perspective of the role of financial distress

    Purpose: The aim of the article is to present the author’s methodological proposal in the field of management and development planning, taking into account the opinions of the commune inhabitants. Design/Methodology/Approach: The statistical population of the study included all companies listed on the Tehran Stock Exchange. After sampling, 141 companies were studied using data from 2011 to 2018 and the multiple regression method. Findings: The results show that there was a significant relationship between investment efficiency and audit fee, and financial distress had a significant effect on the relationship between investment efficiency and audit fee. Practical Implications: The managers working in Iran have greater confidence than firms to use auditors who receive lower audit fees, and the companies in a climate of financial distress have overconfident managers. Originality/Value: Since no empirical research has been conducted to study the aforementioned variables in Iran, the present study is innovative in this respect. The results are also applicable to other underdeveloped countries in the Middle East.

    Liquidity commonality does not imply liquidity resilience commonality: A functional characterisation for ultra-high frequency cross-sectional LOB data

    We present a large-scale study of commonality in liquidity and resilience across assets in an ultra high-frequency (millisecond-timestamped) Limit Order Book (LOB) dataset from a pan-European electronic equity trading facility. We first show that extant work in quantifying liquidity commonality through the degree of explanatory power of the dominant modes of variation of liquidity (extracted through Principal Component Analysis) fails to account for heavy-tailed features in the data, thus producing potentially misleading results. We employ Independent Component Analysis, which not only decorrelates the liquidity measures in the asset cross-section but also reduces higher-order statistical dependencies. To measure commonality in liquidity resilience, we utilise a novel characterisation as the time required for return to a threshold liquidity level. This reflects a dimension of liquidity that is not captured by the majority of liquidity measures and has important ramifications for understanding supply and demand pressures for market makers in electronic exchanges, as well as regulators and HFTs. When the metric is mapped out across a range of thresholds, it produces the daily Liquidity Resilience Profile (LRP) for a given asset. This daily summary of liquidity resilience behaviour from the vast LOB dataset is then amenable to a functional data representation. This enables the comparison of liquidity resilience in the asset cross-section via functional linear sub-space decompositions and functional regression. The functional regression results presented here suggest that market factors for liquidity resilience (as extracted through functional principal components analysis) can explain between 10 and 40% of the variation in liquidity resilience at low liquidity thresholds, but are less explanatory at more extreme levels, where individual asset factors take effect.
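    The resilience measure described above, the time required to return to a threshold liquidity level, can be sketched roughly as follows. The abstract does not give the exact definition, so this makes several assumptions: liquidity is proxied by the bid-ask spread, an episode starts when the spread rises above the threshold and ends when it first returns to or below it, and the profile reports the mean episode duration per threshold. The names `exceedance_durations` and `resilience_profile` are hypothetical.

```python
def exceedance_durations(times, spreads, threshold):
    """Durations of episodes in which the spread stays above `threshold`."""
    durations = []
    start = None
    for t, s in zip(times, spreads):
        if s > threshold and start is None:
            start = t  # liquidity has just deteriorated past the threshold
        elif s <= threshold and start is not None:
            durations.append(t - start)  # spread is back at the threshold level
            start = None
    return durations  # an episode still open at the end of the data is ignored


def resilience_profile(times, spreads, thresholds):
    """Mean return time per threshold (None if no completed episode)."""
    profile = {}
    for c in thresholds:
        d = exceedance_durations(times, spreads, c)
        profile[c] = sum(d) / len(d) if d else None
    return profile
```

    Evaluating the profile on a grid of thresholds gives one curve per asset and day, which is the kind of object that a functional data representation would then smooth and compare across assets.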

    The host genotype affects the bacterial community in the human gastrointestinal tract

    The gastrointestinal (GI) tract is one of the most complex ecosystems consisting of microbial and host cells. It is suggested that the host genotype, the physiology of the host and environmental factors affect the composition and function of the bacterial community in the intestine. However, the relative impact of these factors is unknown. In this study, we used a culture-independent approach to analyze the bacterial composition in the GI tract. Denaturing gradient gel electrophoresis (DGGE) profiles of fecal bacterial 16S rDNA amplicons from adult humans with varying degrees of genetic relatedness were compared by determining the similarity indices of the profiles. The similarity between fecal DGGE profiles of monozygotic twins was significantly higher than that for unrelated individuals (ts = 2.73, p1-tail = 0.0063, df = 21). In addition, a positive relationship (F1, 30 = 8.63, p = 0.0063) between the similarity indices and the genetic relatedness of the hosts was observed. In contrast, fecal DGGE profiles of marital partners, who live in the same environment and have comparable feeding habits, showed low similarity, which was not significantly different from that of unrelated individuals (ts = 1.03, p1-tail = 0.1561, df = 27). Our data indicate that factors related to the host genotype have an important effect on determining the bacterial composition in the GI tract.