A fast algorithm for detecting gene-gene interactions in genome-wide association studies
With the recent advent of high-throughput genotyping techniques, genetic data
for genome-wide association studies (GWAS) have become increasingly available,
which entails the development of efficient and effective statistical
approaches. Although many such approaches have been developed and used to
identify single-nucleotide polymorphisms (SNPs) that are associated with
complex traits or diseases, few are able to detect gene-gene interactions among
different SNPs. Genetic interactions, also known as epistasis, have been
recognized to play a pivotal role in contributing to the genetic variation of
phenotypic traits. However, because of an extremely large number of SNP-SNP
combinations in GWAS, the model dimensionality can quickly become so
overwhelming that no prevailing variable selection methods are capable of
handling this problem. In this paper, we present a statistical framework for
characterizing main genetic effects and epistatic interactions in a GWAS study.
Specifically, we first propose a two-stage sure independence screening (TS-SIS)
procedure and generate a pool of candidate SNPs and interactions, which serve
as predictors to explain and predict the phenotypes of a complex trait. We also
propose a rates adjusted thresholding estimation (RATE) approach to determine
the size of the reduced model selected by an independence screening.
Regularization regression methods, such as LASSO or SCAD, are then applied to
further identify important genetic effects. Simulation studies show that the
TS-SIS procedure is computationally efficient and has an outstanding finite
sample performance in selecting potential SNPs as well as gene-gene
interactions. We apply the proposed framework to analyze an
ultrahigh-dimensional GWAS data set from the Framingham Heart Study, and select
23 active SNPs and 24 active epistatic interactions for the body mass index
variation. It shows the capability of our procedure to resolve the complexity
of genetic control. Comment: Published at http://dx.doi.org/10.1214/14-AOAS771 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
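The screening idea in this abstract can be illustrated with a minimal sketch. It is not the authors' TS-SIS or RATE procedure: the abstract does not give the screening statistic or the rule for forming candidate interactions, so the absolute marginal correlation, the restriction of pairs to first-stage survivors, and the fixed cut-offs `keep_main` and `keep_inter` are all assumptions made for illustration. A penalized regression such as LASSO or SCAD would then be fitted on the retained candidates.

```python
from itertools import combinations


def abs_corr(x, y):
    """Absolute Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no marginal signal
    return abs(sxy) / (sxx * syy) ** 0.5


def two_stage_screen(snps, y, keep_main, keep_inter):
    """Hypothetical two-stage screen.

    snps: list of columns, one list of genotype codes (0/1/2) per SNP.
    y:    list of phenotype values.
    Returns (indices of retained SNPs, retained SNP index pairs).
    """
    # Stage 1: rank single SNPs by marginal correlation with the phenotype.
    main_scores = {j: abs_corr(col, y) for j, col in enumerate(snps)}
    main = sorted(main_scores, key=main_scores.get, reverse=True)[:keep_main]

    # Stage 2 (assumption): rank pairwise products of the stage-1 survivors.
    inter_scores = {}
    for j, k in combinations(sorted(main), 2):
        product = [a * b for a, b in zip(snps[j], snps[k])]
        inter_scores[(j, k)] = abs_corr(product, y)
    inter = sorted(inter_scores, key=inter_scores.get, reverse=True)[:keep_inter]
    return main, inter
```

Restricting the second stage to first-stage survivors keeps the number of candidate pairs at keep_main * (keep_main - 1) / 2 instead of growing quadratically with the total number of SNPs, which is the dimensionality problem the abstract describes.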
A broad-coverage distributed connectionist model of visual word recognition
In this study we describe a distributed connectionist model of morphological processing, covering a realistically sized sample of the English language. The purpose of this model is to explore how effects of discrete, hierarchically structured morphological paradigms can arise as a result of the statistical sub-regularities in the mapping between word forms and word meanings. We present a model that learns to produce at its output a realistic semantic representation of a word, on presentation of a distributed representation of its orthography. After training, in three experiments, we compare the outputs of the model with the lexical decision latencies for large sets of English nouns and verbs. We show that the model has developed detailed representations of morphological structure, giving rise to effects analogous to those observed in visual lexical decision experiments. In addition, we show how the association between word form and word meaning also gives rise to recently reported differences between regular and irregular verbs, even in their completely regular present-tense forms. We interpret these results as underlining the key importance for lexical processing of the statistical regularities in the mappings between form and meaning.
Ionic behavior assessment of surface-active compounds from corn steep liquor by exchange resins
Depending on their ionic nature, biosurfactants can be classified as nonionic, anionic, cationic, or amphoteric. The ionic behavior of biosurfactants is an important characteristic that dictates their use in industrial applications. In this work, a biosurfactant extract obtained from corn steep liquor was subjected to anionic or cationic resins in order to study its ionic behavior under different operational conditions using response surface methodology. The independent variables were the dilution of the biosurfactant solution, the amount of cationic or anionic resin, and the extraction time, whereas the dependent variable was the surface tension of the aqueous biosurfactant solution after contact with the anionic or cationic resin. The results showed that the biosurfactant extracted from corn steep liquor is amphoteric, since both resins were able to entrap it, making it particularly suited for use in personal care preparations for sensitive skin. Peer Reviewed. Postprint (author's final draft).
The Omega Counter, a Frequency Counter Based on the Linear Regression
This article introduces the Ω counter, a frequency counter -- or a
frequency-to-digital converter, in a different jargon -- based on the Linear
Regression (LR) algorithm on time stamps. We discuss the noise of the
electronics. We derive the statistical properties of the Ω counter on
a rigorous mathematical basis, including the weighted measure and the frequency
response. We describe an implementation based on a SoC, under test in our
laboratory, and we compare the Ω counter to the traditional Π and
Λ counters. The LR exhibits optimum rejection of white phase noise,
superior to that of the Π and Λ counters. White noise is the major
practical problem of wideband digital electronics, both in the instrument
internal circuits and in the fast processes which we may want to measure. The
Ω counter finds a natural application in the measurement of the
Parabolic Variance, described in the companion article arXiv:1506.00687
[physics.data-an]. Comment: 8 pages, 6 figures, 2 tables
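As an illustration of frequency estimation by linear regression on time stamps, the sketch below fits a straight line to the time stamps of successive input cycles and takes the reciprocal of the slope. It is only a software analogy of the idea, not the SoC implementation the abstract mentions. The function names are hypothetical, and two modelling choices are assumptions: there is one time stamp per cycle, and the time stamp is regressed on the cycle index because the noise is taken to sit in the time stamps. The second function is the plain two-point estimate, which corresponds to the uniform averaging of a Π counter.

```python
def lr_frequency(timestamps):
    """Frequency (Hz) from a least-squares line through (cycle index, time stamp)."""
    n = len(timestamps)
    if n < 2:
        raise ValueError("need at least two time stamps")
    k_mean = (n - 1) / 2
    t_mean = sum(timestamps) / n
    # Slope of time stamp versus cycle index = estimated period in seconds.
    num = sum((k - k_mean) * (t - t_mean) for k, t in enumerate(timestamps))
    den = sum((k - k_mean) ** 2 for k in range(n))
    return 1.0 / (num / den)


def pi_frequency(timestamps):
    """Two-point estimate: cycles counted divided by elapsed time."""
    if len(timestamps) < 2:
        raise ValueError("need at least two time stamps")
    return (len(timestamps) - 1) / (timestamps[-1] - timestamps[0])
```

With independent noise on each of N time stamps, the variance of the least-squares slope falls off roughly as 1/N^3, whereas the two-point estimate only improves as 1/N^2, which is one way to see the white-noise rejection claimed in the abstract.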
Investment efficiency and audit fee from the perspective of the role of financial distress
Purpose: The aim of the article is to present the author's methodological proposal in the field of management and development planning, taking into account the opinions of the commune inhabitants. Design/Methodology/Approach: The statistical population of the study included all companies listed on the Tehran Stock Exchange. After sampling, 141 companies were studied with multiple regression on data from 2011 to 2018. Findings: The results show that there was a significant relationship between investment efficiency and audit fee, and that financial distress had a significant effect on this relationship. Practical Implications: Managers working in Iran have greater confidence than firms in using auditors who receive lower audit fees, and companies in a climate of financial distress have overconfident managers. Originality/Value: Since no empirical research has been conducted on the aforementioned variables in Iran, the present study is innovative in this respect. The results are also applicable to other underdeveloped countries in the Middle East. Peer-reviewed.
Liquidity commonality does not imply liquidity resilience commonality: A functional characterisation for ultra-high frequency cross-sectional LOB data
We present a large-scale study of commonality in liquidity and resilience
across assets in an ultra high-frequency (millisecond-timestamped) Limit Order
Book (LOB) dataset from a pan-European electronic equity trading facility. We
first show that extant work in quantifying liquidity commonality through the
degree of explanatory power of the dominant modes of variation of liquidity
(extracted through Principal Component Analysis) fails to account for heavy
tailed features in the data, thus producing potentially misleading results. We
employ Independent Component Analysis, which not only decorrelates the liquidity
measures in the asset cross-section, but also reduces higher-order statistical
dependencies.
To measure commonality in liquidity resilience, we utilise a novel
characterisation as the time required for return to a threshold liquidity
level. This reflects a dimension of liquidity that is not captured by the
majority of liquidity measures and has important ramifications for
understanding supply and demand pressures for market makers in electronic
exchanges, as well as regulators and HFTs. When the metric is mapped out across
a range of thresholds, it produces the daily Liquidity Resilience Profile (LRP)
for a given asset. This daily summary of liquidity resilience behaviour from
the vast LOB dataset is then amenable to a functional data representation. This
enables the comparison of liquidity resilience in the asset cross-section via
functional linear sub-space decompositions and functional regression. The
functional regression results presented here suggest that market factors for
liquidity resilience (as extracted through functional principal components
analysis) can explain between 10 and 40% of the variation in liquidity
resilience at low liquidity thresholds, but are less explanatory at more
extreme levels, where individual asset factors take effect.
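The resilience metric described above, the time required for liquidity to return to a threshold level, can be sketched as follows. This is only an illustration: the abstract does not specify the liquidity measure or how the durations are summarised, so using the bid-ask spread (where a higher value means lower liquidity), taking the mean duration per threshold, and the function names are all assumptions.

```python
def exceedance_durations(times, spread, threshold):
    """Durations of each excursion of the spread above the threshold."""
    durations = []
    start = None  # time at which the current excursion began, if any
    for t, s in zip(times, spread):
        if s > threshold and start is None:
            start = t                      # liquidity has deteriorated
        elif s <= threshold and start is not None:
            durations.append(t - start)    # liquidity has recovered
            start = None
    return durations                       # an unfinished excursion is dropped


def resilience_profile(times, spread, thresholds):
    """Mean recovery time per threshold, or None if it is never exceeded."""
    profile = {}
    for c in thresholds:
        d = exceedance_durations(times, spread, c)
        profile[c] = sum(d) / len(d) if d else None
    return profile
```

Evaluating the metric over a grid of thresholds gives one curve per asset and day, which is the kind of object a functional data representation can work with.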
The host genotype affects the bacterial community in the human gastrointestinal tract
The gastrointestinal (GI) tract is one of the most complex ecosystems, consisting of microbial and host cells. It is suggested that the host genotype, the physiology of the host and environmental factors affect the composition and function of the bacterial community in the intestine. However, the relative impact of these factors is unknown. In this study, we used a culture-independent approach to analyze the bacterial composition in the GI tract. Denaturing gradient gel electrophoresis (DGGE) profiles of fecal bacterial 16S rDNA amplicons from adult humans with varying degrees of genetic relatedness were compared by determining their similarity indices. The similarity between fecal DGGE profiles of monozygotic twins was significantly higher than that for unrelated individuals (ts = 2.73, p1-tail = 0.0063, df = 21). In addition, a positive relationship (F1,30 = 8.63, p = 0.0063) between the similarity indices and the genetic relatedness of the hosts was observed. In contrast, fecal DGGE profiles of marital partners, who live in the same environment and have comparable feeding habits, showed low similarity, which was not significantly different from that of unrelated individuals (ts = 1.03, p1-tail = 0.1561, df = 27). Our data indicate that factors related to the host genotype have an important effect on determining the bacterial composition in the GI tract.