Psoriasis prediction from genome-wide SNP profiles
Background: With the availability of large-scale genome-wide association study (GWAS) data, choosing an optimal set of single nucleotide polymorphisms (SNPs) for disease susceptibility prediction is a challenging task. This study aimed to predict psoriasis from SNPs selected by searching GWAS data. Methods: The dataset comprised 2,798 samples and 451,724 SNPs. The search for a set of SNPs to predict susceptibility to psoriasis consisted of two steps. The first was to find the top 1,000 SNPs with high accuracy for predicting psoriasis in the GWAS dataset. The second was to search for an optimal SNP subset for predicting psoriasis. The sequential information bottleneck (sIB) method was compared with classical linear discriminant analysis (LDA) for classification performance. Results: The best test harmonic mean of sensitivity and specificity for predicting psoriasis by sIB was 0.674 (95% CI: 0.650-0.698), while only 0.520 (95% CI: 0.472-0.524) was reported for predicting disease by LDA. Our results indicate that the new classifier sIB performs better than LDA in this study. Conclusions: The fact that a small set of SNPs can predict disease status with an average accuracy of 68% makes it possible to use SNP data for psoriasis prediction.
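For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a harmonic mean of sensitivity and specificity can be computed from confusion-matrix counts. The function names and the counts in the usage note are hypothetical and chosen for illustration; they are not taken from the paper, and this is not the authors' code.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN), specificity = TN / (TN + FP).
    """
    return tp / (tp + fn), tn / (tn + fp)


def harmonic_mean(sensitivity: float, specificity: float) -> float:
    """Harmonic mean of two rates; taken to be 0.0 when both rates are zero."""
    total = sensitivity + specificity
    if total == 0:
        return 0.0
    return 2.0 * sensitivity * specificity / total
```

With hypothetical counts such as `tp=70, fn=30, tn=65, fp=35`, `sensitivity_specificity` gives 0.70 and 0.65, and `harmonic_mean` combines them into a single score that is pulled toward the lower of the two rates.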
On reminder effects, drop-outs and dominance: evidence from an online experiment on charitable giving
We present the results of an experiment that (a) shows the usefulness of screening out drop-outs and (b) tests whether different methods of payment and reminder intervals affect charitable giving. Following a lab session, participants could make online donations to charity for a total duration of three months. Our procedure justifying the exclusion of drop-outs consists in requiring participants to collect payments in person, on a flexible schedule that was known to them in advance and highlighted to them again later. Our interpretation is that participants who failed to collect their positive payments under these circumstances are likely not to satisfy dominance. If we restrict the sample to subjects who did not drop out, but not otherwise, reminders significantly increase the overall amount of charitable giving. We also find that weekly reminders are no more effective than monthly reminders in increasing charitable giving, and that, in our three-month experiment, standing orders do not increase giving relative to one-off donations
Systems biology via redescription and ontologies (I): finding phase changes with applications to malaria temporal data
Biological systems are complex and often composed of many subtly interacting components. Furthermore, such systems evolve through time and, as the underlying biology executes its genetic program, the relationships between components change and undergo dynamic reorganization. Characterizing these relationships precisely is a challenging task, but one that must be undertaken if we are to understand these systems in sufficient detail. One set of tools that may prove useful is the formal principles of model building and checking, which could allow the biologist to frame these inherently temporal questions in a sufficiently rigorous framework. In response to these challenges, GOALIE (Gene ontology algorithmic logic and information extractor) was developed and has been successfully employed in the analysis of high-throughput biological data (e.g. time-course gene-expression microarray data and neural spike train recordings). The method has applications to a wide variety of temporal data, indeed any data for which there exist ontological descriptions. This paper describes the algorithms behind GOALIE and its use in the study of the Intraerythrocytic Developmental Cycle (IDC) of Plasmodium falciparum, the parasite responsible for a deadly form of chloroquine-resistant malaria. We focus in particular on the problem of finding phase changes, times of reorganization of transcriptional control
Ballistic matter waves with angular momentum: Exact solutions and applications
An alternative description of quantum scattering processes rests on
inhomogeneous terms amended to the Schroedinger equation. We detail the
structure of sources that give rise to multipole scattering waves of definite
angular momentum, and introduce pointlike multipole sources as their limiting
case. Partial wave theory is recovered for freely propagating particles. We
obtain novel results for ballistic scattering in an external uniform force
field, where we provide analytical solutions for both the scattering waves and
the integrated particle flux. Our theory directly applies to p-wave
photodetachment in an electric field. Furthermore, illustrating the effects of
extended sources, we predict some properties of vortex-bearing atom laser beams
outcoupled from a rotating Bose-Einstein condensate under the influence of
gravity. Comment: 42 pages, 8 figures, extended version including photodetachment and
semiclassical theor
The structure of the PapD-PapGII pilin complex reveals an open and flexible P5 pocket
P pili are hairlike polymeric structures that mediate binding of uropathogenic Escherichia coli to the surface of the kidney via the PapG adhesin at their tips. PapG is composed of two domains: a lectin domain at the tip of the pilus followed by a pilin domain that comprises the initial polymerizing subunit of the 1,000-plus-subunit heteropolymeric pilus fiber. Prior to assembly, periplasmic pilin domains bind to a chaperone, PapD. PapD mediates donor strand complementation, in which a beta strand of PapD temporarily completes the pilin domain's fold, preventing premature, nonproductive interactions with other pilin subunits and facilitating subunit folding. Chaperone-subunit complexes are delivered to the outer membrane usher where donor strand exchange (DSE) replaces PapD's donated beta strand with an amino-terminal extension on the next incoming pilin subunit. This occurs via a zip-in-zip-out mechanism that initiates at a relatively accessible hydrophobic space termed the P5 pocket on the terminally incorporated pilus subunit. Here, we solve the structure of PapD in complex with the pilin domain of isoform II of PapG (PapGIIp). Our data revealed that PapGIIp adopts an immunoglobulin fold with a missing seventh strand, complemented in parallel by the G1 PapD strand, typical of pilin subunits. Comparisons with other chaperone-pilin complexes indicated that the interactive surfaces are highly conserved. Interestingly, the PapGIIp P5 pocket was in an open conformation, which, as molecular dynamics simulations revealed, switches between an open and a closed conformation due to the flexibility of the surrounding loops. Our study reveals the structural details of the DSE mechanism
Machine Learning in Automated Text Categorization
The automated categorization (or classification) of texts into predefined
categories has witnessed a booming interest in the last ten years, due to the
increased availability of documents in digital form and the ensuing need to
organize them. In the research community the dominant approach to this problem
is based on machine learning techniques: a general inductive process
automatically builds a classifier by learning, from a set of preclassified
documents, the characteristics of the categories. The advantages of this
approach over the knowledge engineering approach (consisting in the manual
definition of a classifier by domain experts) are a very good effectiveness,
considerable savings in terms of expert manpower, and straightforward
portability to different domains. This survey discusses the main approaches to
text categorization that fall within the machine learning paradigm. We will
discuss in detail issues pertaining to three different problems, namely
document representation, classifier construction, and classifier evaluation. Comment: Accepted for publication on ACM Computing Survey
Information transmission in genetic regulatory networks: a review
Genetic regulatory networks enable cells to respond to the changes in
internal and external conditions by dynamically coordinating their gene
expression profiles. Our ability to make quantitative measurements in these
biochemical circuits has deepened our understanding of what kinds of
computations genetic regulatory networks can perform and with what reliability.
These advances have motivated researchers to look for connections between the
architecture and function of genetic regulatory networks. Transmitting
information between a network's inputs and its outputs has been proposed as one
such possible measure of function, relevant in certain biological contexts.
Here we summarize recent developments in the application of information theory
to gene regulatory networks. We first review basic concepts in information
theory necessary to understand recent work. We then discuss the functional
complexity of gene regulation which arises from the molecular nature of the
regulatory interactions. We end by reviewing some experiments supporting the
view that genetic networks responsible for early development of multicellular
organisms might be maximizing transmitted 'positional' information. Comment: Submitted to J Phys: Condens Matter, 31 page
Evolution of Resistance to Targeted Anti-Cancer Therapies during Continuous and Pulsed Administration Strategies
The discovery of small molecules targeted to specific oncogenic pathways has revolutionized anti-cancer therapy. However, such therapy often fails due to the evolution of acquired resistance. One long-standing question in clinical cancer research is the identification of optimum therapeutic administration strategies so that the risk of resistance is minimized. In this paper, we investigate optimal drug dosing schedules to prevent, or at least delay, the emergence of resistance. We design and analyze a stochastic mathematical model describing the evolutionary dynamics of a tumor cell population during therapy. We consider drug resistance emerging due to a single (epi)genetic alteration and calculate the probability of resistance arising during specific dosing strategies. We then optimize treatment protocols such that the risk of resistance is minimal while considering drug toxicity and side effects as constraints. Our methodology can be used to identify optimum drug administration schedules to avoid resistance conferred by one (epi)genetic alteration for any cancer and treatment type
A unified data representation theory for network visualization, ordering and coarse-graining
Representation of large data sets has become a key question in many scientific
disciplines in the last decade. Several approaches for network visualization,
data ordering and coarse-graining accomplished this goal. However, there was no
underlying theoretical framework linking these problems. Here we show an
elegant, information theoretic data representation approach as a unified
solution of network visualization, data ordering and coarse-graining. The
optimal representation is the hardest to distinguish from the original data
matrix, measured by the relative entropy. The representation of network nodes
as probability distributions provides an efficient visualization method and, in
one dimension, an ordering of network nodes and edges. Coarse-grained
representations of the input network enable both efficient data compression and
hierarchical visualization to achieve high quality representations of larger
data sets. Our unified data representation theory will help the analysis of
extensive data sets, by revealing the large-scale structure of complex networks
in a comprehensible form. Comment: 13 pages, 5 figure
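The abstract above measures representation quality by relative entropy. As a minimal sketch of that quantity only, assuming discrete distributions given as equal-length lists of probabilities, the relative entropy D(P||Q) = sum_i p_i log(p_i / q_i) can be computed as follows. The function name is hypothetical, and this is not the authors' representation algorithm.

```python
import math


def relative_entropy(p: list[float], q: list[float]) -> float:
    """Relative entropy (Kullback-Leibler divergence) D(P||Q) in nats.

    Terms with p_i == 0 contribute nothing; if some p_i > 0 while
    q_i == 0, the divergence is infinite.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # 0 * log(0 / q) is taken as 0 by convention
        if qi == 0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total
```

The value is zero when the two distributions coincide and grows as Q becomes easier to distinguish from P, which matches the abstract's description of the optimal representation as the one hardest to distinguish from the original data.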