325 research outputs found
A classification-based framework for predicting and analyzing gene regulatory response
BACKGROUND: We have recently introduced a predictive framework for studying gene transcriptional regulation in simpler organisms using a novel supervised learning algorithm called GeneClass. GeneClass is motivated by the hypothesis that in model organisms such as Saccharomyces cerevisiae, we can learn a decision rule for predicting whether a gene is up- or down-regulated in a particular microarray experiment based on the presence of binding site subsequences ("motifs") in the gene's regulatory region and the expression levels of regulators such as transcription factors in the experiment ("parents"). GeneClass formulates the learning task as a classification problem — predicting +1 and -1 labels corresponding to up- and down-regulation beyond the levels of biological and measurement noise in microarray measurements. Using the Adaboost algorithm, GeneClass learns a prediction function in the form of an alternating decision tree, a margin-based generalization of a decision tree. METHODS: In the current work, we introduce a new, robust version of the GeneClass algorithm that increases stability and computational efficiency, yielding a more scalable and reliable predictive model. The improved stability of the prediction tree enables us to introduce a detailed post-processing framework for biological interpretation, including individual and group target gene analysis to reveal condition-specific regulation programs and to suggest signaling pathways. Robust GeneClass uses a novel stabilized variant of boosting that allows a set of correlated features, rather than single features, to be included at nodes of the tree; in this way, biologically important features that are correlated with the single best feature are retained rather than decorrelated and lost in the next round of boosting. 
Other computational developments include fast matrix computation of the loss function for all features, allowing scalability to large datasets, and the use of abstaining weak rules, which results in a shallower and more interpretable tree. We also show how to incorporate genome-wide protein-DNA binding data from ChIP-chip experiments into the GeneClass algorithm, and we use an improved noise model for gene expression data. RESULTS: Using the improved scalability of Robust GeneClass, we present larger-scale experiments on a yeast environmental stress dataset, training and testing on all genes and using a comprehensive set of potential regulators. We demonstrate the improved stability of the features in the learned prediction tree, and we show the utility of the post-processing framework by analyzing two groups of genes in yeast (the protein chaperones and a set of putative targets of the Nrg1 and Nrg2 transcription factors) and suggesting novel hypotheses about their transcriptional and post-transcriptional regulation. Detailed results and Robust GeneClass source code are available for download from
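The abstract above mentions boosting with abstaining weak rules and a matrix computation of the loss for all features at once. The following is a minimal sketch of that general idea using confidence-rated boosting on binary features. It is not the GeneClass implementation: GeneClass learns an alternating decision tree, whereas this sketch learns a flat list of rules, and the function names `boost_abstaining` and `predict` as well as the smoothing constant `eps` are assumptions made for illustration.

```python
import numpy as np

def boost_abstaining(X, y, rounds=10, eps=1e-6):
    """Boost rules that fire where X[:, j] == 1 and abstain elsewhere.

    X: (n, d) 0/1 feature matrix (hypothetical motif/regulator indicators).
    y: (n,) labels in {-1, +1} (up- or down-regulation).
    Returns a list of (feature index, weight) pairs.
    """
    n, d = X.shape
    w = np.full(n, 1.0 / n)                  # example weights, sum to 1
    model = []
    for _ in range(rounds):
        # One matrix product per class scores every feature at once.
        w_pos = (w * (y == 1)) @ X           # weight of +1 examples each rule fires on
        w_neg = (w * (y == -1)) @ X          # weight of -1 examples each rule fires on
        w_abs = 1.0 - w_pos - w_neg          # weight on which each rule abstains
        z = w_abs + 2.0 * np.sqrt(w_pos * w_neg)   # boosting loss per feature
        j = int(np.argmin(z))
        a = 0.5 * np.log((w_pos[j] + eps) / (w_neg[j] + eps))
        model.append((j, a))
        w = w * np.exp(-a * y * X[:, j])     # abstentions leave weights unchanged
        w /= w.sum()
    return model

def predict(model, X):
    """Sign of the summed rule votes; 0 means every rule abstained."""
    score = sum(a * X[:, j] for j, a in model)
    return np.sign(score)
```

A rule that abstains contributes nothing to the score, so each selected rule only has to be accurate on the examples it fires on.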
Reconstruction of Network Evolutionary History from Extant Network Topology and Duplication History
Genome-wide protein-protein interaction (PPI) data are readily available
thanks to recent breakthroughs in biotechnology. However, PPI networks of
extant organisms are only snapshots of the network evolution. How to infer the
whole evolution history becomes a challenging problem in computational biology.
In this paper, we present a likelihood-based approach to inferring network
evolution history from the topology of PPI networks and the duplication
relationship among the paralogs. Simulations show that our approach outperforms
the existing ones in terms of the accuracy of reconstruction. Moreover, the
growth parameters of several real PPI networks estimated by our method are more
consistent with the ones predicted in the literature. Comment: 15 pages, 5 figures, submitted to ISBRA 201
Network Archaeology: Uncovering Ancient Networks from Present-day Interactions
Often questions arise about old or extinct networks. What proteins interacted
in a long-extinct ancestor species of yeast? Who were the central players in
the Last.fm social network 3 years ago? Our ability to answer such questions
has been limited by the unavailability of past versions of networks. To
overcome these limitations, we propose several algorithms for reconstructing a
network's history of growth given only the network as it exists today and a
generative model by which the network is believed to have evolved. Our
likelihood-based method finds a probable previous state of the network by
reversing the forward growth model. This approach retains node identities so
that the history of individual nodes can be tracked. We apply these algorithms
to uncover older, non-extant biological and social networks believed to have
grown via several models, including duplication-mutation with complementarity,
forest fire, and preferential attachment. Through experiments on both synthetic
and real-world data, we find that our algorithms can estimate node arrival
times, identify anchor nodes from which new nodes copy links, and can reveal
significant features of networks that have long since disappeared. Comment: 16 pages, 10 figures
Pozzolanic properties of sugar industry wastes (second part)
This paper presents the results of studies on pastes made with a lime-pozzolana binder, using sugar cane bagasse ash (CBC) and sugar cane straw ash (CPC) as pozzolans. The hydration reaction of this binder was monitored with several instrumental techniques in order to characterize it and to describe and understand its kinetics.
Automatic Network Fingerprinting through Single-Node Motifs
Complex networks have been characterised by their specific connectivity
patterns (network motifs), but their building blocks can also be identified and
described by node-motifs---a combination of local network features. One
technique to identify single node-motifs has been presented by Costa et al. (L.
D. F. Costa, F. A. Rodrigues, C. C. Hilgetag, and M. Kaiser, Europhys. Lett.,
87, 1, 2009). Here, we first suggest improvements to the method including how
its parameters can be determined automatically. Such automatic routines make
high-throughput studies of many networks feasible. Second, the new routines are
validated in different network-series. Third, we provide an example of how the
method can be used to analyse network time-series. In conclusion, we provide a
robust method for systematically discovering and classifying characteristic
nodes of a network. In contrast to classical motif analysis, our approach can
identify individual components (here: nodes) that are specific to a network.
Such special nodes, as hubs before, might be found to play critical roles in
real-world networks. Comment: 16 pages (4 figures) plus supporting information 8 pages (5 figures)
Adaptive structure tensors and their applications
The structure tensor, also known as the second-moment matrix or Förstner interest operator, is a very popular tool in image processing. Its purpose is the estimation of orientation and, more generally, the local analysis of structure. It is based on the integration of data from a local neighborhood. Normally, this neighborhood is defined by a Gaussian window function, and the structure tensor is computed as the weighted sum within this window. Some recently proposed methods, however, adapt the computation of the structure tensor to the image data, and there are several ways to do so. This article gives an overview of the different approaches, focusing on the methods based on robust statistics and nonlinear diffusion. Furthermore, the data-adaptive structure tensors are evaluated in some applications. Here the main focus lies on optic flow estimation, but texture analysis and corner detection are also considered.
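As a point of reference for the abstract above, the classical (non-adaptive) structure tensor with a Gaussian window can be sketched in a few lines of NumPy. This is only the baseline that the data-adaptive variants modify, not any of the adaptive methods themselves. The function names, the 3-sigma kernel radius, and the edge-replicating padding are assumptions chosen for illustration.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D Gaussian kernel, truncated at about 3 sigma and normalised to sum to 1."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def smooth(img, sigma):
    """Separable Gaussian smoothing with edge-replicating padding."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    padded = np.pad(img, r, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def structure_tensor(img, sigma=2.0):
    """Return the components (Jxx, Jxy, Jyy) of the Gaussian-windowed tensor."""
    gy, gx = np.gradient(img.astype(float))   # derivatives along rows, then columns
    return smooth(gx * gx, sigma), smooth(gx * gy, sigma), smooth(gy * gy, sigma)

def orientation(jxx, jxy, jyy):
    """Angle (radians) of the dominant eigenvector, i.e. the main gradient direction."""
    return 0.5 * np.arctan2(2.0 * jxy, jxx - jyy)
```

For an image that only varies along x, the tensor reduces to a single nonzero component Jxx and the estimated orientation is zero.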
Modeling and verifying a broad array of network properties
Motivated by widely observed examples in nature, society and software, where
groups of already related nodes arrive together and attach to an existing
network, we consider network growth via sequential attachment of linked node
groups, or graphlets. We analyze the simplest case, attachment of the three
node V-graphlet, where, with probability alpha, we attach a peripheral node of
the graphlet, and with probability (1-alpha), we attach the central node. Our
analytical results and simulations show that tuning alpha produces a wide range
in degree distribution and degree assortativity, achieving assortativity values
that capture a diverse set of many real-world systems. We introduce a
fifteen-dimensional attribute vector derived from seven well-known network
properties, which enables comprehensive comparison between any two networks.
Principal Component Analysis (PCA) of this attribute vector space shows a
significantly larger coverage potential of real-world network properties by a
simple extension of the above model when compared against a classic model of
network growth. Comment: To appear in Europhysics Letters
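The V-graphlet growth rule described in the abstract above can be simulated with a short sketch. The abstract does not say how the attachment target in the existing network is chosen, so this sketch assumes a uniformly random existing node. The seed network (a single V-graphlet) and the function name `grow_v_graphlets` are likewise assumptions.

```python
import random

def grow_v_graphlets(n_graphlets, alpha, seed=None):
    """Grow a network by attaching three-node V-graphlets one at a time.

    With probability alpha a peripheral node of the new graphlet is linked
    to the network; otherwise its central node is linked.
    Returns an adjacency dict {node: set of neighbours}.
    """
    rng = random.Random(seed)
    # Assumed seed network: one V-graphlet 0 - 1 - 2 with node 1 central.
    adj = {0: {1}, 1: {0, 2}, 2: {1}}
    for _ in range(n_graphlets):
        # Assumption: the target is a uniformly random existing node.
        target = rng.randrange(len(adj))
        base = len(adj)
        a, c, b = base, base + 1, base + 2    # peripheral, central, peripheral
        adj[a], adj[c], adj[b] = {c}, {a, b}, {c}
        anchor = a if rng.random() < alpha else c
        adj[anchor].add(target)
        adj[target].add(anchor)
    return adj

# Usage: degree sequence of a network grown with alpha = 0.3
network = grow_v_graphlets(1000, alpha=0.3, seed=42)
degrees = sorted((len(nbrs) for nbrs in network.values()), reverse=True)
```

Each step adds three nodes and three edges, so the result is always a tree under these assumptions. Changing `alpha` shifts whether low-degree peripheral nodes or degree-two central nodes make contact with the existing network.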
Calibration of the Logarithmic-Periodic Dipole Antenna (LPDA) Radio Stations at the Pierre Auger Observatory using an Octocopter
An in-situ calibration of a logarithmic periodic dipole antenna with a
frequency coverage of 30 MHz to 80 MHz is performed. Such antennas are part of
a radio station system used for detection of cosmic ray induced air showers at
the Engineering Radio Array of the Pierre Auger Observatory, the so-called
Auger Engineering Radio Array (AERA). The directional and frequency
characteristics of the broadband antenna are investigated using a remotely
piloted aircraft (RPA) carrying a small transmitting antenna. The antenna
sensitivity is described by the vector effective length relating the measured
voltage with the electric-field components perpendicular to the incoming signal
direction. The horizontal and meridional components are determined with an
overall uncertainty of 7.4^{+0.9}_{-0.3} % and 10.3^{+2.8}_{-1.7} %
respectively. The measurement is used to correct a simulated response of the
frequency and directional response of the antenna. In addition, the influence
of the ground conductivity and permittivity on the antenna response is
simulated. Both have a negligible influence given the ground conditions
measured at the detector site. The overall uncertainties of the vector
effective length components result in an uncertainty of 8.8^{+2.1}_{-1.3} % in
the square root of the energy fluence for incoming signal directions with
zenith angles smaller than 60°. Comment: Published version. Updated online abstract only. Manuscript is
unchanged with respect to v2. 39 pages, 15 figures, 2 tables
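The relation between measured voltage and electric field described in the abstract above is commonly written in the frequency domain as follows. The symbols are assumptions for illustration and do not come from the abstract: $\mathcal{U}$ is the measured voltage, $\vec{H}$ the vector effective length, $\vec{E}$ the incoming electric field, with $H_\phi$ the horizontal and $H_\theta$ the meridional component.

```latex
\mathcal{U}(f) = \vec{H}(f, \theta, \phi) \cdot \vec{E}(f)
               = H_\theta(f, \theta, \phi)\, E_\theta(f) + H_\phi(f, \theta, \phi)\, E_\phi(f)
```

Only the two field components perpendicular to the incoming signal direction appear, which is why two components of $\vec{H}$ suffice to describe the antenna response.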
Multi-resolution anisotropy studies of ultrahigh-energy cosmic rays detected at the Pierre Auger Observatory
We report a multi-resolution search for anisotropies in the arrival
directions of cosmic rays detected at the Pierre Auger Observatory with local
zenith angles up to and energies in excess of 4 EeV ($4 \times 10^{18}$ eV). This search is conducted by measuring the angular power spectrum
and performing a needlet wavelet analysis in two independent energy ranges.
Both analyses are complementary since the angular power spectrum achieves a
better performance in identifying large-scale patterns while the needlet
wavelet analysis, considering the parameters used in this work, presents a
higher efficiency in detecting smaller-scale anisotropies, potentially
providing directional information on any observed anisotropies. No deviation
from isotropy is observed on any angular scale in the energy range between 4
and 8 EeV. Above 8 EeV, an indication for a dipole moment is captured; while no
other deviation from isotropy is observed for moments beyond the dipole one.
The corresponding $p$-values obtained after accounting for searches blindly
performed at several angular scales, are in the case of
the angular power spectrum, and in the case of the needlet
analysis. While these results are consistent with previous reports making use
of the same data set, they provide extensions of the previous works through the
thorough scans of the angular scales. Comment: Published version. Added journal reference and DOI. Added Report
Number
- …