325 research outputs found

    A classification-based framework for predicting and analyzing gene regulatory response

    BACKGROUND: We have recently introduced a predictive framework for studying gene transcriptional regulation in simpler organisms using a novel supervised learning algorithm called GeneClass. GeneClass is motivated by the hypothesis that in model organisms such as Saccharomyces cerevisiae, we can learn a decision rule for predicting whether a gene is up- or down-regulated in a particular microarray experiment based on the presence of binding site subsequences ("motifs") in the gene's regulatory region and the expression levels of regulators such as transcription factors in the experiment ("parents"). GeneClass formulates the learning task as a classification problem: predicting +1 and -1 labels corresponding to up- and down-regulation beyond the levels of biological and measurement noise in microarray measurements. Using the AdaBoost algorithm, GeneClass learns a prediction function in the form of an alternating decision tree, a margin-based generalization of a decision tree. METHODS: In the current work, we introduce a new, robust version of the GeneClass algorithm that increases stability and computational efficiency, yielding a more scalable and reliable predictive model. The improved stability of the prediction tree enables us to introduce a detailed post-processing framework for biological interpretation, including individual and group target gene analysis to reveal condition-specific regulation programs and to suggest signaling pathways. Robust GeneClass uses a novel stabilized variant of boosting that allows a set of correlated features, rather than single features, to be included at nodes of the tree; in this way, biologically important features that are correlated with the single best feature are retained rather than decorrelated and lost in the next round of boosting. Other computational developments include fast matrix computation of the loss function for all features, allowing scalability to large datasets, and the use of abstaining weak rules, which results in a shallower and more interpretable tree. We also show how to incorporate genome-wide protein-DNA binding data from ChIP-chip experiments into the GeneClass algorithm, and we use an improved noise model for gene expression data. RESULTS: Using the improved scalability of Robust GeneClass, we present larger scale experiments on a yeast environmental stress dataset, training and testing on all genes and using a comprehensive set of potential regulators. We demonstrate the improved stability of the features in the learned prediction tree, and we show the utility of the post-processing framework by analyzing two groups of genes in yeast (the protein chaperones and a set of putative targets of the Nrg1 and Nrg2 transcription factors) and suggesting novel hypotheses about their transcriptional and post-transcriptional regulation. Detailed results and Robust GeneClass source code are available for download from

    Reconstruction of Network Evolutionary History from Extant Network Topology and Duplication History

    Genome-wide protein-protein interaction (PPI) data are readily available thanks to recent breakthroughs in biotechnology. However, the PPI networks of extant organisms are only snapshots of network evolution. Inferring the entire evolutionary history is a challenging problem in computational biology. In this paper, we present a likelihood-based approach to inferring network evolution history from the topology of PPI networks and the duplication relationships among the paralogs. Simulations show that our approach outperforms existing ones in terms of reconstruction accuracy. Moreover, the growth parameters of several real PPI networks estimated by our method are more consistent with those predicted in the literature. Comment: 15 pages, 5 figures, submitted to ISBRA 201

    Network Archaeology: Uncovering Ancient Networks from Present-day Interactions

    Questions often arise about old or extinct networks. What proteins interacted in a long-extinct ancestor species of yeast? Who were the central players in the Last.fm social network 3 years ago? Our ability to answer such questions has been limited by the unavailability of past versions of networks. To overcome these limitations, we propose several algorithms for reconstructing a network's history of growth given only the network as it exists today and a generative model by which the network is believed to have evolved. Our likelihood-based method finds a probable previous state of the network by reversing the forward growth model. This approach retains node identities so that the history of individual nodes can be tracked. We apply these algorithms to uncover older, non-extant biological and social networks believed to have grown via several models, including duplication-mutation with complementarity, forest fire, and preferential attachment. Through experiments on both synthetic and real-world data, we find that our algorithms can estimate node arrival times, identify anchor nodes from which new nodes copy links, and reveal significant features of networks that have long since disappeared. Comment: 16 pages, 10 figure

    Propiedades puzolánicas de desechos de la industria azucarera (segunda parte)

    This paper presents the results of studies on pastes made with a lime-pozzolana binder, using sugar cane bagasse ash (CBC) and sugar cane straw ash (CPC) as pozzolans. The studies characterize the hydration reaction that develops in this type of binder. The evaluation, carried out with several instrumental techniques, made it possible to study and understand the kinetics of the reaction.

    Automatic Network Fingerprinting through Single-Node Motifs

    Complex networks have been characterised by their specific connectivity patterns (network motifs), but their building blocks can also be identified and described by node-motifs, that is, combinations of local network features. One technique to identify single node-motifs has been presented by Costa et al. (L. D. F. Costa, F. A. Rodrigues, C. C. Hilgetag, and M. Kaiser, Europhys. Lett., 87, 1, 2009). Here, we first suggest improvements to the method, including how its parameters can be determined automatically. Such automatic routines make high-throughput studies of many networks feasible. Second, the new routines are validated on different network series. Third, we provide an example of how the method can be used to analyse network time series. In conclusion, we provide a robust method for systematically discovering and classifying characteristic nodes of a network. In contrast to classical motif analysis, our approach can identify individual components (here: nodes) that are specific to a network. Such special nodes, like hubs before them, might be found to play critical roles in real-world networks. Comment: 16 pages (4 figures) plus supporting information 8 pages (5 figures

    Adaptive structure tensors and their applications

    The structure tensor, also known as the second moment matrix or Förstner interest operator, is a very popular tool in image processing. Its purpose is the estimation of orientation and, more generally, the local analysis of structure. It is based on the integration of data from a local neighborhood. Normally, this neighborhood is defined by a Gaussian window function, and the structure tensor is computed as the weighted sum within this window. Some recently proposed methods, however, adapt the computation of the structure tensor to the image data, and there are several ways to do so. This article gives an overview of the different approaches, with a focus on the methods based on robust statistics and nonlinear diffusion. Furthermore, the data-adaptive structure tensors are evaluated in some applications. Here the main focus lies on optic flow estimation, but texture analysis and corner detection are also considered
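For reference, the classical (non-adaptive) construction described above can be written as follows. This is a sketch in commonly used notation rather than the article's own: the symbols $u$ (grayscale image), $K_\rho$ (Gaussian window with standard deviation $\rho$), and $*$ (convolution) are assumptions introduced here for illustration.

```latex
J_\rho(\nabla u)
  = K_\rho * \left( \nabla u \, \nabla u^{\top} \right)
  = \begin{pmatrix}
      K_\rho * u_x^2       & K_\rho * (u_x u_y) \\
      K_\rho * (u_x u_y)   & K_\rho * u_y^2
    \end{pmatrix}
```

The eigenvector belonging to the larger eigenvalue of $J_\rho$ gives the dominant local gradient direction. Data-adaptive variants of the kind the abstract mentions replace the fixed Gaussian window $K_\rho$ with a neighborhood that depends on the image data.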

    Modeling and verifying a broad array of network properties

    Motivated by widely observed examples in nature, society and software, where groups of already related nodes arrive together and attach to an existing network, we consider network growth via sequential attachment of linked node groups, or graphlets. We analyze the simplest case, attachment of the three-node V-graphlet, where, with probability alpha, we attach a peripheral node of the graphlet, and with probability (1-alpha), we attach the central node. Our analytical results and simulations show that tuning alpha produces a wide range of degree distributions and degree assortativities, achieving assortativity values that capture a diverse set of real-world systems. We introduce a fifteen-dimensional attribute vector derived from seven well-known network properties, which enables comprehensive comparison between any two networks. Principal Component Analysis (PCA) of this attribute vector space shows that a simple extension of the above model covers a significantly larger range of real-world network properties than a classic model of network growth. Comment: To appear in Europhysics Letter
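The V-graphlet attachment rule described above can be illustrated with a minimal simulation. This is only a sketch under stated assumptions, not the authors' implementation: the abstract does not say how the existing node is chosen, so a uniformly random target is assumed here, and the function names `grow_v_graphlets` and `degrees` are hypothetical.

```python
import random


def grow_v_graphlets(steps, alpha, seed=None):
    """Grow a network by attaching three-node V-graphlets one at a time.

    Each V-graphlet is a centre node linked to two peripheral nodes.
    With probability alpha a peripheral node is linked to the existing
    network; otherwise the centre node is linked.
    ASSUMPTION: the existing target node is chosen uniformly at random.
    """
    rng = random.Random(seed)
    # Seed network: a single V-graphlet (centre 0, peripherals 1 and 2).
    edges = [(0, 1), (0, 2)]
    n = 3
    for _ in range(steps):
        target = rng.randrange(n)  # assumed uniform choice of existing node
        centre, p1, p2 = n, n + 1, n + 2
        # Internal links of the arriving graphlet.
        edges.append((centre, p1))
        edges.append((centre, p2))
        # Choose which graphlet node attaches to the network.
        anchor = p1 if rng.random() < alpha else centre
        edges.append((anchor, target))
        n += 3
    return n, edges


def degrees(n, edges):
    """Return the degree of every node as a list indexed by node id."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


if __name__ == "__main__":
    n, edges = grow_v_graphlets(steps=1000, alpha=0.3, seed=42)
    deg = degrees(n, edges)
    print("nodes:", n, "edges:", len(edges), "max degree:", max(deg))
```

Sweeping `alpha` between 0 and 1 and comparing the resulting degree lists is one simple way to see how the attachment rule shifts the degree distribution.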

    Calibration of the Logarithmic-Periodic Dipole Antenna (LPDA) Radio Stations at the Pierre Auger Observatory using an Octocopter

    An in-situ calibration of a logarithmic periodic dipole antenna with a frequency coverage of 30 MHz to 80 MHz is performed. Such antennas are part of a radio station system used for detection of cosmic ray induced air showers at the Engineering Radio Array of the Pierre Auger Observatory, the so-called Auger Engineering Radio Array (AERA). The directional and frequency characteristics of the broadband antenna are investigated using a remotely piloted aircraft (RPA) carrying a small transmitting antenna. The antenna sensitivity is described by the vector effective length relating the measured voltage to the electric-field components perpendicular to the incoming signal direction. The horizontal and meridional components are determined with an overall uncertainty of $7.4^{+0.9}_{-0.3}$ % and $10.3^{+2.8}_{-1.7}$ % respectively. The measurement is used to correct the simulated frequency and directional response of the antenna. In addition, the influence of the ground conductivity and permittivity on the antenna response is simulated. Both have a negligible influence given the ground conditions measured at the detector site. The overall uncertainties of the vector effective length components result in an uncertainty of $8.8^{+2.1}_{-1.3}$ % in the square root of the energy fluence for incoming signal directions with zenith angles smaller than $60^\circ$. Comment: Published version. Updated online abstract only. Manuscript is unchanged with respect to v2. 39 pages, 15 figures, 2 table

    Multi-resolution anisotropy studies of ultrahigh-energy cosmic rays detected at the Pierre Auger Observatory

    We report a multi-resolution search for anisotropies in the arrival directions of cosmic rays detected at the Pierre Auger Observatory with local zenith angles up to $80^\circ$ and energies in excess of 4 EeV ($4 \times 10^{18}$ eV). This search is conducted by measuring the angular power spectrum and performing a needlet wavelet analysis in two independent energy ranges. Both analyses are complementary since the angular power spectrum achieves a better performance in identifying large-scale patterns while the needlet wavelet analysis, considering the parameters used in this work, presents a higher efficiency in detecting smaller-scale anisotropies, potentially providing directional information on any observed anisotropies. No deviation from isotropy is observed on any angular scale in the energy range between 4 and 8 EeV. Above 8 EeV, an indication for a dipole moment is captured, while no other deviation from isotropy is observed for moments beyond the dipole one. The corresponding $p$-values, obtained after accounting for searches blindly performed at several angular scales, are $1.3 \times 10^{-5}$ in the case of the angular power spectrum, and $2.5 \times 10^{-3}$ in the case of the needlet analysis. While these results are consistent with previous reports making use of the same data set, they extend the previous works through thorough scans of the angular scales. Comment: Published version. Added journal reference and DOI. Added Report Numbe