    spa: Semi-Supervised Semi-Parametric Graph-Based Estimation in R

    In this paper, we present an R package that combines feature-based (X) data and graph-based (G) data for prediction of the response Y. In this particular case, Y is observed for a subset of the observations (labeled) and missing for the remainder (unlabeled). We examine an approach for fitting Y = Xβ + f(G) where β is a coefficient vector and f is a function over the vertices of the graph. The procedure is semi-supervised in nature (trained on the labeled and unlabeled sets), requiring iterative algorithms for fitting this estimate. The package provides several key functions for fitting and evaluating an estimator of this type. The package is illustrated on a text analysis data set, where the observations are text documents (papers), the response is the category of paper (either applied or theoretical statistics), the X information is the name of the journal in which the paper resides, and the graph is a co-citation network, with each vertex an observation and each edge the number of times that the two papers cite a common paper. An application involving classification of protein location using a protein interaction graph and an application involving classification on a manifold with part of the feature data converted to a graph are also presented.
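
The graph component f(G) in the abstract above can be illustrated with a generic semi-supervised smoother. The sketch below is not the spa package's algorithm or API: it omits the parametric term entirely and uses plain iterative neighbor averaging with labeled vertices held fixed. The function name `propagate_labels` and the toy path graph are assumptions made for illustration.

```python
def propagate_labels(weights, labels, iterations=200):
    """Estimate a smooth function f over the vertices of a weighted graph.

    weights: symmetric adjacency matrix (list of lists) with zero diagonal
    labels:  dict mapping a labeled vertex index to its observed response
    """
    n = len(weights)
    # Start every unlabeled vertex at the mean of the observed responses.
    mean_label = sum(labels.values()) / len(labels)
    f = [labels.get(i, mean_label) for i in range(n)]
    for _ in range(iterations):
        for i in range(n):
            if i in labels:
                continue  # labeled vertices stay clamped to their response
            degree = sum(weights[i])
            if degree > 0:
                # Replace f[i] by the edge-weighted average of its neighbors.
                f[i] = sum(w * fj for w, fj in zip(weights[i], f)) / degree
    return f


# Toy example (hypothetical data): a path graph 0 - 1 - 2 - 3 where only
# the two end vertices are labeled.
path = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
f = propagate_labels(path, {0: 0.0, 3: 1.0})
```

On the path graph the unlabeled vertices settle between the two labeled endpoints, which shows how the unlabeled observations take part in the fit through the graph alone.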

    ada: An R Package for Stochastic Boosting

    Boosting is an iterative algorithm that combines simple classification rules with "mediocre" performance in terms of misclassification error rate to produce a highly accurate classification rule. Stochastic gradient boosting provides an enhancement which incorporates a random mechanism at each boosting step showing an improvement in performance and speed in generating the ensemble. ada is an R package that implements three popular variants of boosting, together with a version of stochastic gradient boosting. In addition, useful plots for data analytic purposes are provided along with an extension to the multi-class case. The algorithms are illustrated with synthetic and real data sets.
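
The idea of combining "mediocre" rules into an accurate one can be sketched with discrete AdaBoost over one-dimensional decision stumps. This is a minimal illustration and not the ada package's interface; it also leaves out the random subsampling step that stochastic gradient boosting adds. The names `fit_stump`, `adaboost`, and `predict`, and the toy data, are assumptions.

```python
import math


def stump_predict(x, threshold, polarity):
    """A stump predicts `polarity` at or above the threshold, else its negation."""
    return polarity if x >= threshold else -polarity


def fit_stump(xs, ys, weights):
    """Return (threshold, polarity, weighted_error) of the best stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[2]:
                best = (threshold, polarity, err)
    return best


def adaboost(xs, ys, rounds=3):
    """Discrete AdaBoost; labels in ys must be +1 or -1."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        threshold, polarity, err = fit_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, threshold, polarity))
        # Up-weight the points this stump got wrong, then renormalize.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(x, t, p) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1


# Toy data (hypothetical): the positive class is an interval, which no
# single stump can separate.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [-1, -1, 1, 1, 1, -1, -1, -1]
single = adaboost(xs, ys, rounds=1)
boosted = adaboost(xs, ys, rounds=3)
```

With one round the ensemble is a single stump and misclassifies part of the data; after three rounds the weighted vote of three stumps fits this toy training set.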

    Predicting whole genome protein interaction networks from primary sequence data in model and non-model organisms using ENTS

    Background: The large-scale identification of physical protein-protein interactions (PPIs) is an important step toward understanding how biological networks evolve and generate emergent phenotypes. However, experimental identification of PPIs is a laborious and error-prone process, and current methods of PPI prediction tend to be highly conservative or require large amounts of functional data that may not be available for newly-sequenced organisms. Results: In this study we demonstrate a random-forest based technique, ENTS, for the computational prediction of protein-protein interactions based only on primary sequence data. Our approach is able to efficiently predict interactions on a whole-genome scale for any eukaryotic organism, using pairwise combinations of conserved domains and predicted subcellular localization of proteins as input features. We present the first predicted interactome for the forest tree Populus trichocarpa in addition to the predicted interactomes for Saccharomyces cerevisiae, Homo sapiens, Mus musculus, and Arabidopsis thaliana. Comparing our approach to other PPI predictors, we find that ENTS performs comparably to or better than a number of existing approaches, including several that utilize a variety of functional information for their predictions. We also find that the predicted interactions are biologically meaningful, as indicated by similarity in functional annotations and enrichment of co-expressed genes in public microarray datasets. Furthermore, we demonstrate some of the biological insights that can be gained from these predicted interaction networks. We show that the predicted interactions yield informative groupings of P. trichocarpa metabolic pathways, literature-supported associations among human disease states, and theory-supported insight into the evolutionary dynamics of duplicated genes in paleopolyploid plants. Conclusion: We conclude that the ENTS classifier will be a valuable tool for the de novo annotation of genome sequences, providing initial clues about regulatory and metabolic network topology, and revealing relationships that are not immediately obvious from traditional homology-based annotations.

    Early parenting intervention aimed at maternal sensitivity and discipline: A process evaluation

    This study investigated the influence of the intervention process on the effectiveness of a program aimed at promoting positive parenting. The study involved a homogeneous intervention sample (N = 120) of mothers and their 1-, 2-, or 3-year-old children screened for high levels of externalizing problems. The alliance between mother and intervener, mothers' active skills implementation, and father involvement were examined in relation to changes in maternal sensitivity and positive discipline strategies. Results revealed that only alliance predicted change in positive parenting. Implications for future process evaluations and intervention programs are discussed. © 2008 Wiley Periodicals, Inc.

    Enohpoxas Tetrauq

    Cook Hall 212, Thursday Evening, April 25, 2002, 7:00 p.m.

    Early parenting intervention: Family risk and first-time parenting related to intervention effectiveness

    The effects of cumulative risk and parity on the effectiveness of a home-based parenting intervention were tested in a randomized controlled trial with 237 families with 1- to 3-year-old children screened for high levels of externalizing behavior. The intervention was aimed at enhancing positive parenting and decreasing externalizing behaviors. The results showed that cumulative risk was not associated with either change in child externalizing behaviors or change in positive parenting. When intervention effectiveness was compared for primiparas (i.e., first-time mothers) versus multiparas (i.e., mothers with more than one child), we found that intervention mothers of first-born children displayed an increase in their use of positive discipline strategies as compared to first-time mothers in the control group, whereas a similar effect for multiparas was absent. Among multiparas we found an intervention effect on sensitivity, with control group mothers showing an increase in sensitivity, whereas the intervention group showed a constant level of sensitivity over time. These results suggest that parity may be a moderator of intervention effectiveness. Implications for investigating moderators of intervention effectiveness are discussed. © 2007 Springer Science+Business Media, LLC.