73 research outputs found

    Central subspaces review: methods and applications

    Central subspaces have long been a key concept for sufficient dimension reduction. Initially constructed for solving problems in the p n . In this article we review the theory of central subspaces and give an updated overview of central subspace methods for the p ≤ n, p > n and big data settings. We also develop a new classification system for these techniques and list some R and MATLAB packages that can be used for estimating the central subspace. Finally, we develop a central subspace framework for bioinformatics applications and show, using two distinct data sets, how this framework can be applied in practice.

    ClustOfVar: An R Package for the Clustering of Variables

    Clustering of variables is a way to arrange variables into homogeneous clusters, i.e., groups of variables that are strongly related to each other and thus carry the same information. These approaches can then be useful for dimension reduction and variable selection. Several specific methods have been developed for the clustering of numerical variables. However, for qualitative variables or mixtures of quantitative and qualitative variables, far fewer methods have been proposed. The R package ClustOfVar was specifically developed for this purpose. The homogeneity criterion of a cluster is defined as the sum of correlation ratios (for qualitative variables) and squared correlations (for quantitative variables) to a synthetic quantitative variable, summarizing "as good as possible" the variables in the cluster. This synthetic variable is the first principal component obtained with the PCAMIX method. Two algorithms for the clustering of variables are proposed: an iterative relocation algorithm and ascendant hierarchical clustering. We also propose a bootstrap approach to determine suitable numbers of clusters. We illustrate the methodologies and the associated package on small datasets.
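
    The homogeneity criterion described in this abstract can be sketched for the purely quantitative case, where the synthetic variable is taken to be the first principal component of the standardized variables. This is an illustrative sketch in Python, not the ClustOfVar implementation (which is an R package and also handles qualitative variables through PCAMIX); the function name `cluster_homogeneity` is hypothetical.

```python
import numpy as np

def cluster_homogeneity(X):
    """Sum of squared correlations between each column of X and the
    first principal component of the standardized columns.

    Assumption: all variables are quantitative and non-constant.
    """
    X = np.asarray(X, dtype=float)
    # Standardize each variable (population standard deviation).
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # First principal component scores from the SVD of Z.
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    pc1 = U[:, 0] * s[0]
    # Squared correlation of every variable with the synthetic variable.
    r = np.array([np.corrcoef(Z[:, j], pc1)[0, 1] for j in range(Z.shape[1])])
    return float(np.sum(r ** 2))

# Example with hypothetical data: three variables, 50 observations.
rng = np.random.default_rng(0)
data = rng.normal(size=(50, 3))
h = cluster_homogeneity(data)
```

    Under these assumptions the value equals the largest eigenvalue of the correlation matrix of the cluster, so it ranges from 1 to the number of variables in the cluster and reaches its maximum when all variables are perfectly correlated.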

    Organic and inorganic nitrogen amendments reduce biodegradation of biodegradable plastic mulch films

    Biodegradable mulch films (BDMs) are a sustainable and promising alternative to non-biodegradable polyethylene mulches used in crop production systems. Nitrogen amendments in the form of fertilizers are used by growers to enhance soil and plant-available nutrients; however, there is limited research on how these additions impact the biodegradation of BDMs tilled into soils. A 4-month laboratory incubation study using soil microcosms was used to investigate the effects of inorganic (ammonium nitrate) and organic (urea and amino acids) nitrogen application on biodegradation of BDMs. We investigated the response of soil bacterial, fungal, and ammonia-oxidizing microbial abundance along with soil nitrogen pools and enzyme activities. Microcosms were comprised of soils from two diverse climates (Knoxville, TN, USA, and Mount Vernon, WA, USA) and BioAgri, a biodegradable mulch film made of Mater-Bi®, a bioplastic raw material containing starch and poly(butylene adipate-co-terephthalate) (PBAT). Both organic and inorganic nitrogen amendments inhibited mulch biodegradation, soil bacterial abundances, and enzyme activities. The greatest inhibition of mulch biodegradation in TN soils was observed with urea amendment where biodegradation was reduced by about 6 % compared to the no-nitrogen control. In WA soils, all nitrogen amendments suppressed biodegradation by about 1 % compared to the no-nitrogen control. Ammonia monooxygenase amoA gene abundances were increased in TN soils in all treatments but reduced for all treatments in WA soils. However, a significantly higher nitrate concentration and a lower ammonium concentration were seen for all nitrogen treatments compared to no-nitrogen controls in both TN and WA. This study suggests that the addition of nitrogen, particularly inorganic amendments, could slow down mulch biodegradation but that mulch biodegradation does not negatively affect soil nitrification activity.

    Community evaluation of glycoproteomics informatics solutions reveals high-performance search strategies for serum glycopeptide analysis

    Glycoproteomics is a powerful yet analytically challenging research tool. Software packages aiding the interpretation of complex glycopeptide tandem mass spectra have appeared, but their relative performance remains untested. Conducted through the HUPO Human Glycoproteomics Initiative, this community study, comprising both developers and users of glycoproteomics software, evaluates solutions for system-wide glycopeptide analysis. The same mass spectrometry-based glycoproteomics datasets from human serum were shared with participants and the relative team performance for N- and O-glycopeptide data analysis was comprehensively established by orthogonal performance tests. Although the results were variable, several high-performance glycoproteomics informatics strategies were identified. Deep analysis of the data revealed key performance-associated search parameters and led to recommendations for improved 'high-coverage' and 'high-accuracy' glycoproteomics search solutions. This study concludes that diverse software packages for comprehensive glycopeptide data analysis exist, points to several high-performance search strategies and specifies key variables that will guide future software developments and assist informatics decision-making in glycoproteomics.

    A Survey of Bayesian Statistical Approaches for Big Data

    The modern era is characterised as an era of information or Big Data. This has motivated a huge literature on new methods for extracting information and insights from these data. A natural question is how these approaches differ from those that were available prior to the advent of Big Data. We present a review of published studies that present Bayesian statistical approaches specifically for Big Data and discuss the reported and perceived benefits of these approaches. We conclude by addressing the question of whether focusing only on improving computational algorithms and infrastructure will be enough to face the challenges of Big Data.