73 research outputs found
Central subspaces review: methods and applications
Central subspaces have long been a key concept for sufficient dimension reduction. They were initially constructed for solving problems in the p < n setting. In this article we review the theory of central subspaces and give an updated overview of central subspace methods for the p ≤ n, p > n and big data settings. We also develop a new classification system for these techniques and list some R and MATLAB packages that can be used for estimating the central subspace. Finally, we develop a central subspace framework for bioinformatics applications and show, using two distinct data sets, how this framework can be applied in practice.
ClustOfVar: An R Package for the Clustering of Variables
Clustering of variables is a way to arrange variables into homogeneous
clusters, i.e., groups of variables which are strongly related to each other
and thus bring the same information. These approaches can then be useful for
dimension reduction and variable selection. Several specific methods have been
developed for the clustering of numerical variables. However, concerning
qualitative variables or mixtures of quantitative and qualitative variables,
far fewer methods have been proposed. The R package ClustOfVar was specifically
developed for this purpose. The homogeneity criterion of a cluster is defined
as the sum of correlation ratios (for qualitative variables) and squared
correlations (for quantitative variables) to a synthetic quantitative variable,
summarizing "as good as possible" the variables in the cluster. This synthetic
variable is the first principal component obtained with the PCAMIX method. Two
algorithms for the clustering of variables are proposed: iterative relocation
algorithm and ascendant hierarchical clustering. We also propose a bootstrap
approach in order to determine suitable numbers of clusters. We illustrate the
methodologies and the associated package on small datasets.
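The homogeneity criterion described in this abstract can be illustrated with a minimal Python sketch. This is not the ClustOfVar implementation: the function names are hypothetical, and the synthetic variable is computed here only for an all-quantitative cluster, where PCAMIX reduces to ordinary PCA on standardized variables. The mixed case would need the full PCAMIX procedure, which is not reproduced.

```python
import numpy as np

def first_principal_component(X):
    """Scores on the first principal component of the standardized columns of X.

    Assumption: all variables are quantitative, so plain PCA stands in for PCAMIX.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[0]

def correlation_ratio(levels, y):
    """Correlation ratio: between-group sum of squares of y over its total sum of squares."""
    levels = np.asarray(levels)
    y = np.asarray(y, dtype=float)
    total = ((y - y.mean()) ** 2).sum()
    between = sum(
        (levels == g).sum() * (y[levels == g].mean() - y.mean()) ** 2
        for g in np.unique(levels)
    )
    return between / total

def homogeneity(X_quant, synthetic, X_qual=()):
    """Sum of squared correlations (quantitative columns) and correlation
    ratios (qualitative variables) with respect to the synthetic variable."""
    h = sum(np.corrcoef(col, synthetic)[0, 1] ** 2 for col in X_quant.T)
    h += sum(correlation_ratio(col, synthetic) for col in X_qual)
    return h

# Illustrative data only: three noisy copies of one latent variable.
rng = np.random.default_rng(0)
latent = rng.normal(size=200)
X = np.column_stack([latent + 0.5 * rng.normal(size=200) for _ in range(3)])

synthetic = first_principal_component(X)
print(homogeneity(X, synthetic))
```

For an all-quantitative cluster, this value coincides with the largest eigenvalue of the correlation matrix of the cluster's variables, which gives a quick sanity check on the sketch.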
Organic and inorganic nitrogen amendments reduce biodegradation of biodegradable plastic mulch films
Biodegradable mulch films (BDMs) are a sustainable and promising
alternative to non-biodegradable polyethylene mulches used in crop
production systems. Nitrogen amendments in the form of fertilizers are used
by growers to enhance soil and plant-available nutrients; however, there is
limited research on how these additions impact the biodegradation of BDMs tilled
into soils. A 4-month laboratory incubation study using soil microcosms
was used to investigate the effects of inorganic (ammonium nitrate) and
organic (urea and amino acids) nitrogen application on biodegradation of
BDMs. We investigated the response of soil bacterial, fungal, and
ammonia-oxidizing microbial abundance along with soil nitrogen pools and
enzyme activities. Microcosms were comprised of soils from two diverse
climates (Knoxville, TN, USA, and Mount Vernon, WA, USA) and BioAgri, a
biodegradable mulch film made of Mater-Bi®, a bioplastic raw
material containing starch and poly(butylene adipate-co-terephthalate)
(PBAT). Both organic and inorganic nitrogen amendments inhibited mulch
biodegradation, soil bacterial abundances, and enzyme activities. The
greatest inhibition of mulch biodegradation in TN soils was observed with
urea amendment where biodegradation was reduced by about 6 % compared to
the no-nitrogen control. In WA soils, all nitrogen amendments suppressed
biodegradation by about 1 % compared to the no-nitrogen control. Ammonia
monooxygenase amoA gene abundances were increased in TN soils in all treatments
but reduced for all treatments in WA soils. However, a significantly higher
nitrate concentration and a lower ammonium concentration were seen for all nitrogen
treatments compared to no-nitrogen controls in both TN and WA. This study
suggests that the addition of nitrogen, particularly inorganic amendments, could
slow down mulch biodegradation but that mulch biodegradation does not
negatively affect soil nitrification activity.
Community evaluation of glycoproteomics informatics solutions reveals high-performance search strategies for serum glycopeptide analysis
Glycoproteomics is a powerful yet analytically challenging research tool. Software packages aiding the interpretation of complex glycopeptide tandem mass spectra have appeared, but their relative performance remains untested. Conducted through the HUPO Human Glycoproteomics Initiative, this community study, comprising both developers and users of glycoproteomics software, evaluates solutions for system-wide glycopeptide analysis. The same mass spectrometry-based glycoproteomics datasets from human serum were shared with participants and the relative team performance for N- and O-glycopeptide data analysis was comprehensively established by orthogonal performance tests. Although the results were variable, several high-performance glycoproteomics informatics strategies were identified. Deep analysis of the data revealed key performance-associated search parameters and led to recommendations for improved 'high-coverage' and 'high-accuracy' glycoproteomics search solutions. This study concludes that diverse software packages for comprehensive glycopeptide data analysis exist, points to several high-performance search strategies and specifies key variables that will guide future software developments and assist informatics decision-making in glycoproteomics.
A Survey of Bayesian Statistical Approaches for Big Data
The modern era is characterised as an era of information or Big Data. This
has motivated a huge literature on new methods for extracting information and
insights from these data. A natural question is how these approaches differ
from those that were available prior to the advent of Big Data. We present a
review of published studies that present Bayesian statistical approaches
specifically for Big Data and discuss the reported and perceived benefits of
these approaches. We conclude by addressing the question of whether focusing
only on improving computational algorithms and infrastructure will be enough to
face the challenges of Big Data.