
    EVOLUTION OF THE BUTTERFLY COORDINATION IN RELATION TO VELOCITY AND SKILL LEVEL OF SWIMMERS

    The purpose of this study was to identify stroke phases and arm and leg coordination during butterfly swimming as a function of swim velocity and performance level. Twenty-four swimmers were divided into two groups based on performance level. All swam at three different velocities, corresponding to their paces for the 400 m, 100 m and 50 m, respectively. The different stroke phases and the arm and leg coordination were identified by video analysis. Coordination was studied by analysing the temporal gaps separating the changes in arm and leg movements. The most important results showed that expert swimmers are characterised by their capacity to control and adapt their coordination as velocity increases, in contrast to non-expert swimmers, who are characterised by lag times in their arm movements that serve to place their leg actions.

    Stability and aggregation of ranked gene lists

    Ranked gene lists are highly unstable in the sense that similar measures of differential gene expression may yield very different rankings, and that a small change in the data set usually affects the obtained gene list considerably. Stability issues have long been under-considered in the literature, but they have become a hot topic in the last few years, perhaps as a consequence of increasing skepticism about the reproducibility and clinical applicability of molecular research findings. In this article, we review existing approaches for assessing the stability of ranked gene lists and the related problem of aggregation, give some practical recommendations, and warn against potential misuse of these methods. This overview is illustrated through an application to a recent leukemia data set using the freely available Bioconductor package GeneSelector.
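One way to picture the instability described in this abstract is to re-rank features on random subsamples and measure how much of the top of the list survives. The following is a minimal sketch under stated assumptions: the ranking score (absolute difference in class means), the top-k overlap measure, and all function names are illustrative choices, not the GeneSelector API or the authors' method.

```python
import random


def rank_features(rows, labels):
    """Rank feature indices by absolute difference in class means, largest first."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        group0 = [row[j] for row, y in zip(rows, labels) if y == 0]
        group1 = [row[j] for row, y in zip(rows, labels) if y == 1]
        scores.append(abs(sum(group1) / len(group1) - sum(group0) / len(group0)))
    return sorted(range(n_features), key=lambda j: -scores[j])


def top_k_overlap(rank_a, rank_b, k):
    """Fraction of the top-k features shared by two rankings (1.0 = identical sets)."""
    return len(set(rank_a[:k]) & set(rank_b[:k])) / k


def subsample_stability(rows, labels, k, n_rounds=50, frac=0.8, seed=0):
    """Mean top-k overlap between the full-data ranking and subsample rankings."""
    rng = random.Random(seed)
    reference = rank_features(rows, labels)
    idx0 = [i for i, y in enumerate(labels) if y == 0]
    idx1 = [i for i, y in enumerate(labels) if y == 1]
    overlaps = []
    for _ in range(n_rounds):
        # Subsample each class separately so both classes stay represented.
        keep = (rng.sample(idx0, max(1, int(frac * len(idx0))))
                + rng.sample(idx1, max(1, int(frac * len(idx1)))))
        ranking = rank_features([rows[i] for i in keep], [labels[i] for i in keep])
        overlaps.append(top_k_overlap(reference, ranking, k))
    return sum(overlaps) / len(overlaps)
```

A value close to 1 suggests the top-k list is robust to small changes in the data; values well below 1 indicate the kind of instability the abstract warns about.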

    CMA – a comprehensive Bioconductor package for supervised classification with high dimensional data

    For the last eight years, microarray-based class prediction has been a major topic in statistics, bioinformatics and biomedicine research. Traditional methods often yield unsatisfactory results or may even be inapplicable in the p > n setting where the number of predictors by far exceeds the number of observations, hence the term "ill-posed-problem". Careful model selection and evaluation satisfying accepted good-practice standards is a very complex task for inexperienced users with limited statistical background or for statisticians without experience in this area. The multiplicity of available methods for class prediction based on high-dimensional data is an additional practical challenge for inexperienced researchers. In this article, we introduce a new Bioconductor package called CMA (standing for "Classification for MicroArrays") for automatically performing variable selection, parameter tuning, classifier construction, and unbiased evaluation of the constructed classifiers using a large number of usual methods. Without much time and effort, users are provided with an overview of the unbiased accuracy of most top-performing classifiers. Furthermore, the standardized evaluation framework underlying CMA can also be beneficial in statistical research for comparison purposes, for instance if a new classifier has to be compared to existing approaches. CMA is a user-friendly comprehensive package for classifier construction and evaluation implementing most usual approaches. It is freely available from the Bioconductor website at http://bioconductor.org/packages/2.3/bioc/html/CMA.html

    Evaluating Microarray-based Classifiers: An Overview

    For the last eight years, microarray-based class prediction has been the subject of numerous publications in medicine, bioinformatics and statistics journals. However, in many articles, the assessment of classification accuracy is carried out using suboptimal procedures and receives little attention. In this paper, we carefully review various statistical aspects of classifier evaluation and validation from a practical point of view. The main topics addressed are accuracy measures, error rate estimation procedures, variable selection, choice of classifiers and validation strategy.

    Restoring the full velocity field in the gaseous disk of the spiral galaxy NGC 157

    We analyse the line-of-sight velocity field of ionized gas in the spiral galaxy NGC 157, which was obtained in the H$\alpha$ emission line at the 6 m telescope of SAO RAS. The existence of systematic deviations of the observed gas velocities from pure circular motion is shown. A detailed investigation of these deviations is undertaken by applying a Fourier analysis to the azimuthal distributions of the line-of-sight velocities at different distances from the galactic center. As a result of the analysis, all the main parameters of the wave spiral pattern are determined: the corotation radius, the amplitudes and phases of the gas velocity perturbations at different radii, and the velocity of circular rotation of the disk corrected for the velocity perturbations due to spiral arms. At a high confidence level, the presence of two giant anticyclones in the reference frame rotating with the spiral pattern is shown; their sizes and the locations of their centers are consistent with the results of the analytic theory and of numerical simulations. Besides the anticyclones, the existence of cyclones in residual velocity fields of spiral galaxies is predicted. In the reference frame rotating with the spiral pattern, these cyclones should reveal themselves in galaxies where the radial gradient of the azimuthal residual velocity is steeper than that of the rotation velocity (abridged). Comment: 23 pages including 25 eps-figures. Accepted for publication in A&
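The Fourier analysis of azimuthal velocity distributions mentioned in this abstract can be illustrated with a generic harmonic decomposition on a single ring. This is only a sketch under simplifying assumptions: velocities are taken to be sampled at equally spaced azimuths on one ring, and projection effects such as inclination are ignored. The function name and the synthetic numbers are hypothetical and are not taken from the paper.

```python
import math


def azimuthal_harmonics(velocities, max_order=3):
    """Decompose V(theta) sampled at N equally spaced azimuths into harmonics.

    Returns {m: (amplitude, phase)} such that
    V(theta) ~ A_0 + sum_m A_m * cos(m * theta - phase_m).
    Requires N > 2 * max_order for the discrete projections to be exact.
    """
    n = len(velocities)
    result = {0: (sum(velocities) / n, 0.0)}
    for m in range(1, max_order + 1):
        # Project onto cos(m*theta) and sin(m*theta).
        a = 2.0 / n * sum(v * math.cos(m * 2.0 * math.pi * i / n)
                          for i, v in enumerate(velocities))
        b = 2.0 / n * sum(v * math.sin(m * 2.0 * math.pi * i / n)
                          for i, v in enumerate(velocities))
        result[m] = (math.hypot(a, b), math.atan2(b, a))
    return result


# Synthetic ring: systemic velocity plus first and second harmonics.
n_points = 36
ring = [100.0
        + 50.0 * math.cos(2.0 * math.pi * i / n_points - 0.3)
        + 10.0 * math.cos(2.0 * (2.0 * math.pi * i / n_points) - 1.0)
        for i in range(n_points)]
harmonics = azimuthal_harmonics(ring)
```

Repeating the decomposition ring by ring would give amplitudes and phases as functions of radius, which is the kind of quantity the abstract refers to.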

    Over-optimism in unsupervised microbiome analysis: Insights from network learning and clustering

    In recent years, unsupervised analysis of microbiome data, such as microbial network analysis and clustering, has increased in popularity. Many new statistical and computational methods have been proposed for these tasks. This multiplicity of analysis strategies poses a challenge for researchers, who are often unsure which method(s) to use and might be tempted to try different methods on their dataset to look for the "best" ones. However, if only the best results are selectively reported, this may cause over-optimism: the "best" method is overly fitted to the specific dataset, and the results might be non-replicable on validation data. Such effects will ultimately hinder research progress. Yet so far, these topics have been given little attention in the context of unsupervised microbiome analysis. In our illustrative study, we aim to quantify over-optimism effects in this context. We model the approach of a hypothetical microbiome researcher who undertakes four unsupervised research tasks: clustering of bacterial genera, hub detection in microbial networks, differential microbial network analysis, and clustering of samples. While these tasks are unsupervised, the researcher might still have certain expectations as to what constitutes interesting results. We translate these expectations into concrete evaluation criteria that the hypothetical researcher might want to optimize. We then randomly split an exemplary dataset from the American Gut Project into discovery and validation sets multiple times. For each research task, multiple method combinations (e.g., methods for data normalization, network generation, and/or clustering) are tried on the discovery data, and the combination that yields the best result according to the evaluation criterion is chosen. While the hypothetical researcher might only report this result, we also apply the "best" method combination to the validation dataset. The results are then compared between discovery and validation data. In all four research tasks, there are notable over-optimism effects; the results on the validation data set are worse compared to the discovery data, averaged over multiple random splits into discovery/validation data. Our study thus highlights the importance of validation and replication in microbiome analysis to obtain reliable results and demonstrates that the issue of over-optimism goes beyond the context of statistical testing and fishing for significance.
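The selection effect described in this abstract can be reproduced in a toy simulation. The sketch below assumes, purely for illustration, that every candidate method has the same true quality and that observed scores on discovery and validation data differ only by independent Gaussian noise. It does not use the American Gut Project data or any of the study's evaluation criteria; the function and parameter names are hypothetical.

```python
import random
import statistics


def over_optimism(n_methods=10, n_splits=200, noise=1.0, seed=0):
    """Mean gap between discovery and validation score of the selected method."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_splits):
        # All methods have true quality 0.0; observed scores are pure noise.
        discovery = [rng.gauss(0.0, noise) for _ in range(n_methods)]
        validation = [rng.gauss(0.0, noise) for _ in range(n_methods)]
        # Pick the method that looks best on the discovery data only.
        best = max(range(n_methods), key=discovery.__getitem__)
        gaps.append(discovery[best] - validation[best])
    return statistics.mean(gaps)


gap_many = over_optimism(n_methods=10)  # selecting among ten methods
gap_one = over_optimism(n_methods=1)    # no selection at all
```

Under these assumptions, the gap is clearly positive when the best of several methods is selected and close to zero when there is nothing to select from, which mirrors the pattern the abstract reports for real method combinations.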

    An AUC-based Permutation Variable Importance Measure for Random Forests

    The random forest (RF) method is a commonly used tool for classification with high dimensional data as well as for ranking candidate predictors based on the so-called random forest variable importance measures (VIMs). However, the classification performance of RF is known to be suboptimal in the case of strongly unbalanced data, i.e. data where response class sizes differ considerably. Suggestions were made to obtain better classification performance based either on sampling procedures or on cost sensitivity analyses. However, to our knowledge, the performance of the VIMs has not yet been examined in the case of unbalanced response classes. In this paper we explore the performance of the permutation VIM for unbalanced data settings and introduce an alternative permutation VIM based on the area under the curve (AUC) that is expected to be more robust towards class imbalance. We investigated the performance of the standard permutation VIM and of our novel AUC-based permutation VIM for different class imbalance levels using simulated data and real data. The results suggest that the standard permutation VIM loses its ability to discriminate between associated predictors and predictors not associated with the response for increasing class imbalance. It is outperformed by our new AUC-based permutation VIM for unbalanced data settings, while the performance of both VIMs is very similar in the case of balanced classes. The new AUC-based VIM is implemented in the R package party for the unbiased RF variant based on conditional inference trees. The code implementing our study is available from the companion website: http://www.ibe.med.uni-muenchen.de/organisation/mitarbeiter/070_drittmittel/janitza/index.html
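The idea of replacing the error rate with the AUC in a permutation importance can be sketched in a model-agnostic way. The code below is an assumption-laden illustration, not the implementation in the R package party: it works with any scoring function rather than with the out-of-bag observations of individual trees, and the names `auc` and `auc_permutation_importance` are hypothetical.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a positive scores above a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def auc_permutation_importance(predict_score, rows, labels, feature,
                               n_repeats=20, seed=0):
    """Mean drop in AUC after randomly permuting one feature column."""
    rng = random.Random(seed)
    baseline = auc([predict_score(r) for r in rows], labels)
    drops = []
    for _ in range(n_repeats):
        column = [r[feature] for r in rows]
        rng.shuffle(column)
        # Rebuild rows with only the chosen feature replaced by shuffled values.
        permuted = [r[:feature] + [v] + r[feature + 1:]
                    for r, v in zip(rows, column)]
        drops.append(baseline - auc([predict_score(r) for r in permuted], labels))
    return sum(drops) / len(drops)


# Strongly unbalanced toy data: 3 positives, 17 negatives.
toy_labels = [1] * 3 + [0] * 17
toy_rows = [[float(y), float(i % 4)] for i, y in enumerate(toy_labels)]
score_fn = lambda r: r[0]  # hypothetical classifier score using only feature 0
imp_informative = auc_permutation_importance(score_fn, toy_rows, toy_labels, 0)
imp_irrelevant = auc_permutation_importance(score_fn, toy_rows, toy_labels, 1)
```

Because the AUC depends only on how positives are ranked relative to negatives, it does not reward a classifier for simply predicting the majority class, which is the intuition behind the robustness to class imbalance that the abstract describes.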

    The Compact Group of Galaxies HCG 31 is in an early phase of merging

    We have obtained high spectral resolution (R = 45900) Fabry-Perot velocity maps of the Hickson Compact Group HCG 31 in order to revisit the important problem of the merger nature of the central object A+C and to derive the internal kinematics of the candidate tidal dwarf galaxies in this group. Our main findings are: (1) double kinematic components are present throughout the main body of A+C, which strongly suggests that this complex is an ongoing merger, (2) regions A2 and E, to the east and south of complex A+C, present rotation patterns with velocity amplitudes of $\sim 25$ km s$^{-1}$ and they counterrotate with respect to A+C, (3) region F, which was previously thought to be the best example of a tidal dwarf galaxy in HCG 31, presents no rotation and negligible internal velocity dispersion, as is also the case for region A1. HCG 31 hosts an ongoing merger in its center (A+C) and it is likely that it has suffered additional perturbations due to interactions with the nearby galaxies B, G and Q. Comment: 5 pages + figures - Accepted to ApJ Lette

    NetCoMi: network construction and comparison for microbiome data in R

    MOTIVATION: Estimating microbial association networks from high-throughput sequencing data is a common exploratory data analysis approach aiming at understanding the complex interplay of microbial communities in their natural habitat. Statistical network estimation workflows comprise several analysis steps, including methods for zero handling, data normalization and computing microbial associations. Since microbial interactions are likely to change between conditions, e.g. between healthy individuals and patients, identifying network differences between groups is often an integral secondary analysis step. Thus far, however, no unifying computational tool is available that facilitates the whole analysis workflow of constructing, analysing and comparing microbial association networks from high-throughput sequencing data. RESULTS: Here, we introduce NetCoMi (Network Construction and comparison for Microbiome data), an R package that integrates existing methods for each analysis step in a single reproducible computational workflow. The package offers functionality for constructing and analysing single microbial association networks as well as quantifying network differences. This enables insights into whether single taxa, groups of taxa or the overall network structure change between groups. NetCoMi also contains functionality for constructing differential networks, thus allowing users to assess whether single pairs of taxa are differentially associated between two groups. Furthermore, NetCoMi facilitates the construction and analysis of dissimilarity networks of microbiome samples, enabling a high-level graphical summary of the heterogeneity of an entire microbiome sample collection. We illustrate NetCoMi's wide applicability using data sets from the GABRIELA study to compare microbial associations in settled dust from children's rooms between samples from two study centers (Ulm and Munich). AVAILABILITY: R scripts used for producing the examples shown in this manuscript are provided as supplementary data. The NetCoMi package, together with a tutorial, is available at https://github.com/stefpeschel/NetCoMi. CONTACT: Tel: +49 89 3187 43258. SUPPLEMENTARY INFORMATION: Supplementary data are available at Briefings in Bioinformatics online.