1,118 research outputs found

    TICAL - a web-tool for multivariate image clustering and data topology preserving visualization

    In life science research, bioimaging is often used to study two kinds of features in a sample simultaneously: morphology and co-location of molecular components. While bioimaging technology is rapidly proposing and improving new multidimensional imaging platforms, bioimage informatics has to keep pace and develop algorithmic approaches that support biology experts in the complex task of data analysis. One particular problem is the availability and applicability of sophisticated image analysis algorithms via the web, so that different users can apply the same algorithms to their data (sometimes even to the same data to get the same results), independently of their whereabouts and of the technical features of their computers. In this paper we describe TICAL, a visual data mining approach to multivariate microscopy analysis which can be applied fully through the web. We describe the algorithmic approach and the software concept, and present results obtained for different example images.

    Systems Level Modeling of the Cell Cycle Using Budding Yeast

    Proteins involved in the regulation of the cell cycle are highly conserved across all eukaryotes, and so a relatively simple eukaryote such as yeast can provide insight into a variety of cell cycle perturbations, including those that occur in human cancer. To date, the budding yeast Saccharomyces cerevisiae has provided the largest amount of experimental and modeling data on the progression of the cell cycle, making it a logical choice for in-depth studies of this process. Moreover, the advent of methods for collection of high-throughput genome, transcriptome, and proteome data has provided a means to collect and precisely quantify simultaneous cell cycle gene transcript and protein levels, permitting modeling of the cell cycle on the systems level. With the appropriate mathematical framework and sufficient, accurate data on cell cycle components, it should be possible to create a model of the cell cycle that not only effectively describes its operation, but can also predict responses to perturbations such as variation in protein levels and responses to external stimuli, including targeted inhibition by drugs. In this review, we summarize existing data on the yeast cell cycle, proteomics technologies for quantifying cell cycle proteins, and the mathematical frameworks that can integrate these data into representative and effective models. Systems level modeling of the cell cycle will require the integration of high-quality data with the appropriate mathematical framework, which can currently be attained through the combination of dynamic modeling based on proteomics data and the use of yeast as a model organism.

    Challenges in the Analysis of Mass-Throughput Data: A Technical Commentary from the Statistical Machine Learning Perspective

    Sound data analysis is critical to the success of modern molecular medicine research that involves collection and interpretation of mass-throughput data. The novel nature and high dimensionality of such datasets pose a series of nontrivial data analysis problems. This technical commentary discusses the problems of over-fitting, error estimation, the curse of dimensionality, causal versus predictive modeling, integration of heterogeneous types of data, and the lack of standard protocols for data analysis. We attempt to shed light on the nature and causes of these problems and to outline viable methodological approaches to overcome them.

    Method and System for Identification of Metabolites Using Mass Spectra

    A method and system are provided for mass spectrometry for identification of a specific elemental formula for an unknown compound, which includes but is not limited to a metabolite. The method includes calculating a natural abundance probability (NAP) of a given isotopologue for isotopes of non-labelling elements of an unknown compound. Molecular fragments for a subset of isotopes identified using the NAP are created and sorted into a requisite cache data structure to be subsequently searched. Among the peaks in raw spectrum data from mass spectrometry for an unknown compound, sample-specific peaks of the unknown compound are separated from various spectral artifacts in ultra-high resolution Fourier transform mass spectra. A set of possible isotope-resolved molecular formulae (IMFs) is created by iteratively searching the molecular fragment caches, combining with additional isotopes, and then statistically filtering the results based on NAP and mass-to-charge (m/z) matching probabilities. An unknown compound and its corresponding elemental molecular formula (EMF) are identified from statistically significant caches of isotopologues with compatible IMFs.
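
The abstract above does not say how the natural abundance probability is computed. As a rough illustration only, one common textbook model treats each element with two stable isotopes as a binomial draw and multiplies the per-element probabilities. The sketch below assumes that model; the function names, the `HEAVY_ABUNDANCE` table and its approximate abundance values are hypothetical and are not taken from the patent.

```python
from math import comb

# Approximate natural abundances of the heavier stable isotope.
# Illustrative values only (13C and 15N); not taken from the patent.
HEAVY_ABUNDANCE = {"C": 0.0107, "N": 0.00364}


def element_nap(n_atoms, n_heavy, p_heavy):
    """Binomial probability that exactly n_heavy of n_atoms are the heavy isotope."""
    return comb(n_atoms, n_heavy) * p_heavy**n_heavy * (1 - p_heavy) ** (n_atoms - n_heavy)


def isotopologue_nap(counts):
    """Multiply per-element probabilities.

    counts maps an element symbol to (n_atoms, n_heavy).
    Assumes elements are independent and have only two isotopes each.
    """
    prob = 1.0
    for element, (n_atoms, n_heavy) in counts.items():
        prob *= element_nap(n_atoms, n_heavy, HEAVY_ABUNDANCE[element])
    return prob


if __name__ == "__main__":
    # Six carbons with one 13C, one nitrogen with no 15N.
    print(isotopologue_nap({"C": (6, 1), "N": (1, 0)}))
```

Elements with three or more stable isotopes (oxygen, sulfur) would need a multinomial term instead of the binomial one used here.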