868 research outputs found

    Observations - Planet Mars in 1967

    Get PDF
    Visual Mars observations by Volgograd Observatory in 1966-196

    The cooperation of Russian and German metal forming scientific schools to develop the new energy-efficient materials and technologies

    Full text link
    The future scientific orientation of Katedra PDSS is in the area of materials forming and materials development with a focus on efficient processes regarding the use of energy and resources. Current research in the department PDSS is based on fundamental works on thermo-mechanical treatment of metals and on the modeling of nano-materials, rolled material and medical materials. This includes research on the relevant microstructural and macroscopic effects on the materials behavior. Together with its international research partners PDSS has excellent foundations for experimental research as well. With its international focus and its educational programs for students and skilled employees PDSS is an important partner of the Russian metal processing industry which supports Russian companies to compete on a world-class level

    Four basic symmetry types in the universal 7-cluster structure of 143 complete bacterial genomic sequences

    Get PDF
    Coding information is the main source of heterogeneity (non-randomness) in the sequences of bacterial genomes. This information can be naturally modeled by analysing cluster structures in the ``in-phase'' triplet distributions of relatively short genomic fragments (200-400bp). We found a universal 7-cluster structure in all 143 completely sequenced bacterial genomes available in Genbank in August 2004, and explained its properties. The 7-cluster structure is responsible for the main part of sequence heterogeneity in bacterial genomes. In this sense, our 7 clusters is the basic model of bacterial genome sequence. We demonstrated that there are four basic ``pure'' types of this model, observed in nature: ``parallel triangles'', ``perpendicular triangles'', degenerated case and the flower-like type. We show that codon usage of bacterial genomes is a multi-linear function of their genomic G+C-content with high accuracy (more precisely, by two similar functions, one for eubacterial genomes and the other one for archaea). All 143 cluster animated 3D-scatters are collected in a database and is made available on our web-site: http://www.ihes.fr/~zinovyev/7clusters The finding can be readily introduced into any software for gene prediction, sequence alignment or bacterial genomes classification

    Data complexity measured by principal graphs

    Full text link
    How to measure the complexity of a finite set of vectors embedded in a multidimensional space? This is a non-trivial question which can be approached in many different ways. Here we suggest a set of data complexity measures using universal approximators, principal cubic complexes. Principal cubic complexes generalise the notion of principal manifolds for datasets with non-trivial topologies. The type of the principal cubic complex is determined by its dimension and a grammar of elementary graph transformations. The simplest grammar produces principal trees. We introduce three natural types of data complexity: 1) geometric (deviation of the data's approximator from some "idealized" configuration, such as deviation from harmonicity); 2) structural (how many elements of a principal graph are needed to approximate the data), and 3) construction complexity (how many applications of elementary graph transformations are needed to construct the principal object starting from the simplest one). We compute these measures for several simulated and real-life data distributions and show them in the "accuracy-complexity" plots, helping to optimize the accuracy/complexity ratio. We discuss various issues connected with measuring data complexity. Software for computing data complexity measures from principal cubic complexes is provided as well.Comment: Computers and Mathematics with Applications, in pres

    PCA and K-Means decipher genome

    Full text link
    In this paper, we aim to give a tutorial for undergraduate students studying statistical methods and/or bioinformatics. The students will learn how data visualization can help in genomic sequence analysis. Students start with a fragment of genetic text of a bacterial genome and analyze its structure. By means of principal component analysis they ``discover'' that the information in the genome is encoded by non-overlapping triplets. Next, they learn how to find gene positions. This exercise on PCA and K-Means clustering enables active study of the basic bioinformatics notions. Appendix 1 contains program listings that go along with this exercise. Appendix 2 includes 2D PCA plots of triplet usage in moving frame for a series of bacterial genomes from GC-poor to GC-rich ones. Animated 3D PCA plots are attached as separate gif files. Topology (cluster structure) and geometry (mutual positions of clusters) of these plots depends clearly on GC-content.Comment: 18 pages, with program listings for MatLab, PCA analysis of genomes and additional animated 3D PCA plot

    Visualization of Data by Method of Elastic Maps and Its Applications in Genomics, Economics and Sociology

    Get PDF
    Technology of data visualization and data modeling is suggested. The basic of the technology is original idea of elastic net and methods of its construction and application. A short review of relevant methods has been made. The methods proposed are illustrated by applying them to the real economical, sociological and biological datasets and to some model data distributions. The basic of the technology is original idea of elastic net - regular point approximation of some manifold that is put into the multidimensional space and has in a certain sense minimal energy. This manifold is an analogue of principal surface and serves as non-linear screen on what multidimensional data are projected. Remarkable feature of the technology is its ability to work with and to fill gaps in data tables. Gaps are unknown or unreliable values of some features. It gives a possibility to predict plausibly values of unknown features by values of other ones. So it provides technology of constructing different prognosis systems and non-linear regressions. The technology can be used by specialists in different fields. There are several examples of applying the method presented in the end of this paper

    Blind source separation methods for deconvolution of complex signals in cancer biology

    Full text link
    Two blind source separation methods (Independent Component Analysis and Non-negative Matrix Factorization), developed initially for signal processing in engineering, found recently a number of applications in analysis of large-scale data in molecular biology. In this short review, we present the common idea behind these methods, describe ways of implementing and applying them and point out to the advantages compared to more traditional statistical approaches. We focus more specifically on the analysis of gene expression in cancer. The review is finalized by listing available software implementations for the methods described.Comment: Zinovyev A., Kairov U., Karpenyuk T., Ramanculov E. Blind Source Separation Methods For Deconvolution Of Complex Signals In Cancer Biology. 2012. Biochemical and Biophysical Research Communications. In Press. DOI: 10.1016/j.bbrc.2012.12.04
    corecore