16 research outputs found

    Parametric inference of recombination in HIV genomes

    Recombination is an important event in the evolution of HIV. It affects the global spread of the pandemic as well as evolutionary escape from host immune response and from drug therapy within single patients. Comprehensive computational methods are needed for detecting recombinant sequences in large databases, and for inferring the parental sequences. We present a hidden Markov model to annotate a query sequence as a recombinant of a given set of aligned sequences. Parametric inference is used to determine all optimal annotations for all parameters of the model. We show that the inferred annotations recover most features of established hand-curated annotations. Thus, parametric analysis of the hidden Markov model is feasible for HIV full-length genomes, and it improves the detection and annotation of recombinant forms. All computational results, reference alignments, and C++ source code are available at http://bio.math.berkeley.edu/recombination/.
    Comment: 20 pages, 5 figures
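    The annotation idea can be illustrated with a minimal Viterbi sketch. This is not the authors' model or their C++ code: it assumes the hidden states are simply the rows of the reference alignment, that emissions are match or mismatch, and that switching rows costs a fixed jump probability. The names `annotate_recombinant`, `mismatch` and `jump` are hypothetical, and the sketch fixes one parameter setting, whereas the parametric inference described above considers all of them.

```python
import math

def annotate_recombinant(query, references, mismatch=0.05, jump=0.01):
    """Return, for each alignment column, the index of the most likely parent row.

    Assumes query and all references are gap-free strings of equal length
    (a simplification made for this sketch).
    """
    n_ref = len(references)
    log_match = math.log(1.0 - mismatch)
    log_mismatch = math.log(mismatch)
    log_stay = math.log(1.0 - jump)
    # Jump probability is shared equally among the other rows.
    log_jump = math.log(jump / (n_ref - 1)) if n_ref > 1 else float("-inf")

    def emit(k, i):
        return log_match if references[k][i] == query[i] else log_mismatch

    # Uniform prior over parent rows at the first column.
    score = [math.log(1.0 / n_ref) + emit(k, 0) for k in range(n_ref)]
    back = []
    for i in range(1, len(query)):
        new_score, pointers = [], []
        for k in range(n_ref):
            # Best predecessor: stay in row k or jump in from another row.
            best, arg = max(
                (score[j] + (log_stay if j == k else log_jump), j)
                for j in range(n_ref)
            )
            new_score.append(best + emit(k, i))
            pointers.append(arg)
        score = new_score
        back.append(pointers)

    # Trace the best path back from the final column.
    state = max(range(n_ref), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

    Calling `annotate_recombinant("AAAACCCC", ["AAAAAAAA", "CCCCCCCC"])` labels the first four columns with row 0 and the last four with row 1, placing a single breakpoint in the middle.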

    A study of problems of finding the optimal allocation among enterprises

    We consider the problem of allocating resources among enterprises from different industries within a single economic conglomerate. We present several ways of stating the problem and entering the input data, taking into account the possibility of constructing custom functions of return, control action, and time. The key solution method is Bellman's dynamic programming [1]. We also investigate an alternative formalization in which the phase and control variables may take infinitely many values; this rules out the standard dynamic programming tables and makes analytical calculation necessary. For this case, we propose a set of constraints that reduce the return functions to a form satisfying the conditions of production functions.
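    For the discrete, table-based case, the Bellman recursion can be sketched as follows. This is an illustrative sketch rather than the authors' formulation: it assumes whole units of a single resource and a return table `returns[k][x]` per enterprise, and the function name `allocate` is hypothetical.

```python
def allocate(returns, total):
    """Maximise the summed return of allocating `total` units among enterprises.

    returns[k][x] is the return of enterprise k when it receives x units,
    for x = 0..total (assumed input layout for this sketch).
    """
    n = len(returns)
    # best[k][r]: best return obtainable from enterprises k..n-1 with r units left.
    best = [[0] * (total + 1) for _ in range(n + 1)]
    choice = [[0] * (total + 1) for _ in range(n)]
    for k in range(n - 1, -1, -1):
        for r in range(total + 1):
            # Bellman step: try every amount x for enterprise k.
            values = [returns[k][x] + best[k + 1][r - x] for x in range(r + 1)]
            x_best = max(range(r + 1), key=values.__getitem__)
            best[k][r] = values[x_best]
            choice[k][r] = x_best
    # Recover one optimal plan by following the stored choices.
    plan, r = [], total
    for k in range(n):
        plan.append(choice[k][r])
        r -= choice[k][r]
    return best[0][total], plan
```

    The table has one row per enterprise and one column per remaining amount, which is exactly what stops working once the variables can take infinitely many values.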

    Improving the Caenorhabditis elegans Genome Annotation Using Machine Learning

    For modern biology, precise genome annotations are of prime importance, as they allow the accurate definition of genic regions. We employ state-of-the-art machine learning methods to assay and improve the accuracy of the genome annotation of the nematode Caenorhabditis elegans. The proposed machine learning system is trained to recognize exons and introns on the unspliced mRNA, utilizing recent advances in support vector machines and label sequence learning. In 87% (coding and untranslated regions) and 95% (coding regions only) of all genes tested in several out-of-sample evaluations, our method correctly identified all exons and introns. Notably, only 37% and 50%, respectively, of the presently unconfirmed genes in the C. elegans genome annotation agree with our predictions, thus we hypothesize that a sizable fraction of those genes are not correctly annotated. A retrospective evaluation of the Wormbase WS120 annotation [1] of C. elegans reveals that splice form predictions on unconfirmed genes in WS120 are inaccurate in about 18% of the considered cases, while our predictions deviate from the truth only in 10%–13%. We experimentally analyzed 20 controversial genes on which our system and the annotation disagree, confirming the superiority of our predictions. While our method correctly predicted 75% of those cases, the standard annotation was never completely correct. The accuracy of our system is further corroborated by a comparison with two other recently proposed systems that can be used for splice form prediction: SNAP and ExonHunter. We conclude that the genome annotation of C. elegans and other organisms can be greatly enhanced using modern machine learning technology

    Recursive trimmed filter in eliminating high density impulse noise from digital image

    Advances in technology have made it easier to share media over the Internet. During sharing, media may pick up noise or interference that results in loss of information. This paper presents a new recursive method for removing salt-and-pepper noise from images. The first stage identifies the noisy pixels in the damaged image; each damaged pixel is then replaced by the mean of its surrounding window. Unlike other methods, the approach is recursive, which aims to minimize the size of the window used in the recovery process.
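    One possible reading of this scheme is sketched below; the abstract does not give the details, so the specifics are assumptions. The sketch treats pixels equal to 0 or 255 as noise, replaces each with the mean of the non-noisy pixels in its 3x3 window, and interprets "recursive" as letting already restored pixels count as clean for later windows, so the window need not grow. The function name is hypothetical.

```python
def recursive_trimmed_mean(image, low=0, high=255):
    """Restore salt-and-pepper pixels in a grayscale image given as a list of rows."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    # Detection stage: extreme values are flagged as noise (this also flags
    # genuine pixels that happen to be pure black or pure white).
    noisy = [[p in (low, high) for p in row] for row in image]
    for y in range(h):
        for x in range(w):
            if not noisy[y][x]:
                continue
            # Trimmed window: keep only pixels currently considered clean.
            clean = [
                out[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
                if not noisy[j][i]
            ]
            if clean:
                out[y][x] = round(sum(clean) / len(clean))
                # Recursive step: the restored pixel serves later windows.
                noisy[y][x] = False
    return out
```

    A pixel whose window contains no clean value is left unchanged in this sketch; a fuller implementation would need a fallback for that case.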

    Parallelizing Optimal Multiple Sequence Alignment by Dynamic Programming

    Optimal multiple sequence alignment by dynamic programming, like many high-dimensional scientific computing problems, has failed to benefit from the performance gains of multi-processor systems because no suitable scheme exists for managing partitioning and dependencies. We present a scheme for a parallel implementation of dynamic programming multiple sequence alignment, based on a peer-to-peer design and a multidimensional array indexing method. This design yields up to a 5-fold improvement over a previously described master/slave design and scales favourably with the number of processors used. The study demonstrates an approach for parallelising multi-dimensional dynamic programming and similar algorithms on multi-processor architectures.
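    The dependency structure that any such scheme has to manage can be shown with a generic wavefront ordering. This is a standard technique swapped in for illustration, not the peer-to-peer design of the paper. It assumes each cell of the alignment lattice depends only on cells whose coordinates are each smaller by 0 or 1, so all cells with the same coordinate sum are mutually independent. The helper names are hypothetical.

```python
from itertools import product

def flat_index(cell, shape):
    """Row-major position of a multidimensional cell in a flat array."""
    idx = 0
    for c, s in zip(cell, shape):
        idx = idx * s + c
    return idx

def wavefronts(shape):
    """Group lattice cells by coordinate sum; each group can be filled in parallel."""
    fronts = [[] for _ in range(sum(s - 1 for s in shape) + 1)]
    for cell in product(*(range(s) for s in shape)):
        fronts[sum(cell)].append(cell)
    return fronts
```

    For three sequences of lengths 1, 2 and 2, `wavefronts((2, 3, 3))` yields six fronts that must be processed in order, while the cells inside each front could be distributed across processors.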

    Faster computation of exact RNA shape probabilities

    Motivation: Abstract shape analysis allows efficient computation of a representative sample of low-energy foldings of an RNA molecule. More comprehensive information is obtained by computing shape probabilities, accumulating the Boltzmann probabilities of all structures within each abstract shape. Such information is superior to free energies because it is independent of sequence length and base composition. However, up to this point, computation of shape probabilities evaluates all shapes simultaneously and comes with a computation cost which is exponential in the length of the sequence
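    The definition of a shape probability can be made concrete with a small sketch. It assumes an explicit list of structures, each tagged with its abstract shape and free energy; real tools do not enumerate structures this way, and this brute-force form only illustrates the quantity being computed. The function name, the input layout and the temperature of 37 °C are assumptions.

```python
import math
from collections import defaultdict

# Gas constant in kcal/(mol*K) times an assumed temperature of 310.15 K (37 °C).
RT = 0.0019872 * 310.15

def shape_probabilities(structures):
    """Sum Boltzmann weights per shape and normalise by the partition function.

    structures: iterable of (shape_string, free_energy_in_kcal_per_mol).
    """
    weights = defaultdict(float)
    for shape, energy in structures:
        weights[shape] += math.exp(-energy / RT)
    z = sum(weights.values())  # partition function over all listed structures
    return {shape: w / z for shape, w in weights.items()}
```

    Because each probability is normalised by the partition function of the same sequence, the values sum to one whatever the sequence length.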

    Ambivalent covariance models


    Lost in folding space? Comparing four variants of the thermodynamic model for RNA secondary structure prediction

    Janssen S, Schudoma C, Steger G, Giegerich R. Lost in folding space? Comparing four variants of the thermodynamic model for RNA secondary structure prediction. BMC Bioinformatics. 2011;12(1):429.
    BACKGROUND: Many bioinformatics tools for RNA secondary structure analysis are based on a thermodynamic model of RNA folding. They predict a single, "optimal" structure by free energy minimization, they enumerate near-optimal structures, they compute base pair probabilities and dot plots, representative structures of different abstract shapes, or Boltzmann probabilities of structures and shapes. Although all programs refer to the same physical model, they implement it with considerable variation for different tasks, and little is known about the effects of heuristic assumptions and model simplifications used by the programs on the outcome of the analysis.
    RESULTS: We extract four different models of the thermodynamic folding space which underlie the programs RNAfold, RNAshapes, and RNAsubopt. Their differences lie within the details of the energy model and the granularity of the folding space. We implement probabilistic shape analysis for all models, and introduce the shape probability shift as a robust measure of model similarity. Using four data sets derived from experimentally solved structures, we provide a quantitative evaluation of the model differences.
    CONCLUSIONS: We find that search space granularity affects the computed shape probabilities less than the over- or underapproximation of free energy by a simplified energy model. Still, the approximations perform similarly enough to implementations of the full model to justify their continued use in settings where computational constraints call for simpler algorithms. As a side observation, the rarely used level 2 shapes, which predict the complete arrangement of helices, multiloops, internal loops and bulges, include the "true" shape in a rather small number of predicted high probability shapes. This calls for an investigation of new strategies to extract high probability members from the (very large) level 2 shape space of an RNA sequence. We provide implementations of all four models, written in a declarative style that makes them easy to modify. Based on our study, future work on thermodynamic RNA folding may choose a model based on our empirical data and can take our implementations as a starting point for further program development.

    A Cloud Platform to support Collaboration in Supply Networks

    Collaboration is a trend in supply network management, based on the joint planning, coordination and integration of processes with the participation of all network entities. Given current market uncertainty and the economic crisis, collaboration tools are needed to reduce costs and to increase trust and accountability to market requirements. This study presents an overview of the research carried out in the H2020 European project Cloud Collaborative Manufacturing Networks (C2NET), which is directed towards the development of a cloud platform consisting of optimization tools and collaboration tools to support agile management of the network. The collaborative cloud platform collects real-time information from real-world resources, taking into account all the actors involved in the process, and provides real-time data gathered from all network partners in order to improve their decision-making processes.
    The research leading to these results has received funding from the European Community's H2020 Programme (H2020/2014-2020) under grant agreement n°636909, "Cloud Collaborative Manufacturing Networks (C2NET)".
    Andrés Navarro, B.; Sanchis, R.; Poler, R. (2016). A Cloud Platform to support Collaboration in Supply Networks. International Journal of Production Management and Engineering. 4(1):5-13. https://doi.org/10.4995/ijpme.2016.4418