554 research outputs found

    Sorting permutations by weighted operations

    Get PDF
    Advisors: Zanoni Dias, Carla Negri Lintzmayer. Master's dissertation - Universidade Estadual de Campinas, Instituto de Computação.
    Abstract: One of the main problems in Computational Biology is to compute the evolutionary distance between species. In many approaches, this distance considers only genome rearrangements, which are mutations that alter large stretches of an organism's genome. Assuming the genome has no duplicated genes, we can represent it as a permutation of integers, where each element corresponds to a synteny block (a region of high similarity between the compared genomes) and the sign of each element corresponds to the orientation of that block. Using permutations, the problem of transforming one genome into another is equivalent to the problem of Sorting Permutations by Rearrangement Operations. The traditional approach assumes that every operation has the same cost, so the goal is to find a minimum-length sequence of operations that sorts the permutation. However, studies indicate that some rearrangement operations are more likely to occur than others, making approaches in which operations have different costs more realistic. In weighted approaches, the goal is to find a sorting sequence whose total rearrangement cost is minimum. In this work, we present approximation algorithms for two new variations of the Sorting Permutations by Weighted Operations problem. The first variation uses a cost function equal to the number of fragmentations an operation causes in the permutation. The second uses a cost function proportional to the operation's length, along with the constraint that operations must be short. For each variation, we considered five problems with the following rearrangement models: unsigned reversals, signed reversals, transpositions, unsigned reversals and transpositions, and signed reversals and transpositions. For the problems of Sorting Permutations by Fragmentation-Weighted Operations, we present an analysis of the relation between the unweighted problems and this variation, two 2-approximation algorithms for each of the five rearrangement models, experimental results for these algorithms, and lower and upper bounds on the diameter of these problems. We also present properties of simple permutations and a 1.5-asymptotic-approximation algorithm for this class of permutations considering signed reversals and/or transpositions. For the problems of Sorting Permutations by Length-Weighted Short Operations, we present an analysis of the relation between the unweighted problems and this variation, constant-factor approximation algorithms for each of the five rearrangement models, and experimental results for these algorithms. In addition, we analyze the approximation factors of these algorithms when the cost function is l^alpha, where l is the operation's length and alpha > 1 is a constant.
    Mestrado, Ciência da Computação, Mestre em Ciência da Computação. Grants: 2017/16871-1 (FAPESP), 131182/2017-0 (CNPq).
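    Both variations score rearrangement operations applied to permutations. A minimal sketch of that machinery, using the standard breakpoint count for unsigned permutations (an assumption for illustration; the thesis's exact fragmentation cost is defined in the full text, not here):

```python
def breakpoints(perm):
    # Frame the permutation of 1..n with 0 and n+1; a breakpoint is a pair of
    # adjacent elements that are not consecutive integers (in either order).
    ext = [0] + list(perm) + [len(perm) + 1]
    return sum(1 for a, b in zip(ext, ext[1:]) if abs(a - b) != 1)

def reverse(perm, i, j):
    # Apply a reversal to the segment perm[i..j] (0-indexed, inclusive).
    return perm[:i] + perm[i:j + 1][::-1] + perm[j + 1:]

p = [1, 2, 4, 3]
print(breakpoints(p))    # 2: between adjacent pairs (2, 4) and (3, 5)
q = reverse(p, 2, 3)     # -> [1, 2, 3, 4]
print(breakpoints(q))    # 0: this reversal removed both breakpoints
```

    Since a reversal touches only the two segment endpoints, it can change the breakpoint count by at most 2, which is the kind of argument behind 2-approximation bounds for unweighted reversal sorting.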

    A New Measure for Analyzing and Fusing Sequences of Objects

    Get PDF
    This work is related to the combinatorial data analysis problem of seriation, used for data visualization and exploratory analysis. Seriation re-sequences the data so that more similar samples or objects appear closer together, whereas dissimilar ones are further apart. Despite the large number of algorithms for such re-sequencing, there has been no systematic way of analyzing the resulting sequences, comparing them, or fusing them to obtain a single unifying one. We propose a new positional proximity measure that evaluates the similarity of two arbitrary sequences based on their agreement on pairwise positional information of the sequenced objects. Furthermore, we present various statistical properties of this measure as well as its normalized version, modeled as an instance of the generalized correlation coefficient. Based on this measure, we define a new procedure for consensus seriation that fuses multiple arbitrary sequences based on a quadratic assignment problem formulation and an efficient way of approximating its solution. We also derive theoretical links with other permutation distance functions and present their associated combinatorial optimization forms for consensus tasks. The utility of the proposed contributions is demonstrated through the comparison and fusion of multiple seriation algorithms we have implemented, using many real-world datasets from different application domains.
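    The published measure is defined in the paper itself; as a hedged illustration only, one plausible way to score agreement on pairwise positional information is to correlate the pairwise positional distances of the two sequences. The function name `positional_agreement` and this exact formula are assumptions, not the paper's measure:

```python
from itertools import combinations
import math

def positional_agreement(seq_a, seq_b):
    # Pearson correlation of pairwise positional distances: for every
    # unordered pair of objects, compare how far apart they sit in each
    # sequence. Returns a value in [-1, 1]; 1 means identical spacing
    # structure. Requires at least 3 objects (otherwise variance is zero).
    pos_a = {obj: i for i, obj in enumerate(seq_a)}
    pos_b = {obj: i for i, obj in enumerate(seq_b)}
    pairs = list(combinations(seq_a, 2))
    xs = [abs(pos_a[u] - pos_a[v]) for u, v in pairs]
    ys = [abs(pos_b[u] - pos_b[v]) for u, v in pairs]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

print(positional_agreement("abcd", "abcd"))  # 1.0
print(positional_agreement("abcd", "dcba"))  # 1.0: a reversal keeps all pairwise distances
```

    Note that a reversed sequence scores 1.0, a desirable property for seriation, where an ordering and its reverse represent the same solution.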

    Dynamic set-up rules for hybrid flow shop scheduling with parallel batching machines

    Get PDF
    An S-stage hybrid (or flexible) flow shop with sequence-independent uniform set-up times and parallel batching machines with compatible parallel-batch families (as in casting, heat treatment in furnaces, chemical or galvanic baths, painting in autoclaves, etc.) has been analysed with the purpose of reducing the number of tardy jobs (and the makespan); in Graham's notation: FPB(m_1, m_2, ..., m_S) | p-batch, ST_si,b | Sum(U_i). Jobs are sorted dynamically (at each new delivery); batches are closed within sliding (or rolling) time windows and processed in parallel by multiple identical machines. Computational experiments on benchmarks have shown the superior performance of the two proposed heuristics, which are based on new formulations of the critical ratio (CRsetup) that consider the ratio of the remaining allowance to the set-up and processing time in the scheduling horizon, improving on the weighted modified operation due date rule.
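    The exact CRsetup formulation is the paper's contribution; the sketch below shows only the generic critical-ratio dispatching idea extended with set-up times, as an assumption for illustration. The function `cr_setup` and the job fields are hypothetical names:

```python
def cr_setup(due_date, now, setup_time, processing_time):
    # Generic critical-ratio index extended with set-up time: the remaining
    # allowance divided by the work (set-up + processing) still to be done.
    # Jobs with the smallest ratio are the most urgent.
    return (due_date - now) / (setup_time + processing_time)

# Hypothetical job queue, re-sorted dynamically at each new delivery.
jobs = [
    {"id": "J1", "due": 20, "setup": 2, "proc": 5},
    {"id": "J2", "due": 12, "setup": 1, "proc": 6},
    {"id": "J3", "due": 30, "setup": 3, "proc": 4},
]
now = 4
order = sorted(jobs, key=lambda j: cr_setup(j["due"], now, j["setup"], j["proc"]))
print([j["id"] for j in order])   # ['J2', 'J1', 'J3']: J2 has the smallest ratio, (12-4)/7
```

    Re-evaluating the ratio at each dispatching decision is what makes the rule dynamic: a job's urgency grows as its allowance shrinks.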

    Combinatorial data analysis for data ordering

    Get PDF
    Seriation is a combinatorial optimisation problem that aims to sequence a set of objects such that a natural ordering is created. A large variety of applications exist, ranging from archaeology to bioinformatics and text mining. Initially, a thorough and useful quantitative analysis compares different seriation algorithms using the positional proximity coefficient (PPC). This analysis helps the practitioner to understand how similar two algorithms are for a given set of datasets. The first contribution is consensus seriation. This method uses the principles of other consensus-based methods to combine different seriation solutions according to the PPC. As it creates a solution that no individual algorithm can create, its usefulness comes from combining different structural elements of each original algorithm. In particular, it is possible to create a solution that combines the local characteristics of one algorithm with the global characteristics of another. Experimental results show that, compared to consensus-ranking-based methods using the Hamming, Spearman and Kendall coefficients, consensus seriation using the PPC gives generally superior results according to the independent accumulated relative generalised anti-Robinson events measure. The second contribution is a metaheuristic for creating good approximate solutions to very large seriation problems. This adapted harmony search algorithm makes use of modified crossover operators taken from the genetic algorithm literature to optimise the least-squares criterion commonly used in seriation. As for all combinatorial optimisation problems, there is a need for metaheuristics that can produce better solutions more quickly.
    Results show that the algorithm consistently outperforms existing metaheuristics such as the genetic algorithm, particle swarm optimisation, simulated annealing and tabu search, as well as the genetic algorithm using the modified crossover operators, with the main advantage of creating a much superior result within very few iterations. These two major contributions offer practitioners and academics new tools to tackle seriation-related problems, and the work concludes with a suggested direction for future research.
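    The least-squares criterion commonly used in seriation scores an ordering by how well the rank distances reproduce the given dissimilarities. A minimal sketch (the brute-force search is for illustration only; a metaheuristic such as the adapted harmony search replaces it at scale):

```python
from itertools import permutations

def least_squares_criterion(dissim, order):
    # Sum over object pairs of (d[i][j] - |rank(i) - rank(j)|)^2,
    # where ranks come from the candidate ordering. Lower is better.
    pos = {obj: r for r, obj in enumerate(order)}
    n = len(order)
    return sum((dissim[i][j] - abs(pos[i] - pos[j])) ** 2
               for i in range(n) for j in range(i + 1, n))

# Toy dissimilarity matrix whose ideal spacing matches the identity order.
dissim = [[0, 1, 2],
          [1, 0, 1],
          [2, 1, 0]]

# Exhaustive search is feasible only for tiny n (n! candidate orders).
best = min(permutations(range(3)), key=lambda o: least_squares_criterion(dissim, o))
print(best, least_squares_criterion(dissim, best))   # loss 0 for (0, 1, 2)
```

    The factorial search space is exactly why metaheuristics that reach good solutions in few iterations matter here.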

    Precision measurements of the top quark mass from the Tevatron in the pre-LHC era

    Full text link
    The top quark is the heaviest of the six quarks of the Standard Model. Precise knowledge of its mass is important for imposing constraints on a number of physics processes, including interactions of the as-yet-unobserved Higgs boson. The Higgs boson is the only missing particle of the Standard Model, central to the electroweak symmetry-breaking mechanism and the generation of particle masses. In this Review, experimental measurements of the top quark mass accomplished at the Tevatron, a proton-antiproton collider located at the Fermi National Accelerator Laboratory, are described. Topologies of top quark events and methods used to separate signal events from background sources are discussed. Data analysis techniques used to extract information about the top mass value are reviewed. The combination of the several most precise measurements performed with the two Tevatron particle detectors, CDF and DØ, yields a value of m_t = 173.2 ± 0.9 GeV/c^2.
    Comment: This version contains the most up-to-date top quark mass average.
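    The actual Tevatron combination accounts for correlated systematic uncertainties; as a simpler illustration of the basic idea, the sketch below shows a plain inverse-variance weighted average of independent measurements. The input numbers are hypothetical, not the actual CDF/DØ results:

```python
import math

def combine(measurements):
    # Inverse-variance weighted average of independent measurements given as
    # (value, uncertainty) pairs; returns (combined value, combined uncertainty).
    weights = [1.0 / sigma ** 2 for _, sigma in measurements]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, measurements)) / total
    return value, 1.0 / math.sqrt(total)

# Hypothetical inputs in GeV/c^2, for illustration only.
v, s = combine([(173.0, 1.2), (173.5, 1.5)])
print(f"{v:.3f} +- {s:.3f}")   # ~173.195 +- ~0.937
```

    The combined uncertainty is always smaller than the best single input, which is why combining the most precise measurements sharpens the world average.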

    Data Structures & Algorithm Analysis in C++

    Get PDF
    This is the textbook for CSIS 215 at Liberty University.
    https://digitalcommons.liberty.edu/textbooks/1005/thumbnail.jp

    Scalable Community Detection

    Get PDF

    Sorting permutations by limited-size operations

    Get PDF
    Advisors: Zanoni Dias, Carla Negri Lintzmayer. Master's dissertation - Universidade Estadual de Campinas, Instituto de Computação.
    Abstract: The abstract is available with the full electronic version of the thesis.
    Mestrado, Ciência da Computação, Mestre em Ciência da Computação. Funding: CAPES.