573 research outputs found

    On the use of biased-randomized algorithms for solving non-smooth optimization problems

    Soft constraints are quite common in real-life applications. For example, in freight transportation, the fleet size can be enlarged by outsourcing part of the distribution service, and some deliveries to customers can be postponed; in inventory management, it is possible to consider stock-outs generated by unexpected demand; and in manufacturing processes and project management, some deadlines frequently cannot be met due to delays in critical steps of the supply chain. However, capacity-, size-, and time-related limitations are included in many optimization problems as hard constraints, while it would usually be more realistic to consider them as soft ones, i.e., constraints that can be violated to some extent by incurring a penalty cost. Most of the time, this penalty cost will be nonlinear and even noncontinuous, which might transform the objective function into a non-smooth one. Despite their many practical applications, non-smooth optimization problems are quite challenging, especially when the underlying optimization problem is NP-hard. In this paper, we propose the use of biased-randomized algorithms as an effective methodology to cope with NP-hard and non-smooth optimization problems in many practical applications. Biased-randomized algorithms extend constructive heuristics by introducing a nonuniform randomization pattern into them. Hence, they can be used to explore promising areas of the solution space without the limitations of gradient-based approaches, which assume the existence of smooth objective functions. Moreover, biased-randomized algorithms can be easily parallelized, thus requiring short computing times while exploring a large number of promising regions. This paper discusses these concepts in detail, reviews existing work in different application areas, and highlights current trends and open research lines.
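
    The idea of adding a nonuniform randomization pattern to a constructive heuristic can be sketched as follows. This is only an illustration under stated assumptions, not the method of the paper: the toy problem (single-machine sequencing with soft due dates), the geometric distribution with parameter `beta`, the penalty shape, and all function names are hypothetical choices made for the example.

```python
import math
import random


def biased_index(n, beta, rng):
    """Pick a position in [0, n) with geometrically decreasing probability.

    Position 0 (the greedy choice) is the most likely; beta in (0, 1)
    controls how strongly the selection is skewed toward it.
    """
    u = 1.0 - rng.random()  # uniform in (0, 1], so log(u) is defined
    return int(math.log(u) / math.log(1.0 - beta)) % n


def penalty(lateness):
    """Assumed non-smooth soft-constraint penalty: a fixed charge plus a
    linear term as soon as the due date is violated (jump at lateness 0)."""
    return 0.0 if lateness <= 0 else 10.0 + 2.0 * lateness


def total_cost(order, jobs):
    """jobs[j] = (duration, due_date); sum the penalties along the sequence."""
    clock, cost = 0.0, 0.0
    for j in order:
        duration, due = jobs[j]
        clock += duration
        cost += penalty(clock - due)
    return cost


def greedy_order(jobs):
    """Base constructive heuristic: earliest due date first."""
    return sorted(range(len(jobs)), key=lambda j: jobs[j][1])


def biased_randomized_order(jobs, beta, rng):
    """Same heuristic, but each step draws from the sorted candidate list
    with a biased (geometric) distribution instead of taking the first."""
    candidates = greedy_order(jobs)
    order = []
    while candidates:
        order.append(candidates.pop(biased_index(len(candidates), beta, rng)))
    return order


def multi_start(jobs, beta=0.3, iterations=200, seed=0):
    """Repeat the randomized construction and keep the best sequence found."""
    rng = random.Random(seed)
    best = greedy_order(jobs)
    best_cost = total_cost(best, jobs)
    for _ in range(iterations):
        order = biased_randomized_order(jobs, beta, rng)
        cost = total_cost(order, jobs)
        if cost < best_cost:
            best, best_cost = order, cost
    return best, best_cost
```

    Because every iteration of `multi_start` is independent of the others, the loop could be split across workers, which is one way to read the parallelization remark in the abstract. No gradient of `total_cost` is needed at any point.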

    Algorithms for Viral Population Analysis

    The genetic structure of an intra-host viral population has an effect on many clinically important phenotypic traits such as escape from vaccine-induced immunity, virulence, and response to antiviral therapies. Next-generation sequencing provides read coverage sufficient for genomic reconstruction of a heterogeneous, yet highly similar, viral population and, more specifically, for the detection of rare variants. While depth is less of an issue for modern sequencers, the short length of generated reads complicates viral population assembly. The task is made harder by the presence of both random and systematic sequencing errors in huge amounts of data. In this dissertation I present completed work for reconstructing a viral population given next-generation sequencing data. Several algorithms are described for solving this problem under the error-free amplicon (or sliding-window) model. In order for these methods to handle real-world data, an error-correction method is proposed. A formal derivation of its likelihood model along with optimization steps for an EM algorithm are presented. Although these methods perform well, they cannot take into account paired-end sequencing data. To address this, a new method is detailed that works under the error-free paired-end case along with maximum a posteriori estimation of the model parameters.

    Computational Methods for Sequencing and Analysis of Heterogeneous RNA Populations

    Next-generation sequencing (NGS) and mass spectrometry technologies bring unprecedented throughput, scalability and speed, facilitating the study of biological systems. These technologies make it possible to sequence and analyze heterogeneous RNA populations rather than single sequences. In particular, they provide the opportunity to implement massive viral surveillance and transcriptome quantification. However, in order to fully exploit the capabilities of NGS technology we need to develop computational methods able to analyze billions of reads for assembly and characterization of sampled RNA populations. In this work we present novel computational methods for cost- and time-effective analysis of sequencing data from viral and RNA samples. In particular, we describe: i) computational methods for transcriptome reconstruction and quantification; ii) a method for mass spectrometry data analysis; iii) a combinatorial pooling method; iv) computational methods for analysis of intra-host viral populations.

    A Graph-Theoretic Barcode Ordering Model for Linked-Reads

    Considering a set of intervals on the real line, an interval graph records these intervals as nodes and their intersections as edges. Identifying (i.e. merging) pairs of nodes in an interval graph results in a multiple-interval graph. Given only the nodes and the edges of the multiple-interval graph without knowing the underlying intervals, we are interested in the following questions. Can one determine how many intervals correspond to each node? Can one compute a walk over the multiple-interval graph nodes that reflects the ordering of the original intervals? These questions are closely related to linked-read DNA sequencing, where barcodes are assigned to long molecules whose intersection graph forms an interval graph. Each barcode may correspond to multiple molecules, which complicates downstream analysis and corresponds to the identification of nodes of the corresponding interval graph. Resolving the above graph-theoretic problems would facilitate analyses of linked-read sequencing data by enabling the conceptual separation of barcodes into molecules and by providing, through the order of the molecules, a skeleton for accurately assembling the genome. Here, we propose a framework that takes as input an arbitrary intersection graph (such as an overlap graph of barcodes) and constructs a heuristic approximation of the ordering of the original intervals.
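
    The two constructions described above, building an interval graph and then identifying nodes, can be illustrated with a small sketch. It does not implement the proposed ordering framework; the molecule names, coordinates, barcode labels, and the choice to drop self-loops after merging are assumptions made for the example.

```python
from itertools import combinations


def interval_graph(intervals):
    """Build the interval graph of closed intervals.

    intervals: dict mapping a node name to a (start, end) pair.
    Returns a set of undirected edges, each stored as a frozenset.
    """
    edges = set()
    for (a, (s1, e1)), (b, (s2, e2)) in combinations(intervals.items(), 2):
        if s1 <= e2 and s2 <= e1:  # the two closed intervals intersect
            edges.add(frozenset((a, b)))
    return edges


def identify_nodes(edges, label):
    """Merge all nodes that carry the same label (e.g. the same barcode).

    Edges between two nodes with the same label would become self-loops;
    this sketch drops them (an assumption, not taken from the abstract).
    """
    merged = set()
    for edge in edges:
        a, b = tuple(edge)
        if label[a] != label[b]:
            merged.add(frozenset((label[a], label[b])))
    return merged


# Hypothetical molecules laid out along a genome, as (start, end) positions.
molecules = {"m1": (0, 10), "m2": (8, 20), "m3": (18, 30), "m4": (28, 40)}
# Barcode "A" tags two different molecules, m1 and m4.
barcode = {"m1": "A", "m2": "B", "m3": "C", "m4": "A"}

molecule_graph = interval_graph(molecules)
barcode_graph = identify_nodes(molecule_graph, barcode)
```

    In this toy case the molecule graph is a path (m1, m2, m3, m4), while the barcode graph is a triangle on A, B and C. The linear order is no longer visible from the merged graph alone, which is the kind of ambiguity the two questions in the abstract are about.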

    Common operation scheduling with general processing times: A branch-and-cut algorithm to minimize the weighted number of tardy jobs

    Common operation scheduling (COS) problems arise in real-world applications, such as industrial processes of material cutting or component dismantling. In COS, distinct jobs may share operations, and when an operation is done, it is done for all the jobs that share it. Here we propose a 0-1 LP formulation with exponentially many inequalities to minimize the weighted number of tardy jobs. Separation of inequalities is in NP, provided that an ordinary min Lmax scheduling problem is in P. We develop a branch-and-cut algorithm for two cases: one machine with precedence relations, and identical parallel machines with unit operation times. In these cases separation is the constrained maximization of a submodular set function. A previous method is modified to tackle the two cases and compared to our algorithm. We report on tests conducted on both industrial and artificial instances. For a single machine and general processing times the new method clearly outperforms the other, thereby extending the range of COS applications.
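
    The objective in this abstract, the weighted number of tardy jobs when jobs share operations, can be made concrete with a small sketch for one machine. It is not the branch-and-cut algorithm: the exhaustive search over sequences only works for tiny instances, precedence relations are ignored, and the rule that a job completes when the last of its operations completes is an assumption, as are the instance data and function names.

```python
from itertools import permutations


def weighted_tardy_jobs(sequence, proc_time, jobs):
    """Evaluate one operation sequence on a single machine.

    sequence:  order in which the operations are processed
    proc_time: dict operation -> processing time
    jobs:      list of (set_of_operations, due_date, weight)
    A shared operation is processed once and counts for every job using it.
    """
    finish, clock = {}, 0
    for op in sequence:
        clock += proc_time[op]
        finish[op] = clock
    total = 0
    for ops, due, weight in jobs:
        completion = max(finish[op] for op in ops)  # assumed completion rule
        if completion > due:
            total += weight
    return total


def best_sequence(proc_time, jobs):
    """Brute-force search over all operation orders (tiny instances only)."""
    return min(
        permutations(sorted(proc_time)),
        key=lambda seq: weighted_tardy_jobs(seq, proc_time, jobs),
    )


# Hypothetical instance: operation "b" is shared by the first two jobs,
# operation "a" by the first and third.
proc_time = {"a": 2, "b": 3, "c": 1}
jobs = [({"a", "b"}, 5, 4), ({"b", "c"}, 4, 3), ({"a"}, 2, 1)]

sequence = best_sequence(proc_time, jobs)
value = weighted_tardy_jobs(sequence, proc_time, jobs)
```

    On this instance the sequence a, b, c leaves only the second job tardy, for a weighted count of 3, and no other order does better.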

    On Fork-Join Queues and Maximum Ratio Cliques

    This dissertation consists of two parts. The first part addresses the problem of response time estimation in fork-join queueing networks. These systems have been studied in the literature for more than thirty years. Estimating the mean response time has proven notoriously hard for most forms of these queueing systems. In this work, simple expressions for the mean response time are proposed as conjectures. Extensive experiments demonstrate the remarkable accuracy of these conjectures. Algorithms for the estimation of response time using these conjectures are proposed. For many of the networks studied in this dissertation, no approximations of their response time are known in the literature. Therefore, the contribution of this dissertation in this direction marks significant progress in the analysis of fork-join queues. The second part of this dissertation introduces a fractional version of the classical maximum weight clique problem, the maximum ratio clique problem, which is to find a maximal clique that has the largest ratio of benefit to cost weights associated with the clique's vertices. This problem is formulated to model networks in which the vertices have a benefit as well as a cost associated with them. The maximum ratio clique problem finds applications in a wide range of areas including social networks, stock market graphs and wind farm location. NP-completeness of the decision version of the problem is established, and three solution methods are proposed. The results of numerical experiments with standard graph instances, as well as with real-life instances arising in finance and energy systems, are reported.
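
    The maximum ratio clique problem stated above can be illustrated on a tiny graph. This is an exhaustive sketch, not one of the three solution methods of the dissertation: it assumes the ratio is the sum of vertex benefits divided by the sum of vertex costs, that all costs are positive, and it uses a hypothetical four-vertex instance.

```python
from itertools import combinations


def is_clique(nodes, adj):
    """True if every pair of the given vertices is adjacent."""
    return all(v in adj[u] for u, v in combinations(nodes, 2))


def is_maximal(nodes, adj):
    """True if no outside vertex is adjacent to all of the given vertices."""
    members = set(nodes)
    return not any(members <= adj[w] for w in adj if w not in members)


def max_ratio_clique(adj, benefit, cost):
    """Enumerate all vertex subsets (exponential time, tiny graphs only) and
    return the maximal clique with the largest benefit-to-cost ratio."""
    best, best_ratio = None, float("-inf")
    vertices = sorted(adj)
    for size in range(1, len(vertices) + 1):
        for nodes in combinations(vertices, size):
            if is_clique(nodes, adj) and is_maximal(nodes, adj):
                ratio = sum(benefit[v] for v in nodes) / sum(cost[v] for v in nodes)
                if ratio > best_ratio:
                    best, best_ratio = nodes, ratio
    return best, best_ratio


# Hypothetical instance: a triangle on 1, 2, 3 plus a pendant edge 3-4.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
benefit = {1: 3, 2: 1, 3: 2, 4: 6}
cost = {1: 2, 2: 2, 3: 1, 4: 2}

clique, ratio = max_ratio_clique(adj, benefit, cost)
```

    The graph has two maximal cliques. The triangle {1, 2, 3} has ratio 6/5, while the edge {3, 4} has ratio 8/3, so the smaller clique wins here. Restricting the search to maximal cliques matters: without it, a single high-ratio vertex could be returned.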