4,183 research outputs found

    MaxSSmap: A GPU program for mapping divergent short reads to genomes with the maximum scoring subsequence

    Programs based on hash tables and the Burrows-Wheeler transform are very fast for mapping short reads to genomes but have low accuracy in the presence of mismatches and gaps. Such reads can be aligned accurately with the Smith-Waterman algorithm, but it can take hours or days to map millions of reads even for bacterial genomes. We introduce a GPU program called MaxSSmap with the aim of achieving comparable accuracy to Smith-Waterman but with faster runtimes. Similar to most programs, MaxSSmap identifies a local region of the genome followed by exact alignment. Instead of using hash tables or Burrows-Wheeler in the first part, MaxSSmap calculates the maximum scoring subsequence score between the read and disjoint fragments of the genome in parallel on a GPU and selects the highest scoring fragment for exact alignment. We evaluate MaxSSmap's accuracy and runtime when mapping simulated Illumina E.coli and human chromosome one reads of different lengths and 10% to 30% mismatches with gaps to the E.coli genome and human chromosome one. We also demonstrate applications on real data by mapping ancient horse DNA reads to modern genomes and unmapped paired reads from NA12878 in 1000 genomes. We show that MaxSSmap attains comparable high accuracy and low error to fast Smith-Waterman programs yet has much lower runtimes. We show that MaxSSmap can map reads rejected by BWA and NextGenMap with high accuracy and low error much faster than if Smith-Waterman were used. On short read lengths of 36 and 51, both MaxSSmap and Smith-Waterman have lower accuracy than at longer read lengths. On real data MaxSSmap produces many alignments with high score and mapping quality that are not given by NextGenMap and BWA. The MaxSSmap source code is freely available from http://www.cs.njit.edu/usman/MaxSSmap
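The fragment-selection step described in this abstract can be illustrated with a small CPU sketch. It is not the MaxSSmap implementation: it runs sequentially rather than on a GPU, it compares the read to each fragment position by position without gaps, and the +1/-1 scores, the fragment size (equal to the read length), and the function names `max_scoring_subsequence` and `best_fragment` are all assumptions made for illustration.

```python
def max_scoring_subsequence(read, fragment, match=1, mismatch=-1):
    """Best contiguous run score for a position-by-position comparison.

    Each aligned position contributes `match` or `mismatch` (assumed values);
    the running sum is reset to zero whenever it drops below zero, which is
    the classic maximum-sum contiguous subsequence scan.
    """
    best = 0
    running = 0
    for r, g in zip(read, fragment):
        running += match if r == g else mismatch
        if running < 0:
            running = 0
        elif running > best:
            best = running
    return best


def best_fragment(read, genome):
    """Return (start, score) of the highest scoring disjoint fragment.

    The genome is cut into non-overlapping fragments of the read's length
    (an assumption); ties are resolved in favour of the earliest fragment.
    """
    size = len(read)
    if size == 0 or len(genome) < size:
        raise ValueError("genome must be at least as long as a non-empty read")
    best_start, best_score = 0, -1
    for start in range(0, len(genome) - size + 1, size):
        score = max_scoring_subsequence(read, genome[start:start + size])
        if score > best_score:
            best_start, best_score = start, score
    return best_start, best_score


if __name__ == "__main__":
    genome = "ACGTACGT" + "TTTTGGGG" + "ACGAACGT"
    read = "TTTAGGGG"  # one mismatch against the middle fragment
    print(best_fragment(read, genome))
```

In this sketch each fragment is scored independently of the others, which is the property that would let the per-fragment work be distributed across GPU threads; the selected fragment would then be passed to an exact aligner.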

    A Lagrangian for Hamiltonian vector fields on singular Poisson manifolds

    On a manifold equipped with a bivector field, we introduce for every Hamiltonian a Lagrangian on paths valued in the cotangent space whose stationary points project onto Hamiltonian vector fields. We show that the remaining components of those stationary points tell whether the bivector field is Poisson or at least defines an integrable distribution - a class of bivector fields generalizing twisted Poisson structures that we study in detail. Comment: 27 pages

    A re-examination of the Salicornias (Amaranthaceae) of Saudi Arabia and their polymorphs

    During the period from 1964 to 1999, Saudi Arabian species of Salicornia were wrongly treated under the European species S. europaea L. Recent explorations proved that there are two separate allopatric species of Salicornia in Saudi Arabia, one inhabiting the inland salt-marshes of the Najd (highlands) and the other inhabiting the Arabian Gulf Coast (lowlands). Morphological, ecological and exploratory studies confirm that they are two distinct species. The two species differ in features of bark, axillary spikes, basal vegetative segment(s) of spike, fertile segments, colour of senescent plants, and flowering, fruiting and germination phenology. As both species have been described earlier from Iran, they are now new records for Saudi Arabia. The species are S. persica ssp. iranica (Akhani) Kadereit & Piirainen and S. sinus-persica Akhani. S. sinus-persica, of which the status was thought doubtful, has been confirmed. Both species have been described and illustrated. Each species comprises a number of polymorphs. As leaves and flowers are rudimentary, confusing species circumscriptions, a proliferation of binomials has resulted in the taxonomy of Salicornia. To mitigate such confusion, the full range of variability of the Saudi Arabian species has been documented.

    Development and evaluation of machine learning algorithms for biomedical applications

    Gene network inference and drug response prediction are two important problems in computational biomedicine. The former helps scientists better understand the functional elements and regulatory circuits of cells. The latter helps a physician gain full understanding of the effective treatment on patients. Both problems have been widely studied, though current solutions are far from perfect. More research is needed to improve the accuracy of existing approaches. This dissertation develops machine learning and data mining algorithms, and applies these algorithms to solve the two important biomedical problems. Specifically, to tackle the gene network inference problem, the dissertation proposes (i) new techniques for selecting topological features suitable for link prediction in gene networks; (ii) a graph sparsification method for network sampling; (iii) combined supervised and unsupervised methods to infer gene networks; and (iv) sampling and boosting techniques for reverse engineering gene networks. For the drug sensitivity prediction problem, the dissertation presents (i) an instance selection technique and hybrid method for drug sensitivity prediction; (ii) a link prediction approach to drug sensitivity prediction; (iii) a noise-filtering method for drug sensitivity prediction; and (iv) transfer learning approaches for enhancing the performance of drug sensitivity prediction. Substantial experiments are conducted to evaluate the effectiveness and efficiency of the proposed algorithms. Experimental results demonstrate the feasibility of the algorithms and their superiority over the existing approaches.

    American Options Based on Malliavin Calculus and Nonparametric Variance Reduction Methods

    This paper is devoted to pricing American options using Monte Carlo and the Malliavin calculus. Unlike the majority of articles related to this topic, in this work we will not use localization functions to reduce the variance. Our method is based on expressing the conditional expectation $E[f(S_t) \mid S_s]$ using the Malliavin calculus without localization. Then the variance of the estimator of $E[f(S_t) \mid S_s]$ is reduced using closed formulas, techniques based on conditioning and a judicious choice of the number of simulated paths. Finally, we perform the stopping times version of the dynamic programming algorithm to decrease the bias. On the one hand, we will develop the Malliavin calculus tools for exponential multi-dimensional diffusions that have deterministic and non-constant coefficients. On the other hand, we will detail various nonparametric techniques to reduce the variance. Moreover, we will test the numerical efficiency of our method on a heterogeneous CPU/GPU multi-core machine.

    Two series of polyhedral fundamental domains for Lorentz bi-quotients

    The main aim of this paper is to give two infinite series of examples of Lorentz space forms that can be obtained from Lorentz polyhedra by identification of faces. These Lorentz space forms are bi-quotients of the form $\Gamma_1\backslash G/\Gamma_2$, where $G=\widetilde{\operatorname{SU}(1,1)}\cong\widetilde{\operatorname{SL}(2,{\mathbb R})}$ is a simply connected Lie group with the Lorentz metric given by the Killing form, $\Gamma_1$ and $\Gamma_2$ are discrete subgroups of $G$ and $\Gamma_2$ is cyclic. A construction of polyhedral fundamental domains for the action of $\Gamma_1\times\Gamma_2$ on $G$ via $(g,h)\cdot x=gxh^{-1}$ was given in the earlier work of the second author. In this paper we give an explicit description of the fundamental domains obtained by this construction for two infinite series of groups. These results are connected to singularity theory as the bi-quotients $\Gamma_1\backslash G/\Gamma_2$ appear as links of certain quasi-homogeneous $\mathbb Q$-Gorenstein surface singularities, i.e. the intersections of the singular variety with sufficiently small spheres around the isolated singular point. Comment: 16 pages, 6 figures, 2 tables of figures
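As a brief check on the action quoted in this abstract, the rule sending a pair (g, h) and a point x to g x h^{-1} does define a left action of the product group on G. The elements g_1, g_2, h_1, h_2 and the identity e below are notation introduced here only for illustration.

```latex
\[
(g_1,h_1)\cdot\bigl((g_2,h_2)\cdot x\bigr)
  = g_1\,(g_2\,x\,h_2^{-1})\,h_1^{-1}
  = (g_1 g_2)\,x\,(h_1 h_2)^{-1}
  = (g_1 g_2,\,h_1 h_2)\cdot x,
\qquad
(e,e)\cdot x = x .
\]
```

The inverse on the right-hand factor is what makes the composition come out in the correct order, since $h_2^{-1}h_1^{-1}=(h_1h_2)^{-1}$.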

    Developing an IS-impact decision tool: A literature based design science roadmap

    This paper derives from research-in-progress intending both Design Research (DR) and Design Science (DS) outputs; the former a management decision tool based in IS-Impact (Gable et al. 2008) kernel theory; the latter being methodological learnings deriving from synthesis of the literature and reflection on the DR ‘case study’ experience. The paper introduces a generic, detailed and pragmatic DS ‘Research Roadmap’ or methodology, deriving at this stage primarily from synthesis and harmonization of relevant concepts identified through systematic archival analysis of related literature. The scope of the Roadmap too has been influenced by the parallel study aim to undertake DR applying and further evolving the Roadmap. The Roadmap is presented in attention to the dearth of detailed guidance available to novice Researchers in Design Science Research (DSR), and though preliminary, is expected to evolve and gradually be substantiated through experience of its application. A key distinction of the Roadmap from other DSR methods is its breadth of coverage of published DSR concepts and activities; its detail and scope. It represents a useful synthesis and integration of otherwise highly disparate DSR-related concepts

    Determination of Seed Viability of Eight Wild Saudi Arabian Species by Germination and X-Ray Tests

    Our purpose was to evaluate the usefulness of the germination vs. the X-ray test in determining the initial viability of seeds of eight wild species (Salvia spinosa, Salvia aegyptiaca, Ochradenus baccatus, Ochradenus arabicus, Suaeda aegyptiaca, Suaeda vermiculata, Prosopis farcta and Panicum turgidum) from Saudi Arabia. Several days were required to determine viability of all eight species via germination tests, while immediate results on filled/viable seeds were obtained with the X-ray test. Seeds of all the species, except Sa. aegyptiaca, showed high viability in both the germination (98–70% at 25/15 °C, 93–66% at 35/25 °C) and X-ray (100–75%) test. Furthermore, there was general agreement between the germination (10% at 25/15 °C and 8% at 35/25 °C) and X-ray (5%) tests that seed viability of Sa. aegyptiaca was very low, and X-ray analysis revealed that this was due to poor embryo development. Seeds of P. farcta have physical dormancy, which was broken by scarification in concentrated sulfuric acid (10 min), and they exhibited high viability in both the germination (98% at 25/15 °C and 93% at 35/25 °C) and X-ray (98%) test. Most of the nongerminated seeds of the eight species except those of Sa. aegyptiaca were alive as judged by the tetrazolium test (TZ). Thus, for the eight species examined, the X-ray test was a good and rapid predictor of seed viability.