3,071,853 research outputs found

    Efficient Computation of Sequence Mappability

    Get PDF
    Sequence mappability is an important task in genome re-sequencing. In the $(k,m)$-mappability problem, for a given sequence $T$ of length $n$, our goal is to compute a table whose $i$th entry is the number of indices $j \ne i$ such that the length-$m$ substrings of $T$ starting at positions $i$ and $j$ have at most $k$ mismatches. Previous works on this problem focused on heuristic approaches to compute a rough approximation of the result or on the case of $k=1$. We present several efficient algorithms for the general case of the problem. Our main result is an algorithm that works in $\mathcal{O}(n \min\{m^k,\log^{k+1} n\})$ time and $\mathcal{O}(n)$ space for $k=\mathcal{O}(1)$. It requires a careful adaptation of the technique of Cole et al.~[STOC 2004] to avoid multiple counting of pairs of substrings. We also show $\mathcal{O}(n^2)$-time algorithms to compute all results for a fixed $m$ and all $k=0,\ldots,m$, or for a fixed $k$ and all $m=k,\ldots,n-1$. Finally, we show that the $(k,m)$-mappability problem cannot be solved in strongly subquadratic time for $k,m = \Theta(\log n)$ unless the Strong Exponential Time Hypothesis fails. Comment: Accepted to SPIRE 201
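
    For reference, a minimal brute-force sketch of the table defined above. It runs in quadratic time and is not the paper's $\mathcal{O}(n \min\{m^k,\log^{k+1} n\})$ algorithm; the function name and the example string are illustrative assumptions.

```python
def mappability(T: str, k: int, m: int) -> list[int]:
    """Naive (k,m)-mappability table: entry i counts indices j != i such that
    the length-m substrings of T starting at i and j differ in at most k
    positions. O(n^2 * m) time; a reference sketch only."""
    n = len(T)
    positions = range(n - m + 1)          # starting positions of length-m substrings
    table = [0] * (n - m + 1)
    for i in positions:
        for j in positions:
            if i == j:
                continue
            mismatches = sum(T[i + t] != T[j + t] for t in range(m))
            if mismatches <= k:
                table[i] += 1
    return table

# Example: T = "aabaa", m = 2, k = 1. The length-2 substrings are
# "aa", "ab", "ba", "aa"; only "ab" vs "ba" exceeds one mismatch.
print(mappability("aabaa", 1, 2))   # [3, 2, 2, 3]
```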

    Efficient Algorithms for Scheduling Moldable Tasks

    Full text link
    We study the problem of scheduling $n$ independent moldable tasks on $m$ processors that arises in large-scale parallel computations. When tasks are monotonic, the best known result is a $(\frac{3}{2}+\epsilon)$-approximation algorithm for makespan minimization with a complexity linear in $n$ and polynomial in $\log{m}$ and $\frac{1}{\epsilon}$, where $\epsilon$ is arbitrarily small. We propose a new perspective on the existing speedup models: the speedup of a task $T_{j}$ is linear when the number $p$ of assigned processors is small (up to a threshold $\delta_{j}$), while it presents monotonicity when $p$ ranges in $[\delta_{j}, k_{j}]$; the bound $k_{j}$ indicates an unacceptable overhead when parallelizing on too many processors. For a given integer $\delta\geq 5$, let $u=\left\lceil \sqrt[2]{\delta} \right\rceil-1$. In this paper, we propose a $\frac{1}{\theta(\delta)}(1+\epsilon)$-approximation algorithm for makespan minimization with a complexity $\mathcal{O}(n\log{\frac{n}{\epsilon}}\log{m})$, where $\theta(\delta) = \frac{u+1}{u+2}\left( 1- \frac{k}{m} \right)$ ($m\gg k$). As a by-product, we also propose a $\theta(\delta)$-approximation algorithm for throughput maximization with a common deadline, with a complexity $\mathcal{O}(n^{2}\log{m})$.
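
    As a quick illustration of the ratio stated above, a small sketch that evaluates $u$ and $\theta(\delta)$ for given $\delta$, $k$, $m$; the function name and the sample values are assumptions made only for this example.

```python
import math

def theta(delta: int, k: int, m: int) -> float:
    """Approximation-ratio factor as stated in the abstract: for an integer
    delta >= 5, u = ceil(sqrt(delta)) - 1 and
    theta(delta) = (u+1)/(u+2) * (1 - k/m), assuming m >> k.
    Parameter names follow the abstract; illustrative only."""
    assert delta >= 5 and m > k
    u = math.ceil(math.sqrt(delta)) - 1
    return (u + 1) / (u + 2) * (1 - k / m)

# Example: delta = 9 gives u = 2, so theta = (3/4) * (1 - k/m);
# with k = 10 and m = 1000 this is approximately 0.7425.
print(round(theta(9, 10, 1000), 4))   # 0.7425
```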

    Vinogradov systems with a slice off

    Get PDF
    Let $I_{s,k,r}(X)$ denote the number of integral solutions of the modified Vinogradov system of equations $$x_1^j+\ldots +x_s^j=y_1^j+\ldots +y_s^j\quad (\text{$1\le j\le k$, $j\ne r$}),$$ with $1\le x_i,y_i\le X$ $(1\le i\le s)$. By exploiting sharp estimates for an auxiliary mean value, we obtain bounds for $I_{s,k,r}(X)$ for $1\le r\le k-1$. In particular, when $s,k\in \mathbb{N}$ satisfy $k\ge 3$ and $1\le s\le (k^2-1)/2$, we establish the essentially diagonal behaviour $I_{s,k,1}(X)\ll X^{s+\epsilon}$. Comment: 19 pages
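
    To make the "slice off" concrete, here is the system above instantiated for one small choice of parameters, $k=3$ and $r=1$ (the case covered by the diagonal bound): the linear equation is removed and only the exponents $j=2,3$ remain. This is just the stated definition spelled out, not an additional result.

```latex
% Modified Vinogradov system for k = 3, r = 1: the j = 1 (linear)
% equation is sliced off, leaving the exponents j = 2 and j = 3.
\begin{align*}
  x_1^2 + \cdots + x_s^2 &= y_1^2 + \cdots + y_s^2,\\
  x_1^3 + \cdots + x_s^3 &= y_1^3 + \cdots + y_s^3,
\end{align*}
% with 1 \le x_i, y_i \le X for 1 \le i \le s; I_{s,3,1}(X) counts the
% integral solutions of this system.
```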

    Efficient self-sustained pulsed CO laser

    Get PDF
    In this paper a simple sealed-off TEA CO laser is described with a self-sustained discharge and without an external UV preionization source. At 77 K this system yields more than 600 mJ from a lasing volume of about 60 cm³ of a CO-N₂-He mixture (45 J/ℓ·atm with 15.6% efficiency).

    Cache-Oblivious Selection in Sorted X+Y Matrices

    Full text link
    Let X[0..n-1] and Y[0..m-1] be two sorted arrays, and define the m x n matrix A by A[j][i] = X[i] + Y[j]. Frederickson and Johnson gave an efficient algorithm for selecting the k-th smallest element from A. We show how to make this algorithm I/O-efficient. Our cache-oblivious algorithm performs O((m+n)/B) I/Os, where B is the block size of memory transfers.
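
    To illustrate the selection problem itself, a short heap-based sketch that picks the k-th smallest entry of the implicit matrix A. This is neither the Frederickson-Johnson algorithm nor the cache-oblivious variant described above; the function name and sample arrays are illustrative.

```python
import heapq

def select_xy(X: list[int], Y: list[int], k: int) -> int:
    """k-th smallest (1-based) entry of the implicit matrix A[j][i] = X[i] + Y[j],
    where X and Y are sorted. Simple O(m + k log m) heap sketch, for
    illustration only."""
    # Each row j of A is sorted because X is sorted; keep one cursor per row.
    heap = [(X[0] + y, j, 0) for j, y in enumerate(Y)]
    heapq.heapify(heap)
    for _ in range(k - 1):
        _, j, i = heapq.heappop(heap)
        if i + 1 < len(X):
            heapq.heappush(heap, (X[i + 1] + Y[j], j, i + 1))
    return heap[0][0]

# Example: X = [1, 3, 5], Y = [2, 4]; the sums in sorted order are
# 3, 5, 5, 7, 7, 9, so the 4th smallest is 7.
print(select_xy([1, 3, 5], [2, 4], 4))   # 7
```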

    Statistics of Partial Minima

    Full text link
    Motivated by multi-objective optimization, we study extrema of a set of N points independently distributed inside the d-dimensional hypercube. A point in this set is k-dominated by another point when at least k of its coordinates are larger, and is a k-minimum if it is not k-dominated by any other point. We obtain statistical properties of these partial minima using exact probabilistic methods and heuristic scaling techniques. The average number of partial minima, A, decays algebraically with the total number of points, A ~ N^{-(d-k)/k}, when 1<=k<d. Interestingly, there are k-1 distinct scaling laws characterizing the largest coordinates, as the distribution P(y_j) of the j-th largest coordinate, y_j, decays algebraically, P(y_j) ~ (y_j)^{-alpha_j-1}, with alpha_j = j(d-k)/(k-j) for 1<=j<=k-1. The average number of partial minima grows logarithmically, A ~ [1/(d-1)!](ln N)^{d-1}, when k=d. The full distribution of the number of minima is obtained in closed form in two dimensions. Comment: 6 pages, 1 figure
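
    A small simulation sketch of the definition above: it counts k-minima by brute force and estimates the average A over random point sets. The reading of "k-dominated" (the other point is smaller in at least k coordinates), the parameter values, and the function names are assumptions made for illustration; the O(N^2 d) check is only meant for small N.

```python
import random

def count_k_minima(points: list[tuple[float, ...]], k: int) -> int:
    """Number of k-minima: a point is k-dominated if some other point is
    smaller in at least k of its coordinates, and a k-minimum otherwise.
    Brute-force O(N^2 d) check, for illustration only."""
    def k_dominates(p, q):  # does p k-dominate q (p smaller in >= k coordinates)?
        return sum(qc > pc for pc, qc in zip(p, q)) >= k
    return sum(not any(k_dominates(p, q) for p in points if p is not q)
               for q in points)

# Monte Carlo estimate of the average number A of k-minima for N points
# drawn uniformly from the d-dimensional unit hypercube (d = 3, k = 2 here);
# the abstract predicts A ~ N^{-(d-k)/k} decay for 1 <= k < d.
random.seed(0)
N, d, k, trials = 200, 3, 2, 20
avg = sum(count_k_minima([tuple(random.random() for _ in range(d))
                          for _ in range(N)], k)
          for _ in range(trials)) / trials
print(f"average number of {k}-minima among {N} points: {avg:.3f}")
```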