Regular resolution for CNF of bounded incidence treewidth with few long clauses
We demonstrate that Regular Resolution is FPT for two restricted families of
CNFs of bounded incidence treewidth. The first includes CNFs in which removing
a bounded number of clauses results in a CNF of bounded primal treewidth. The
parameters we use in this case are the number of removed clauses and the primal
treewidth bound. The second class includes CNFs of bounded one-sided
(incidence) treewidth, a new parameter generalizing both primal treewidth and
incidence pathwidth. The parameter we use in this case is the one-sided
treewidth.
The Bolzano-Weierstrass Theorem is the Jump of Weak König's Lemma
We classify the computational content of the Bolzano-Weierstrass Theorem and
variants thereof in the Weihrauch lattice. For this purpose we first introduce
the concept of a derivative or jump in this lattice and we show that it has
some properties similar to the Turing jump. Using this concept we prove that
the derivative of closed choice of a computable metric space is the cluster
point problem of that space. By specialization to sequences with a relatively
compact range we obtain a characterization of the Bolzano-Weierstrass Theorem
as the derivative of compact choice. In particular, this shows that the
Bolzano-Weierstrass Theorem on real numbers is the jump of Weak König's
Lemma. Likewise, the Bolzano-Weierstrass Theorem on the binary space is the
jump of the lesser limited principle of omniscience LLPO and the
Bolzano-Weierstrass Theorem on natural numbers can be characterized as the jump
of the idempotent closure of LLPO. We also introduce the compositional product
of two Weihrauch degrees f and g as the supremum of the composition of any two
functions below f and g, respectively. We can express the main result such that
the Bolzano-Weierstrass Theorem is the compositional product of Weak König's
Lemma and the Monotone Convergence Theorem. We also study the class of weakly
limit computable functions, which are functions that can be obtained by
composition of weakly computable functions with limit computable functions. We
prove that the Bolzano-Weierstrass Theorem on real numbers is complete for this
class. Likewise, the unique cluster point problem on real numbers is complete
for the class of functions that are limit computable with finitely many mind
changes. We also prove that the Bolzano-Weierstrass Theorem on real numbers
and, more generally, the unbounded cluster point problem on real numbers is
uniformly low limit computable. Finally, we also discuss separation techniques.
Comment: This version includes an addendum by Andrea Cettolo, Matthias
Schröder, and the authors of the original paper. The addendum closes a gap
in the proof of Theorem 11.2, which characterizes the computational content
of the Bolzano-Weierstraß Theorem for arbitrary computable metric space
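The compositional product described in the abstract above can be written out as a formula. This is only a sketch of the verbal definition: the symbols $*$ and $\leq_{\mathrm{W}}$ (Weihrauch reducibility) are notation assumed here and do not appear in the abstract itself.

```latex
% f * g: supremum (in the Weihrauch lattice) of all compositions of a
% function below f with a function below g; notation assumed for illustration.
\[
  f * g \;=\; \sup \{\, f_0 \circ g_0 \;:\; f_0 \leq_{\mathrm{W}} f,\ g_0 \leq_{\mathrm{W}} g \,\}
\]
```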
Scheduling malleable task trees
Solving sparse linear systems can lead to processing tree workflows on a platform of processors. In this study, we use the model of malleable tasks motivated in [Prasanna96,Beaumont07] to study tree workflow schedules under two contradictory objectives: makespan minimization and memory minimization. First, we give a simpler proof of the result of [Prasanna96], which allows computing a makespan-optimal schedule for tree workflows. Then, we study a more realistic speed-up function and show that the previous schedules are not optimal in this context. Finally, we give complexity results concerning the objective of minimizing both makespan and memory.
Protein secondary structure prediction using BLAST and relaxed threshold rule induction from coverings
Protein structure prediction has always been an important research area in bioinformatics and biochemistry. Despite the recent breakthrough of combining multiple sequence alignment information and artificial intelligence algorithms to predict protein secondary structure, the Q₃ accuracy of various computational prediction methods has rarely exceeded 75%; this status has changed little since 2003, when Rost stated that the currently best methods reach a level around 77% three-state per-residue accuracy. The application of artificial neural network methods to this problem is revolutionary in the sense that those techniques employ the homologues of proteins for training and prediction. In this dissertation, a different approach, RT-RICO (Relaxed Threshold Rule Induction from Coverings), is presented that instead uses association rule mining. This approach still makes use of the fundamental principle that structure is more conserved than sequence. However, rules between each known secondary structure element and its neighboring amino acid residues are established to perform the predictions. This dissertation consists of five research articles that discuss different prediction techniques and detailed rule-generation algorithms. The most recent prediction approach, BLAST-RT-RICO, achieved a Q₃ accuracy score of 89.93% on the standard test dataset RS126 and a Q₃ score of 87.71% on the standard test dataset CB396, an improvement over comparable computational methods. One research article also discusses the results of examining those RT-RICO rules using an existing association rule visualization tool, modified to account for the non-Boolean characterization of protein secondary structure.
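The Q₃ score quoted above is a three-state per-residue accuracy: the percentage of residues whose predicted secondary-structure state matches the observed one. The following is a minimal sketch of that metric, assuming each structure is given as a string over the three states H (helix), E (strand), and C (coil). The function name `q3_accuracy` and the toy strings are hypothetical and are not taken from the dissertation.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Percentage of residues whose predicted state equals the observed state.

    Both arguments are assumed to be equal-length strings over H, E, C.
    """
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    # Count positions where the predicted and observed states agree.
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Hypothetical toy example: one of ten residues is mispredicted.
observed = "CCHHHHEEEC"
predicted = "CCHHHCEEEC"
print(q3_accuracy(predicted, observed))  # 90.0
```

On a real benchmark the score would typically be aggregated over all residues of all proteins in the test set. The sketch leaves that aggregation choice open.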
Topology-aware GPU scheduling for learning workloads in cloud environments
Recent advances in hardware, such as systems with multiple GPUs and their availability in the cloud, are enabling deep learning in various domains including health care, autonomous vehicles, and Internet of Things. Multi-GPU systems exhibit complex connectivity among GPUs and between GPUs and CPUs. Workload schedulers must consider hardware topology and workload communication requirements in order to allocate CPU and GPU resources for optimal execution time and improved utilization in shared cloud environments.
This paper presents a new topology-aware workload placement strategy to schedule deep learning jobs on multi-GPU systems. The placement strategy is evaluated with a prototype on a Power8 machine with Tesla P100 cards, showing speedups of up to ≈1.30x compared to state-of-the-art strategies; the proposed algorithm achieves this result by allocating GPUs that satisfy workload requirements while preventing interference. Additionally, a large-scale simulation shows that the proposed strategy provides higher resource utilization and performance in cloud systems.
This project is supported by the IBM/BSC Technology Center for Supercomputing collaboration agreement. It has also received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 639595). It is also partially supported by the Ministry of Economy of Spain under contract TIN2015-65316-P and Generalitat de Catalunya under contract 2014SGR1051, by the ICREA Academia program, and by the BSC-CNS Severo Ochoa program (SEV-2015-0493). We thank our IBM Research colleagues Alaa Youssef and Asser Tantawi for the valuable discussions. We also thank SC17 committee member Blair Bethwaite of Monash University for his constructive feedback on the earlier drafts of this paper.
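To make the idea of topology-aware placement concrete, here is a small illustrative sketch. It is not the algorithm proposed in the paper. It assumes a hypothetical symmetric matrix `COST` of pairwise link costs between GPUs (lower means closer, for example the same socket) and picks, by brute force, the set of free GPUs with the lowest total pairwise cost. The function name `place_job` and the cost values are assumptions for illustration only.

```python
from itertools import combinations


def place_job(cost, free_gpus, k):
    """Return the k free GPUs with the lowest total pairwise link cost.

    Returns None when fewer than k GPUs are free. Brute force, so only
    suitable for the handful of GPUs found in a single machine.
    """
    best, best_cost = None, float("inf")
    for subset in combinations(sorted(free_gpus), k):
        # Sum the link cost over every pair of GPUs in the candidate set.
        total = sum(cost[a][b] for a, b in combinations(subset, 2))
        if total < best_cost:
            best, best_cost = subset, total
    return best


# Hypothetical 4-GPU machine: GPUs 0-1 share a socket, as do GPUs 2-3.
COST = [
    [0, 1, 4, 4],
    [1, 0, 4, 4],
    [4, 4, 0, 1],
    [4, 4, 1, 0],
]

# GPU 0 is busy, so the closest free pair is (2, 3).
print(place_job(COST, {1, 2, 3}, 2))  # (2, 3)
```

A production scheduler would also need to account for interference between co-located jobs and CPU affinity, which this sketch deliberately ignores.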
Compile-time support for thread-level speculation
One of the main concerns of computer science is the study of the parallel capabilities of both programs and the processors that run them. Several reasons make the development of techniques that automatically parallelize code highly desirable. Among them are the immense number of existing sequential programs, the complexity of parallel programming languages, and the expertise required to parallelize code. However, the automatic parallelization mechanisms currently implemented in commercial compilers cannot parallelize most of the loops in a program [1], because of the data dependences that exist among them [2]. It is therefore necessary to search for new techniques, such as speculative parallelization [3-5], that exploit the potential parallel capabilities of current hardware and multiprocessor architectures. However, this and other techniques require manual intervention by experienced programmers.
Before offering alternative solutions, the parallelization capabilities of commercial compilers were evaluated, exposing the limitations of the automatic parallelization mechanisms they implement. The study reveals that these mechanisms achieve only a 19% speedup on average for the SPEC CPU2006 benchmarks [6], a result significantly below that obtained by speculative parallelization techniques [7]. However, speculative parallelization requires extensive manual modification of the code by programmers.
This thesis addresses this problem by defining a new OpenMP clause [8], called "speculative", that makes it possible to mark which variables may lead to a dependence violation. In addition, this thesis proposes a compile-time system that, using the information about variable accesses provided by the OpenMP clauses, automatically adds all the code needed to manage the speculative execution of a program. This frees the programmer from modifying the code manually, avoiding possible errors and a tedious task. The code generated by our system links with the speculative parallel execution library developed by Estebanez, García-Yagüez, Llanos and Gonzalez-Escribano [9,10].
Departamento de Informática (Arquitectura y Tecnología de Computadores, Ciencias de la Computación e Inteligencia Artificial, Lenguajes y Sistemas Informáticos