Reconstrucción discursiva del pasado y reescritura de la historia
This paper reviews some of the basic assumptions behind the usual image of historical time. To do so, it discusses the theoretical attacks that the constructivist view has made against cognitive positivism, especially in the debate on the character and nature of the historiographical operation.
ParDRe: faster parallel duplicated reads removal tool for sequencing studies
This is a pre-copyedited, author-produced version of an article accepted for publication in Bioinformatics following peer review. The version of record is available online at: https://doi.org/10.1093/bioinformatics/btw038
[Abstract] Summary: Current next generation sequencing technologies often generate duplicated or near-duplicated reads that (depending on the application scenario) do not provide any interesting biological information but can increase memory requirements and computational time of downstream analysis. In this work we present ParDRe, a de novo parallel tool to remove duplicated and near-duplicated reads through the clustering of Single-End or Paired-End sequences from fasta or fastq files. It uses a novel bitwise approach to compare the suffixes of DNA strings and employs hybrid MPI/multithreading to reduce runtime on multicore systems. We show that ParDRe is up to 27.29 times faster than Fulcrum (a representative state-of-the-art tool) on a platform with two 8-core Sandy-Bridge processors.
Availability and implementation: Source code in C++ and MPI running on Linux systems as well as a reference manual are available at https://sourceforge.net/projects/pardre
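The abstract does not spell out how the bitwise comparison works. As a rough illustration of the general idea only, and not ParDRe's actual implementation, the sketch below packs each base into 2 bits and counts the mismatches between two equal-length reads with a single XOR. The names `encode` and `mismatches` and the 2-bit mapping are assumptions made for this example.

```python
# Hypothetical 2-bit encoding; not taken from ParDRe's source code.
CODES = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}


def encode(read: str) -> int:
    """Pack a DNA string into an integer, 2 bits per base."""
    value = 0
    for base in read:
        value = (value << 2) | CODES[base]
    return value


def mismatches(a: str, b: str) -> int:
    """Count differing bases between two equal-length reads using XOR."""
    if len(a) != len(b):
        raise ValueError("reads must have the same length")
    diff = encode(a) ^ encode(b)
    # A base differs if either of its two bits differs: fold the high bit
    # of each pair onto the low bit, then keep only the low bits.
    low_mask = int("01" * len(a), 2) if a else 0
    return bin((diff | (diff >> 1)) & low_mask).count("1")
```

Two reads would then count as near-duplicates when `mismatches` stays at or below a chosen threshold; the threshold itself is a tuning parameter that this sketch leaves open.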
GPU-accelerated exhaustive search for third-order epistatic interactions in case–control studies
This is a post-peer-review, pre-copyedit version of an article published in Journal of Computational Science. The final authenticated version is available online at: https://doi.org/10.1016/j.jocs.2015.04.001
[Abstract] Interest in discovering combinations of genetic markers from case–control studies, such as Genome Wide Association Studies (GWAS), that are strongly associated with diseases has increased in recent years. Detecting epistasis, i.e. interactions among k markers (k ≥ 2), is an important but time-consuming operation, since statistical computations have to be performed for each k-tuple of measured markers. Efficient exhaustive methods have been proposed for k = 2, but exhaustive third-order analyses are thought to be impractical because the number of triples to be computed grows cubically with the number of markers. Thus, most previous approaches apply heuristics to accelerate the analysis by discarding certain triples in advance. Unfortunately, these tools can fail to detect interesting interactions. We present GPU3SNP, a fast GPU-accelerated tool to exhaustively search for interactions among all marker-triples of a given case–control dataset. Our tool is able to analyze an input dataset with tens of thousands of markers in reasonable time thanks to two efficient CUDA kernels and efficient workload distribution techniques. For instance, a dataset consisting of 50,000 markers measured from 1000 individuals can be analyzed in less than 22 h on a single compute node with 4 NVIDIA GTX Titan boards. Source code is available at: http://sourceforge.net/projects/gpu3snp/
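To give a sense of the scale behind these figures, the short calculation below counts the marker triples in the 50,000-marker example and derives the throughput implied by the 22 h bound. It uses only the numbers quoted in the abstract; the variable names are illustrative.

```python
from math import comb

markers = 50_000   # dataset size quoted in the abstract
hours = 22         # upper bound on the runtime quoted in the abstract

triples = comb(markers, 3)        # number of distinct marker triples
rate = triples / (hours * 3600)   # triples per second needed to meet the bound

print(f"{triples:,} triples")     # 20,832,083,350,000 triples
print(f"{rate:.2e} triples/s")    # 2.63e+08 triples/s
```

Since the abstract reports "less than 22 h", the rate is a lower bound on the combined throughput of the four boards.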
BigDEC: A multi-algorithm Big Data tool based on the k-mer spectrum method for scalable short-read error correction
Open access publication funded by Universidade da Coruña/CISUG.
[Abstract]: Despite the significant improvements in both throughput and cost provided by modern Next-Generation Sequencing (NGS) platforms, sequencing errors in NGS datasets can still degrade the quality of downstream analysis. Although state-of-the-art correction tools can provide high accuracy to improve such analysis, they are limited to applying a single correction algorithm and require long runtimes when processing large NGS datasets. Furthermore, current parallel correctors generally only provide efficient support for shared-memory systems, lacking the ability to scale out across a cluster of multicore nodes, or they require the availability of specific hardware devices or features. In this paper we present a Big Data Error Correction (BigDEC) tool that overcomes all those limitations by: (1) implementing three different error correction algorithms based on the widely used k-mer spectrum method; (2) providing scalable performance for large datasets by efficiently exploiting the capabilities of Big Data technologies on multicore clusters based on commodity hardware; (3) supporting two different Big Data processing frameworks (Spark and Flink) to provide greater flexibility to end users; (4) including an efficient, stream-based merge operation to ease downstream processing of the corrected datasets; and (5) significantly outperforming existing parallel tools, being up to 79% faster on a 16-node multicore cluster when using the same underlying correction algorithm. BigDEC is publicly available to download at https://github.com/UDC-GAC/BigDEC.
This work was supported by grants PID2019-104184RB-I00 and PID2022-136435NB-I00, funded by the Ministry of Science and Innovation of Spain, MCIN/AEI/10.13039/501100011033 (PID2022 also funded by “ERDF A way of making Europe”, EU). It was also funded by Xunta de Galicia [Consolidation Program of Competitive Reference Groups, grant ED431C 2021/30].
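The k-mer spectrum method mentioned in the abstract rests on a simple observation: a k-mer (substring of length k) that occurs very rarely across the reads is more likely to contain a sequencing error than one seen many times. The sketch below shows only that counting step in plain Python. It is not BigDEC's code, it does not use Spark or Flink, and the function names and the `min_count` threshold are assumptions made for illustration.

```python
from collections import Counter


def kmer_spectrum(reads, k):
    """Count every k-mer (substring of length k) across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def weak_kmers(reads, k, min_count):
    """Return the k-mers seen fewer than min_count times (suspected errors)."""
    spectrum = kmer_spectrum(reads, k)
    return {kmer for kmer, n in spectrum.items() if n < min_count}


# Hypothetical toy input: three identical reads and one with a single changed base.
reads = ["ACGTAC", "ACGTAC", "ACGTAC", "ACGAAC"]
suspects = weak_kmers(reads, k=3, min_count=2)
```

A corrector would then try to replace bases so that the weak k-mers in `suspects` turn into frequent ones; how that substitution is chosen is where the individual correction algorithms differ.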
Accelerating binary biclustering on platforms with CUDA-enabled GPUs
© 2018 Elsevier B.V. All rights reserved. This manuscript version is made available under the CC-BY-NC-ND 4.0 license https://creativecommons.org/licenses/by-nc-nd/4.0/. This version of the article has been accepted for publication in Information Sciences. The Version of Record is available online at https://doi.org/10.1016/j.ins.2018.05.025
This is a version of: J. González-Domínguez and R. R. Expósito, "Accelerating binary biclustering on platforms with CUDA-enabled GPUs", Information Sciences, Vol. 496, Sept. 2019, pp. 317-325, https://doi.org/10.1016/j.ins.2018.05.025
[Abstract]: Data mining is nowadays essential in many scientific fields to extract valuable information from large input datasets and transform it into an understandable structure. For instance, biclustering techniques are very useful in identifying subsets of two-dimensional data where both rows and columns are correlated. However, some biclustering techniques become extremely time-consuming when processing very large datasets, which prevents their use in many areas of research and industry (such as bioinformatics) that have experienced explosive growth in the amount of available data. In this work we present CUBiBit, a tool that accelerates the search for relevant biclusters on binary data by exploiting the computational capabilities of CUDA-enabled GPUs as well as the several CPU cores available in most current systems. The experimental evaluation has shown that CUBiBit is up to 116 times faster than the fastest state-of-the-art tool, BiBit, in a system with two Intel Sandy Bridge processors (16 CPU cores) and three NVIDIA K20 GPUs. CUBiBit is publicly available to download from https://sourceforge.net/projects/cubibit
This work was supported by the Ministry of Economy, Industry and Competitiveness of Spain and FEDER funds of the European Union [grant TIN2016-75845-P (AEI/FEDER/UE)], as well as by Xunta de Galicia (Centro Singular de Investigacion de Galicia accreditation 2016-2019, ref. EDG431G/01).
ParBiBit: Parallel tool for binary biclustering on modern distributed-memory systems
[Abstract]: Biclustering techniques are gaining attention in the analysis of large-scale datasets as they identify two-dimensional submatrices where both rows and columns are correlated. In this work we present ParBiBit, a parallel tool to accelerate the search for interesting biclusters on binary datasets, which are very popular in different fields such as genetics, marketing or text mining. It is based on the state-of-the-art sequential Java tool BiBit, which has been shown to be accurate by several studies, especially in scenarios that result in many large biclusters. ParBiBit uses the same methodology as BiBit (grouping the binary information into patterns) and provides the same results. Nevertheless, our tool significantly improves performance thanks to an efficient implementation based on C++11 that includes support for threads and MPI processes in order to exploit the compute capabilities of modern distributed-memory systems, which provide several multicore CPU nodes interconnected through a network. Our performance evaluation with 18 representative input datasets on two different eight-node systems shows that our tool is significantly faster than the original BiBit. Source code in C++ and MPI running on Linux systems as well as a reference manual are available at https://sourceforge.net/projects/parbibit/.
This work was supported by the Ministry of Economy, Industry and Competitiveness of Spain and FEDER funds of the European Union [grant TIN2016-75845-P (AEI/FEDER/UE)], as well as by Xunta de Galicia (Centro Singular de Investigacion de Galicia accreditation 2016-2019, ref. EDG431G/01).
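The abstract only says that the binary information is grouped into patterns. One way such a pattern-based search can work is sketched below, under the assumption that a pattern is the bitwise AND of two rows and that a bicluster collects every row containing that pattern. This is a sequential illustration, not the ParBiBit or BiBit source code; the function name and the `min_rows` and `min_cols` thresholds are hypothetical.

```python
from itertools import combinations


def biclusters(rows, min_rows=2, min_cols=2):
    """Find biclusters in binary data.

    rows: list of ints, each int encoding one binary row (bit j = column j).
    Returns a list of (member_row_indices, pattern) tuples.
    """
    seen = set()
    found = []
    for i, j in combinations(range(len(rows)), 2):
        pattern = rows[i] & rows[j]   # columns where both rows hold a 1
        if pattern in seen or bin(pattern).count("1") < min_cols:
            continue
        seen.add(pattern)
        # Every row that contains all columns of the pattern joins the bicluster.
        members = [r for r, bits in enumerate(rows) if (bits & pattern) == pattern]
        if len(members) >= min_rows:
            found.append((members, pattern))
    return found


# Hypothetical 4x4 binary matrix, one integer per row.
result = biclusters([0b1101, 0b1100, 0b0111, 0b1110])
```

The outer loop over row pairs is independent per pair, which is what makes this kind of search a natural fit for threads, MPI processes or GPU kernels.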
Parallel and Scalable Short-Read Alignment on Multi-Core Clusters Using UPC++
[Abstract]: The growth of next-generation sequencing (NGS) datasets poses a challenge to the alignment of reads to reference genomes in terms of alignment quality and execution speed. Some available aligners have been shown to obtain high quality mappings at the expense of long execution times. Finding fast yet accurate software solutions is of high importance to research, since the availability and size of NGS datasets continue to increase. In this work we present an efficient parallelization approach for NGS short-read alignment on multi-core clusters. Our approach takes advantage of a distributed shared memory programming model based on the new UPC++ language. Experimental results using the CUSHAW3 aligner show that our implementation based on dynamic scheduling obtains good scalability on multi-core clusters. Through our evaluation, we are able to complete the single-end and paired-end alignments of 246 million reads of length 150 base-pairs in 11.54 and 16.64 minutes, respectively, using 32 nodes with four AMD Opteron 6272 16-core CPUs per node. In contrast, the multi-threaded original tool needs 2.77 and 5.54 hours to perform the same alignments on the 64 cores of one node. The source code of our parallel implementation is publicly available at the CUSHAW3 homepage (http://cushaw3.sourceforge.net).
UPCBLAS: a numerical library for unified parallel C with architecture-aware optimizations
[Abstract] The popularity of Partitioned Global Address Space (PGAS) languages has increased in recent years thanks to their high programmability and performance through an efficient exploitation of data locality, especially on hierarchical architectures like multicore clusters. This PhD Thesis describes UPCBLAS, a parallel library for numerical computation using the PGAS Unified Parallel C (UPC) language. The routines are built on top of sequential BLAS and SparseBLAS functions and exploit the particularities of the PGAS paradigm, taking into account data locality in order to achieve good performance. However, the growing complexity in computer system hierarchies due to the increase in the number of cores per processor, levels of cache (some of them shared) and the number of processors per node, as well as the high-speed interconnects, demands the use of new optimization techniques and libraries that take advantage of their features. For this reason, this Thesis also presents Servet, a suite of benchmarks focused on detecting a set of parameters with high influence on the overall performance of multicore systems. UPCBLAS routines use the hardware parameters provided by Servet to implement optimization techniques that improve their performance. The performance of the library has been experimentally evaluated on several multicore supercomputers and compared to message-passing-based parallel numerical libraries, demonstrating good scalability and efficiency. UPCBLAS has also been used to develop more complex numerical codes in order to demonstrate that it is a good alternative to MPI-based libraries for increasing the productivity of numerical application developers.
Matemáticas y autismo: Algunos métodos del proceso enseñanza-aprendizaje.
This Bachelor's thesis reviews the specialized literature on the teaching-learning processes of students with Special Educational Needs, with special emphasis on the methodology of teaching mathematics to Primary Education students with autism. Several methods that have proved useful are analyzed, highlighting the need for deeper analysis of autism spectrum disorders, for which, unfortunately, specialized research articles are far less numerous than for other kinds of disability.