20 research outputs found

    Computational and Experimental Evaluation of the Immune Response of Neoantigens for Personalized Vaccine Design

    In the last few years, the importance of neoantigens in the development of personalized antitumor vaccines has increased remarkably. To study whether bioinformatic tools are effective in detecting neoantigens that generate an immune response, DNA samples from patients with cutaneous melanoma at different stages were obtained, yielding a total of 6048 potential neoantigens. Thereafter, the immunological responses generated by some of those neoantigens were tested ex vivo, using a vaccine designed with a new optimization approach and encapsulated in nanoparticles. Our bioinformatic analysis indicated no differences between the number of neoantigens and the number of non-mutated sequences detected as potential binders by IEDB tools. However, those tools were able to highlight neoantigens over non-mutated peptides in HLA-II recognition (p-value 0.03), whereas neither HLA-I binding affinity (p-value 0.08) nor Class I immunogenicity values (p-value 0.96) showed significant differences. Subsequently, the new vaccine was designed using aggregative functions and combinatorial optimization. The six best neoantigens were selected and formulated into two nanoparticles, with which the immune response was evaluated ex vivo, demonstrating a specific activation of the immune response. This study reinforces the use of bioinformatic tools in vaccine development, as their usefulness is proven both in silico and ex vivo. This work was supported by Basque Government funding (IT456-22; IT1448-22, IT693-22 and IT1524-22; ONKOVAC 2021111042), as well as by the UPV/EHU (GIU20/035; US21/27; US18/21; PIF18/295) and the Basque Center for Applied Mathematics (US21/27 and US18/21).
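
    The abstract mentions ranking candidates with aggregative functions before selecting the six best neoantigens. The sketch below illustrates that general idea with a weighted-mean aggregate over hypothetical prediction scores; the field names, weights, and normalisation thresholds are assumptions, not values taken from the paper.

```python
# Hypothetical illustration: combine several predicted criteria into a single
# aggregate score and keep the k best candidate neoantigens.
from dataclasses import dataclass

@dataclass
class Peptide:
    sequence: str
    hla_i_affinity: float   # predicted IC50 in nM; lower = stronger binder
    hla_ii_rank: float      # predicted percentile rank; lower = better
    immunogenicity: float   # predicted score in [0, 1]; higher = better

def aggregate_score(p: Peptide, weights=(0.4, 0.3, 0.3)) -> float:
    # Normalise every criterion to [0, 1] so that higher is always better,
    # then combine with a weighted mean (one possible aggregative function).
    affinity_term = 1.0 - min(p.hla_i_affinity, 5000.0) / 5000.0
    rank_term = 1.0 - min(p.hla_ii_rank, 100.0) / 100.0
    w_aff, w_rank, w_imm = weights
    return w_aff * affinity_term + w_rank * rank_term + w_imm * p.immunogenicity

def select_best(peptides, k=6):
    # Keep the k candidates with the highest aggregate score.
    return sorted(peptides, key=aggregate_score, reverse=True)[:k]

# Toy usage with made-up peptide entries.
candidates = [
    Peptide("PEP1", 120.0, 2.5, 0.8),
    Peptide("PEP2", 45.0, 10.0, 0.6),
    Peptide("PEP3", 900.0, 0.7, 0.9),
]
print([p.sequence for p in select_best(candidates, k=2)])
```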

    Contributions to information extraction for Spanish written biomedical text

    285 p. Healthcare practice and clinical research produce vast amounts of digitised, unstructured data in multiple languages that are currently underexploited, despite their potential applications in improving healthcare experiences, supporting trainee education, or enabling biomedical research, for example. To automatically transform those contents into relevant, structured information, advanced Natural Language Processing (NLP) mechanisms are required. In NLP, this task is known as Information Extraction. Our work takes place within this growing field of clinical NLP for the Spanish language, as we tackle three distinct problems. First, we compare several supervised machine learning approaches to the problem of sensitive data detection and classification. Specifically, we study the different approaches and their transferability in two corpora, one synthetic and the other authentic. Second, we present and evaluate UMLSmapper, a knowledge-intensive system for biomedical term identification based on the UMLS Metathesaurus. This system recognises and codifies terms without relying on annotated data or external Named Entity Recognition tools. Although technically naive, it performs on par with more evolved systems and does not deviate considerably from approaches that rely on oracle terms. Finally, we present and exploit a new corpus of real health records manually annotated with negation and uncertainty information: NUBes. This corpus is the basis for two sets of experiments, one on cue and scope detection, and the other on assertion classification. Throughout the thesis, we apply and compare techniques of varying levels of sophistication and novelty, which reflects the rapid advancement of the field.
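
    UMLSmapper is described as a knowledge-intensive term identifier that needs no annotated data. A minimal sketch of that general approach is a greedy dictionary lookup over token n-grams, shown below; the toy lexicon and placeholder concept codes are invented for illustration and are neither real UMLS content nor the thesis system.

```python
# Toy sketch of knowledge-based term identification: greedy longest-match
# lookup of token n-grams against a concept dictionary. The lexicon entries
# and the codes are placeholders.
LEXICON = {
    "infarto de miocardio": "C0000001",
    "dolor": "C0000002",
}

def identify_terms(text: str, lexicon=LEXICON, max_len: int = 5):
    tokens = text.lower().split()
    spans, i = [], 0
    while i < len(tokens):
        # Try the longest candidate phrase first, then shrink.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in lexicon:
                spans.append((candidate, lexicon[candidate]))
                i += n
                break
        else:
            i += 1
    return spans

print(identify_terms("Paciente con dolor tras infarto de miocardio"))
# [('dolor', 'C0000002'), ('infarto de miocardio', 'C0000001')]
```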

    Supercomputing platform for bioinformatics

    In 2007 the Universidad de Málaga expanded and relocated its computing resources to a new centre dedicated exclusively to research: the Supercomputing and Bioinnovation building located in the Parque Tecnológico de Andalucía. This building would also host the Plataforma Andaluza de Bioinformática together with other units and laboratories equipped with highly specialised instrumentation. Since then I have worked as administrator of the centre's supercomputing resources and as part of the bioinformatics team, providing support to a large number of researchers in their daily tasks. Having a view of both sides, it was easy to detect gaps in bioinformatics that could be covered by an appropriate application of the available computing resources, and that is where the seed arose that led us to begin the first works that make up this study. Having been carried out in such a problem-solving-oriented environment, this thesis has an eminently practical character: each contribution is backed by a substantial theoretical study, but culminates in a concrete practical result that can be applied to everyday problems in bioinformatics or even in other research areas. Thus, with the aim of easing access to supercomputing resources for bioinformaticians, we created an automatic generator of web interfaces for command-line programs, which allows jobs to be run on supercomputing resources transparently for the user. We also provide a virtual desktop system that enables remote access to a set of pre-installed programs with visual interfaces for analysing small data sets or visualising the more complex results generated on supercomputing resources. To optimise the use of supercomputing resources we designed a new algorithm for the distributed execution of tasks, which can be used both in the design of new tools and to optimise the execution of existing programs. In addition, concerned by the increase in the amount of data produced by high-throughput sequencing techniques, we contribute a new sequence compression format that, besides reducing the storage space used, allows any stored sequence to be quickly searched for and extracted without decompressing the whole file. In the development of new algorithms to solve specific biological problems, we provide four new tools covering the search for divergent regions in alignments, the preprocessing and cleaning of reads obtained with high-throughput sequencing techniques, the analysis of transcriptomes of non-model species obtained through de novo assemblies, and a prototype for annotating incomplete genomic sequences. As a solution for the dissemination and long-term storage of results obtained in various research projects, a generic virtual machine system for transcriptomics databases has been developed and is already in use in several projects. Finally, to disseminate the results of our work, all the algorithms and tools produced in this thesis have been released as open source at https://github.com/dariogf
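
    The compressed sequence format is described as allowing any stored sequence to be extracted without decompressing the whole file. The sketch below shows the underlying idea, per-record compression plus a byte-offset index, assuming a trivial record layout rather than the actual format proposed in the thesis.

```python
# Minimal sketch of random-access compressed sequence storage: each record is
# compressed on its own and an index maps sequence IDs to byte offsets, so one
# record can be extracted without touching the rest. Illustration only.
import zlib

def build_archive(records: dict[str, str], path: str) -> dict[str, tuple[int, int]]:
    index = {}
    with open(path, "wb") as fh:
        for seq_id, seq in records.items():
            blob = zlib.compress(seq.encode())
            index[seq_id] = (fh.tell(), len(blob))  # where the record starts, and its size
            fh.write(blob)
    return index

def fetch(path: str, index: dict[str, tuple[int, int]], seq_id: str) -> str:
    offset, length = index[seq_id]
    with open(path, "rb") as fh:
        fh.seek(offset)                 # jump straight to the requested record
        return zlib.decompress(fh.read(length)).decode()

idx = build_archive({"seq1": "ACGT" * 10, "seq2": "TTGGCCAA" * 5}, "seqs.bin")
print(fetch("seqs.bin", idx, "seq2"))
```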

    A survey on automated detection and classification of acute leukemia and WBCs in microscopic blood cells

    Leukemia (blood cancer) is an abnormal proliferation of White Blood Cells, or Leukocytes (WBCs), in the bone marrow and blood. Pathologists can diagnose leukemia by examining a person's blood sample under a microscope, identifying and categorizing it by counting various blood cells and assessing morphological features. This technique is time-consuming, and its outcome also depends on the pathologist's professional skills and experience. In computer vision, traditional machine learning and deep learning techniques are practical approaches that increase the accuracy and speed of diagnosing and classifying medical images such as microscopic blood cell images. This paper provides a comprehensive analysis of the detection and classification of acute leukemia and WBCs in microscopic blood cell images. First, we divide previous works into six categories based on the output of the models. Then, we describe the various steps of detection and classification of acute leukemia and WBCs, including Data Augmentation, Preprocessing, Segmentation, Feature Extraction, Feature Selection (Reduction), and Classification, and focus on the classification step. Finally, we divide automated detection and classification of acute leukemia and WBCs into three categories based on the type of classifier used in the classification step, namely traditional, Deep Neural Network (DNN), and mixed (traditional and DNN) methods, and analyze them. The results of this study show that, in the diagnosis and classification of acute leukemia and WBCs, the Support Vector Machine (SVM) classifier in traditional machine learning models and the Convolutional Neural Network (CNN) classifier in deep learning models have been widely employed. Models that use these classifiers achieve higher performance metrics than the other models.
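
    Since the survey reports that SVMs are the most widely used classifiers in the traditional pipelines, a minimal sketch of that classification step is shown below, with synthetic stand-ins for the hand-crafted cell features; the feature names and data are placeholders, not measurements from any of the surveyed works.

```python
# Illustrative sketch of the "traditional" classification step: hand-crafted
# features from segmented cells fed to an SVM. The feature values are synthetic.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Pretend features: [cell area, nucleus/cytoplasm ratio, mean intensity]
X = rng.normal(size=(200, 3))
y = rng.integers(0, 2, size=200)          # 0 = healthy WBC, 1 = blast cell

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```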

    Breaking rules: taking Complex Ontology Alignment beyond rule-based approaches

    Master's thesis in Data Science, Universidade de Lisboa, Faculdade de Ciências, 2021. As ontologies are developed in an uncoordinated manner, differences in scope and design compromise interoperability. Ontology matching is critical to address this semantic heterogeneity problem, as it finds correspondences that enable integrating data across the Semantic Web. One of the biggest challenges in this field is that ontology schemas often differ conceptually, and therefore reconciling many real-world ontology pairs (e.g., in geography or biomedicine) involves establishing complex mappings that contain multiple entities from each ontology. Yet, for the most part, ontology matching algorithms are restricted to finding simple equivalence mappings between ontology entities. This work presents novel algorithms for Complex Ontology Alignment based on Association Rule Mining over a set of shared instances between two ontologies. Its strategy relies on a targeted search for known complex patterns in instance and schema data, reducing the search space. This allows the application of semantic-based filtering algorithms tailored to each kind of pattern, to select and refine the most relevant mappings. The algorithms were evaluated on OAEI Complex track datasets under two automated approaches: OAEI's entity-based approach and a novel element-overlap-based approach developed in the context of this work. The algorithms were able to find mappings spanning eight distinct complex patterns, as well as combinations of patterns through disjunction and conjunction. They efficiently reduced the search space and showed competitive performance compared to the state of the art of complex alignment systems. As for the comparative analysis of evaluation methodologies, the proposed element-overlap-based evaluation strategy was shown to be more accurate and interpretable than the reference-based automatic alternative, although none of the existing strategies fully address the challenges discussed in the literature. For future work, it would be interesting to extend the algorithms to cover more complex patterns and combine them with lexical approaches.
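
    The approach mines association rules over instances shared by the two ontologies. The toy sketch below shows the basic support/confidence computation for simple class-to-class rules, with invented instance data and thresholds; the actual system targets complex mapping patterns, which this sketch does not cover.

```python
# Toy sketch: mine simple association rules between classes of two ontologies
# from the set of instances they share. Data and thresholds are made up.
from collections import defaultdict

# instance -> classes it belongs to in ontology 1 / ontology 2 (toy data)
onto1 = {"i1": {"A"}, "i2": {"A"}, "i3": {"B"}, "i4": {"A", "B"}}
onto2 = {"i1": {"X"}, "i2": {"X"}, "i3": {"Y"}, "i4": {"X"}}

def mine_rules(o1, o2, min_support=2, min_confidence=0.7):
    shared = o1.keys() & o2.keys()
    pair_count = defaultdict(int)   # how often classes c1 and c2 co-occur
    class_count = defaultdict(int)  # how often c1 occurs at all
    for inst in shared:
        for c1 in o1[inst]:
            class_count[c1] += 1
            for c2 in o2[inst]:
                pair_count[(c1, c2)] += 1
    rules = []
    for (c1, c2), support in pair_count.items():
        confidence = support / class_count[c1]
        if support >= min_support and confidence >= min_confidence:
            rules.append((c1, c2, support, round(confidence, 2)))
    return rules

print(mine_rules(onto1, onto2))   # -> [('A', 'X', 3, 1.0)]
```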

    Conference on Grey Literature and Repositories


    Annual Report on Research and Transfer 2019

    2019 annual research report of the Hochschule Konstanz Technik, Wirtschaft und Gestaltung

    Acceleration of image processing algorithms for single particle analysis with electron microscopy

    Unpublished doctoral thesis jointly supervised by Masaryk University (Czech Republic) and the Universidad Autónoma de Madrid, Escuela Politécnica Superior, Departamento de Ingeniería Informática. Defence date: 24-10-2022. Cryogenic Electron Microscopy (Cryo-EM) is a vital field in current structural biology. Unlike X-ray crystallography and Nuclear Magnetic Resonance, it can be used to analyze membrane proteins and other samples with overlapping spectral peaks. However, one of the significant limitations of Cryo-EM is its computational complexity. Modern electron microscopes can produce terabytes of data per single session, from which hundreds of thousands of particles must be extracted and processed to obtain a near-atomic resolution of the original sample. Many existing software solutions use High-Performance Computing (HPC) techniques to bring these computations into the realm of practical usability. The common approach to acceleration is parallelization of the processing, but in practice we face many complications, such as problem decomposition, data distribution, load scheduling, balancing, and synchronization. The use of various accelerators further complicates the situation, as heterogeneous hardware brings additional caveats, for example limited portability, under-utilization due to synchronization, and sub-optimal code performance due to missing specialization. This dissertation, structured as a compendium of articles, aims to improve the algorithms used in Cryo-EM, especially in Single Particle Analysis (SPA). We focus on single-node performance optimizations, using techniques either available or developed in the HPC field, such as heterogeneous computing or autotuning, which may require the formulation of novel algorithms. The secondary goal of the dissertation is to identify the limitations of state-of-the-art HPC techniques. Since the Cryo-EM pipeline consists of multiple distinct steps targeting different types of data, there is no single bottleneck to be solved. As such, the presented articles take a holistic approach to performance optimization. First, we give details on the GPU acceleration of specific programs. The achieved speedup is due to the higher performance of the GPU, adjustments of the original algorithms to it, and the application of novel algorithms. More specifically, we provide implementation details of programs for movie alignment, 2D classification, and 3D reconstruction that have been sped up by an order of magnitude compared to their original multi-CPU implementations, or sufficiently to be used on the fly. In addition to these three programs, multiple other programs from the actively used, open-source software package XMIPP have been accelerated and improved. Second, we discuss our contribution to HPC in the form of autotuning. Autotuning is the ability of software to adapt to a changing environment, i.e., input or executing hardware. Towards that goal, we present cuFFTAdvisor, a tool that proposes and, through autotuning, finds the best configuration of the cuFFT library for given constraints of input size and plan settings. We also introduce a benchmark set of ten autotunable kernels for important computational problems implemented in OpenCL or CUDA, together with the introduction of complex dynamic autotuning to the KTT tool. Third, we propose an image processing framework, Umpalumpa, which combines a task-based runtime system, data-centric architecture, and dynamic autotuning. The proposed framework allows for writing complex workflows which automatically use available hardware resources and adjust to different hardware and data, but at the same time are easy to maintain. The project that gave rise to these results received the support of a fellowship from the “la Caixa” Foundation (ID 100010434). The fellowship code is LCF/BQ/DI18/11660021. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 71367.
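
    Autotuning here means benchmarking candidate configurations on the actual hardware and keeping the fastest one. The sketch below applies that idea to choosing a padded FFT length, with NumPy standing in for the GPU libraries; it is only an analogy for what tools such as cuFFTAdvisor and KTT do over much richer parameter spaces.

```python
# Generic sketch of the autotuning idea: time a small set of candidate
# configurations and keep the fastest. Candidates here are padded FFT sizes.
import time
import numpy as np

def benchmark(size: int, repeats: int = 5) -> float:
    # Average wall-clock time of an FFT of the given length.
    data = np.random.rand(size).astype(np.complex64)
    start = time.perf_counter()
    for _ in range(repeats):
        np.fft.fft(data)
    return (time.perf_counter() - start) / repeats

def autotune(min_size: int, candidates: list[int]) -> int:
    # Among paddings large enough to hold the signal, pick the fastest one.
    valid = [s for s in candidates if s >= min_size]
    return min(valid, key=benchmark)

# Choose the fastest padded length that can hold 4000 samples.
print(autotune(4000, candidates=[4000, 4096, 4200, 5000]))
```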

    Analysis and resolution of the problems associated with the design of IoT systems

    When designing an IoT system, regardless of whether one starts from an existing system that works offline or builds a system from scratch, the following challenges arise. First, IoT systems can be made up of a wide variety of devices, each using different communication protocols and physical media to establish communication. Moreover, the devices may be located in very distant geographical locations, governed by different legal systems, and with very different cost structures for the connectivity between them. Furthermore, the choice of hardware for each device can vary depending on the risks associated with the activity in which it is involved; the costs of acquisition, installation, and maintenance in the geographical region where it is deployed; the communication protocols to be used; the desired level of quality in each device's performance; and other technical or commercial factors. The choice of software technologies for each device may depend on factors similar to those mentioned for hardware selection. Besides studying the particular needs of each device, the overall architecture of the IoT system must be analysed. This architecture must consider the different ways of connecting devices to each other; device hierarchies; the web servers involved; the service providers to be contracted; the means of storing, processing, and publishing the information; the people involved; and the other internal or external components that interact in the system. All the considerations mentioned above must be made within a framework that guarantees the privacy and security of the information handled. For this reason, some geographical regions have established legislation on the subject, which must be taken into account from the beginning of the design of the IoT system. However, if the rules established in the legislation are not sufficiently clear or complete (or even non-existent), international standards on data privacy and security, in hardware and software, can be taken as a foundation. This article presents a line of research addressing the Analysis and Resolution of the Problems Associated with the Design of IoT Systems. Red de Universidades con Carreras en Informática