6,652 research outputs found

    Application of new probabilistic graphical models in the genetic regulatory networks studies

    Get PDF
    This paper introduces two new probabilistic graphical models for reconstruction of genetic regulatory networks using DNA microarray data. One is an Independence Graph (IG) model with either a forward or a backward search algorithm and the other one is a Gaussian Network (GN) model with a novel greedy search method. The performances of both models were evaluated on four MAPK pathways in yeast and three simulated data sets. Generally, an IG model provides a sparse graph but a GN model produces a dense graph where more information about gene-gene interactions is preserved. Additionally, we found two key limitations in the prediction of genetic regulatory networks using DNA microarray data, the first is the sufficiency of sample size and the second is the complexity of network structures may not be captured without additional data at the protein level. Those limitations are present in all prediction methods which used only DNA microarray data.Comment: 38 pages, 3 figure

    Towards Improved Homomorphic Encryption for Privacy-Preserving Deep Learning

    Get PDF
    Mención Internacional en el título de doctorDeep Learning (DL) has supposed a remarkable transformation for many fields, heralded by some as a new technological revolution. The advent of large scale models has increased the demands for data and computing platforms, for which cloud computing has become the go-to solution. However, the permeability of DL and cloud computing are reduced in privacy-enforcing areas that deal with sensitive data. These areas imperatively call for privacy-enhancing technologies that enable responsible, ethical, and privacy-compliant use of data in potentially hostile environments. To this end, the cryptography community has addressed these concerns with what is known as Privacy-Preserving Computation Techniques (PPCTs), a set of tools that enable privacy-enhancing protocols where cleartext access to information is no longer tenable. Of these techniques, Homomorphic Encryption (HE) stands out for its ability to perform operations over encrypted data without compromising data confidentiality or privacy. However, despite its promise, HE is still a relatively nascent solution with efficiency and usability limitations. Improving the efficiency of HE has been a longstanding challenge in the field of cryptography, and with improvements, the complexity of the techniques has increased, especially for non-experts. In this thesis, we address the problem of the complexity of HE when applied to DL. We begin by systematizing existing knowledge in the field through an in-depth analysis of state-of-the-art for privacy-preserving deep learning, identifying key trends, research gaps, and issues associated with current approaches. One such identified gap lies in the necessity for using vectorized algorithms with Packed Homomorphic Encryption (PaHE), a state-of-the-art technique to reduce the overhead of HE in complex areas. This thesis comprehensively analyzes existing algorithms and proposes new ones for using DL with PaHE, presenting a formal analysis and usage guidelines for their implementation. Parameter selection of HE schemes is another recurring challenge in the literature, given that it plays a critical role in determining not only the security of the instantiation but also the precision, performance, and degree of security of the scheme. To address this challenge, this thesis proposes a novel system combining fuzzy logic with linear programming tasks to produce secure parametrizations based on high-level user input arguments without requiring low-level knowledge of the underlying primitives. Finally, this thesis describes HEFactory, a symbolic execution compiler designed to streamline the process of producing HE code and integrating it with Python. HEFactory implements the previous proposals presented in this thesis in an easy-to-use tool. It provides a unique architecture that layers the challenges associated with HE and produces simplified operations interpretable by low-level HE libraries. HEFactory significantly reduces the overall complexity to code DL applications using HE, resulting in an 80% length reduction from expert-written code while maintaining equivalent accuracy and efficiency.El aprendizaje profundo ha supuesto una notable transformación para muchos campos que algunos han calificado como una nueva revolución tecnológica. La aparición de modelos masivos ha aumentado la demanda de datos y plataformas informáticas, para lo cual, la computación en la nube se ha convertido en la solución a la que recurrir. Sin embargo, la permeabilidad del aprendizaje profundo y la computación en la nube se reduce en los ámbitos de la privacidad que manejan con datos sensibles. Estas áreas exigen imperativamente el uso de tecnologías de mejora de la privacidad que permitan un uso responsable, ético y respetuoso con la privacidad de los datos en entornos potencialmente hostiles. Con este fin, la comunidad criptográfica ha abordado estas preocupaciones con las denominadas técnicas de la preservación de la privacidad en el cómputo, un conjunto de herramientas que permiten protocolos de mejora de la privacidad donde el acceso a la información en texto claro ya no es sostenible. Entre estas técnicas, el cifrado homomórfico destaca por su capacidad para realizar operaciones sobre datos cifrados sin comprometer la confidencialidad o privacidad de la información. Sin embargo, a pesar de lo prometedor de esta técnica, sigue siendo una solución relativamente incipiente con limitaciones de eficiencia y usabilidad. La mejora de la eficiencia del cifrado homomórfico en la criptografía ha sido todo un reto, y, con las mejoras, la complejidad de las técnicas ha aumentado, especialmente para los usuarios no expertos. En esta tesis, abordamos el problema de la complejidad del cifrado homomórfico cuando se aplica al aprendizaje profundo. Comenzamos sistematizando el conocimiento existente en el campo a través de un análisis exhaustivo del estado del arte para el aprendizaje profundo que preserva la privacidad, identificando las tendencias clave, las lagunas de investigación y los problemas asociados con los enfoques actuales. Una de las lagunas identificadas radica en el uso de algoritmos vectorizados con cifrado homomórfico empaquetado, que es una técnica del estado del arte que reduce el coste del cifrado homomórfico en áreas complejas. Esta tesis analiza exhaustivamente los algoritmos existentes y propone nuevos algoritmos para el uso de aprendizaje profundo utilizando cifrado homomórfico empaquetado, presentando un análisis formal y unas pautas de uso para su implementación. La selección de parámetros de los esquemas del cifrado homomórfico es otro reto recurrente en la literatura, dado que juega un papel crítico a la hora de determinar no sólo la seguridad de la instanciación, sino también la precisión, el rendimiento y el grado de seguridad del esquema. Para abordar este reto, esta tesis propone un sistema innovador que combina la lógica difusa con tareas de programación lineal para producir parametrizaciones seguras basadas en argumentos de entrada de alto nivel sin requerir conocimientos de bajo nivel de las primitivas subyacentes. Por último, esta tesis propone HEFactory, un compilador de ejecución simbólica diseñado para agilizar el proceso de producción de código de cifrado homomórfico e integrarlo con Python. HEFactory es la culminación de las propuestas presentadas en esta tesis, proporcionando una arquitectura única que estratifica los retos asociados con el cifrado homomórfico, produciendo operaciones simplificadas que pueden ser interpretadas por bibliotecas de bajo nivel. Este enfoque permite a HEFactory reducir significativamente la longitud total del código, lo que supone una reducción del 80% en la complejidad de programación de aplicaciones de aprendizaje profundo que usan cifrado homomórfico en comparación con el código escrito por expertos, manteniendo una precisión equivalente.Programa de Doctorado en Ciencia y Tecnología Informática por la Universidad Carlos III de MadridPresidenta: María Isabel González Vasco.- Secretario: David Arroyo Guardeño.- Vocal: Antonis Michala

    A Contextual Approach To Learning Collaborative Behavior Via Observation

    Get PDF
    This dissertation describes a novel technique to creating a simulated team of agents through observation. Simulated human teamwork can be used for a number of purposes, such as expert examples, automated teammates for training purposes and realistic opponents in games and training simulation. Current teamwork simulations require the team member behaviors be programmed into the simulation, often requiring a great deal of time and effort. None are able to observe a team at work and replicate the teamwork behaviors. Machine learning techniques for learning by observation and learning by demonstration have proven successful at observing behavior of humans or other software agents and creating a behavior function for a single agent. The research described here combines current research in teamwork simulations and learning by observation to effectively train a multi-agent system in effective team behavior. The dissertation describes the background and work by others as well as a detailed description of the learning method. A prototype built to evaluate the developed approach as well as the extensive experimentation conducted is also described

    An Algorithmic Framework for Labeling Network Maps

    Full text link
    Drawing network maps automatically comprises two challenging steps, namely laying out the map and placing non-overlapping labels. In this paper we tackle the problem of labeling an already existing network map considering the application of metro maps. We present a flexible and versatile labeling model. Despite its simplicity, we prove that it is NP-complete to label a single line of the network. For a restricted variant of that model, we then introduce an efficient algorithm that optimally labels a single line with respect to a given weighting function. Based on that algorithm, we present a general and sophisticated workflow for multiple metro lines, which is experimentally evaluated on real-world metro maps.Comment: Full version of COCOON 2015 pape

    Machine Learning in Automated Text Categorization

    Full text link
    The automated categorization (or classification) of texts into predefined categories has witnessed a booming interest in the last ten years, due to the increased availability of documents in digital form and the ensuing need to organize them. In the research community the dominant approach to this problem is based on machine learning techniques: a general inductive process automatically builds a classifier by learning, from a set of preclassified documents, the characteristics of the categories. The advantages of this approach over the knowledge engineering approach (consisting in the manual definition of a classifier by domain experts) are a very good effectiveness, considerable savings in terms of expert manpower, and straightforward portability to different domains. This survey discusses the main approaches to text categorization that fall within the machine learning paradigm. We will discuss in detail issues pertaining to three different problems, namely document representation, classifier construction, and classifier evaluation.Comment: Accepted for publication on ACM Computing Survey

    Graphics Processing Unit–Enhanced Genetic Algorithms for Solving the Temporal Dynamics of Gene Regulatory Networks

    Get PDF
    Understanding the regulation of gene expression is one of the key problems in current biology. A promising method for that purpose is the determination of the temporal dynamics between known initial and ending network states, by using simple acting rules. The huge amount of rule combinations and the nonlinear inherent nature of the problem make genetic algorithms an excellent candidate for finding optimal solutions. As this is a computationally intensive problem that needs long runtimes in conventional architectures for realistic network sizes, it is fundamental to accelerate this task. In this article, we study how to develop efficient parallel implementations of this method for the fine-grained parallel architecture of graphics processing units (GPUs) using the compute unified device architecture (CUDA) platform. An exhaustive and methodical study of various parallel genetic algorithm schemes—master-slave, island, cellular, and hybrid models, and various individual selection methods (roulette, elitist)—is carried out for this problem. Several procedures that optimize the use of the GPU’s resources are presented. We conclude that the implementation that produces better results (both from the performance and the genetic algorithm fitness perspectives) is simulating a few thousands of individuals grouped in a few islands using elitist selection. This model comprises 2 mighty factors for discovering the best solutions: finding good individuals in a short number of generations, and introducing genetic diversity via a relatively frequent and numerous migration. As a result, we have even found the optimal solution for the analyzed gene regulatory network (GRN). In addition, a comparative study of the performance obtained by the different parallel implementations on GPU versus a sequential application on CPU is carried out. In our tests, a multifold speedup was obtained for our optimized parallel implementation of the method on medium class GPU over an equivalent sequential single-core implementation running on a recent Intel i7 CPU. This work can provide useful guidance to researchers in biology, medicine, or bioinformatics in how to take advantage of the parallelization on massively parallel devices and GPUs to apply novel metaheuristic algorithms powered by nature for real-world applications (like the method to solve the temporal dynamics of GRNs)
    • …
    corecore