
    Representing and extracting knowledge from single cell data

    Single-cell analysis is currently among the highest-resolution techniques for studying biology. The large, complex datasets it generates have spurred numerous developments in computational biology, in particular the use of advanced statistics and machine learning. This review explains the deeper theoretical concepts that underpin current state-of-the-art analysis methods. Single-cell analysis is covered from the cell, through the instruments, to current and upcoming models. A minimum of mathematics and statistics is used, but the reader is assumed either to have basic knowledge of single-cell analysis workflows or to have a solid grounding in statistics. The aim of this review is to spread concepts that are not yet in common use, especially from topology and generative processes, and to show how new statistical models can be developed to capture more of biology. This raises epistemological questions regarding our ontology and models, and some pointers are given to how natural language processing (NLP) may help overcome our cognitive limitations in understanding single-cell data.
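The "generative processes" the review emphasises can be made concrete with a toy count model. The sketch below samples single-cell-like counts from a negative-binomial (Gamma-Poisson) distribution, a common modelling choice for such data; every parameter value here is an illustrative assumption, not taken from the review.

```python
import numpy as np

# Minimal generative sketch: each gene g has a mean expression mu_g and a
# dispersion theta; observed counts are negative-binomially distributed,
# sampled here via the Gamma-Poisson mixture representation.
rng = np.random.default_rng(0)

n_cells, n_genes = 100, 5
mu = rng.lognormal(mean=1.0, sigma=1.0, size=n_genes)   # illustrative gene means
theta = 2.0                                             # illustrative shared dispersion

# rate ~ Gamma(theta, mu/theta), count ~ Poisson(rate)  =>  count ~ NB(mu, theta)
rates = rng.gamma(shape=theta, scale=mu / theta, size=(n_cells, n_genes))
counts = rng.poisson(rates)

print(counts.shape)   # (100, 5)
```

Fitting such a model to real data (rather than sampling from it) is where the statistical machinery discussed in the review comes in.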

    Study of the tracking performance of a liquid Argon detector based on a novel optical imaging concept

    The Deep Underground Neutrino Experiment (DUNE) is a long-baseline accelerator experiment designed to make a significant contribution to the study of neutrino oscillations with unprecedented sensitivity. The main goal of DUNE is the determination of the neutrino mass ordering and the leptonic CP violation phase, key parameters of the three-neutrino flavor mixing that have yet to be determined. An important component of the DUNE Near Detector complex is the System for on-Axis Neutrino Detection (SAND) apparatus, which will include GRAIN (GRanular Argon for Interactions of Neutrinos), a novel liquid Argon detector aimed at imaging neutrino interactions using only scintillation light. For this purpose, an innovative optical readout system based on Coded Aperture Masks is investigated. This dissertation aims to demonstrate the feasibility of reconstructing particle tracks and the topology of CCQE (Charged Current Quasi Elastic) neutrino events in GRAIN with such a technique. To this end, a reconstruction algorithm based on Maximum Likelihood Expectation Maximization was developed and implemented to directly obtain a three-dimensional distribution proportional to the energy deposited by charged particles crossing the LAr volume. This study includes the evaluation of several camera configuration designs and the simulation of a multi-camera optical system in GRAIN.
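As a rough illustration of the reconstruction approach described above, here is a minimal Maximum Likelihood Expectation Maximization (MLEM) sketch on a synthetic linear system. The toy system matrix, dimensions, and noiseless data are stand-ins for GRAIN's real 3D voxel grid and camera response model, which are not reproduced here.

```python
import numpy as np

# MLEM iteratively refines a non-negative image x so that the forward
# projection A @ x matches the measured data y, via the multiplicative
# update  x <- x * (A^T (y / (A x))) / (A^T 1).
rng = np.random.default_rng(1)

n_pixels, n_voxels = 20, 10
A = rng.random((n_pixels, n_voxels))   # toy system (response) matrix
x_true = rng.random(n_voxels)          # toy "energy deposit" image
y = A @ x_true                         # noiseless toy measurements

x = np.ones(n_voxels)                  # uniform non-negative initial estimate
res0 = np.linalg.norm(A @ x - y)       # initial data mismatch
sens = A.sum(axis=0)                   # sensitivity term A^T 1
for _ in range(200):
    forward = A @ x                    # expected measurements
    x *= (A.T @ (y / forward)) / sens  # multiplicative MLEM update

res = np.linalg.norm(A @ x - y)
print(res < res0)                      # mismatch shrinks as iterations proceed
```

The multiplicative form keeps the estimate non-negative at every iteration, which matches the physical constraint that deposited energy cannot be negative.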

    Tools for efficient Deep Learning

    In the era of Deep Learning (DL), there is a fast-growing demand for building and deploying Deep Neural Networks (DNNs) on various platforms. This thesis proposes five tools to address the challenges of designing DNNs that are efficient in time, resources and power consumption. We first present Aegis and SPGC to address the challenges of improving the memory efficiency of DL training and inference. Aegis makes mixed precision training (MPT) more stable through layer-wise gradient scaling; empirical experiments show that Aegis can improve MPT accuracy by up to 4%. SPGC focuses on structured pruning: replacing standard convolution with group convolution (GConv) to avoid irregular sparsity. SPGC formulates GConv pruning as a channel permutation problem and proposes a novel heuristic polynomial-time algorithm. Common DNNs pruned by SPGC achieve up to 1% higher accuracy than prior work. This thesis also addresses the gap between DNN descriptions and executables, with Polygeist for software and POLSCA for hardware. Novel techniques, e.g. statement splitting and memory partitioning, are explored and used to extend polyhedral optimisation. Polygeist speeds up sequential and parallel software execution by 2.53x and 9.47x respectively on Polybench/C. POLSCA achieves a 1.5x speedup over hardware designs generated directly from high-level synthesis on Polybench/C. Moreover, this thesis presents Deacon, a framework that generates FPGA-based DNN accelerators with streaming architectures and advanced pipelining techniques to address the challenges of heterogeneous convolution and residual connections. Deacon provides fine-grained pipelining, graph-level optimisation, and heuristic exploration by graph colouring. Compared with prior designs, Deacon improves resource/power consumption efficiency by 1.2x/3.5x for MobileNets and 1.0x/2.8x for SqueezeNets. All these tools are open source, and some have already gained public engagement. We believe they can make efficient deep learning applications easier to build and deploy.
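To illustrate the mixed precision stabilisation problem that Aegis addresses, the sketch below shows plain *dynamic loss scaling*, the standard baseline technique (not Aegis's layer-wise scheme, which the thesis refines beyond this): scale gradients up before the fp16 backward pass, detect overflow, and unscale in fp32 before the weight update. All gradient values and thresholds are synthetic.

```python
import numpy as np

scale = 2.0 ** 8   # illustrative starting loss scale

def backward_fp16(grad_fp32, scale):
    """Simulate an fp16 backward pass: the scaled gradient is cast to half."""
    return (grad_fp32 * scale).astype(np.float16)

for g in [1e-6, 1e-6, 1e3]:                  # toy per-step gradient magnitudes
    grad = backward_fp16(np.array([g]), scale)
    if np.isinf(grad).any():                 # overflow: fp16 max is ~65504
        scale /= 2.0                         # back off the scale
        continue                             # skip this weight update
    update = grad.astype(np.float32) / scale # unscale in fp32 before applying
    # (apply `update` to the weights here)

print(scale)   # 128.0: the scale halved once after the overflowing step
```

The small gradients (1e-6) survive the fp16 cast only because of the scaling; the large one (1e3) overflows, so its step is skipped and the scale is reduced. A single global scale is a blunt instrument when layers have very different gradient ranges, which is the gap layer-wise scaling targets.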

    Specificity of the innate immune responses to different classes of non-tuberculous mycobacteria

    Mycobacterium avium is the most common nontuberculous mycobacterium (NTM) species causing infectious disease. Here, we characterized an M. avium infection model in zebrafish larvae and compared it to M. marinum infection, a model of tuberculosis. M. avium bacteria are efficiently phagocytosed and frequently induce granuloma-like structures in zebrafish larvae. Although macrophages can respond to both mycobacterial infections, their migration speed is faster in infections caused by M. marinum. Tlr2 plays a conserved role in most aspects of the defense against both mycobacterial infections. However, Tlr2 affects the migration speed of macrophages and neutrophils to M. marinum infection sites, an effect not observed with M. avium. Using RNA-seq analysis, we found distinct transcriptome responses in cytokine-cytokine receptor interaction between M. avium and M. marinum infection. In addition, we found differences in gene expression in metabolic pathways, phagosome formation, matrix remodeling, and apoptosis in response to these mycobacterial infections. In conclusion, we characterized a new M. avium infection model in zebrafish that can be further used to study the pathological mechanisms of NTM-caused diseases.

    Valorisation of unconventional lignocellulosic biomass into bioenergy and bioproducts using ionic-liquid based technologies

    To transition energy use away from fossil fuels, the transformation of renewable lignocellulosic biomass must be improved and its supply of sustainable bioenergy into a wider bioeconomy increased. Achieving this will require: (1) renewable, cheap and high-quality lignocellulosic feedstocks, (2) a robust process to transform these feedstocks into bioenergy and (3) a system to recover and use this bioenergy as well as value-added products. In this PhD, multiple examples were used to explore unconventional feedstocks, novel transformation processes and opportunities for value-added products. Feedstocks ranged from invasive species threatening the UK environment, Rhododendron and Japanese Knotweed (Chapter 3), through metal-contaminated waste wood (Chapter 4) and metal-enriched biomass grown on marginal land (Chapter 5), to wastewater-irrigated willow, a leading dedicated bioenergy crop (Chapter 6). While conventional bioenergy systems often burn wood pellets for energy co-generation, ionic liquid-based technologies are explored here as an innovative transformation process flexible enough to fractionate unconventional biomass feedstocks and deliver high yields of sustainable bioenergy and bioproducts. This flexibility stems from the unique and tuneable properties of protic ionic liquids. Here, N,N-dimethylbutylammonium hydrogen sulphate ([DMBA][HSO4]), a cheap hydrogen sulphate ([HSO4]-) based ionic liquid, and 1-methylimidazolium chloride ([C1Him][Cl]) were used in the ionic liquid-based ionoSolv process. Key efficiency parameters, such as temperature, reaction time, biomass-to-solvent loading and solvent recycling, were explored. The process was also challenged with diverse metal contamination to determine its potential to extract the metals while producing a fermentable pulp and lignin in parallel. Bioenergy recovery from the ionoSolv process was explored, as well as the potential to recover multiple value-added products. In addition to determining the heating values of isolated lignin and the hydrolysis and fermentation yields of cellulose-rich pulps into bioethanol, the interactions of contaminating metals and their impact on yeast fermentation yields were investigated. This investigation highlights the benefits of [C1Him][Cl] ionic liquid pretreatment for the production of clean bioenergy and bioproducts from highly contaminated feedstocks. As an important property of ionic liquids is that they can act as media for electrochemical reactions, electrodeposition of metals from the ionic liquid liquor, metal extraction efficiencies and any detrimental interactions with ionic liquid recycling were assessed. To further diversify system outputs beyond bioenergy alone, the production of bio-oils, char and gases from pyrolysis of the post-hydrolysis residue was determined, as well as the possibility of recovering phytochemicals as complementary value-added products. This research highlights that unconventional feedstocks have the potential to support a developing bioeconomy, and that the reallocation of waste and the reclamation of contaminated soils and waters could act as financial, social and environmental levers to improve the sustainability of bioenergy production. The studies also showcase how a versatile, engineered ionic liquid pretreatment can transform environmental burdens into resources compatible with the diversification of multiple product streams. Taken together, these findings can serve as a proof of concept for the integrated scale-up of a sustainable ionic liquid-based biorefinery.

    Digital agriculture: research, development and innovation in production chains.

    Digital transformation in the field towards sustainable and smart agriculture. Digital agriculture: definitions and technologies. Agroenvironmental modeling and the digital transformation of agriculture. Geotechnologies in digital agriculture. Scientific computing in agriculture. Computer vision applied to agriculture. Technologies developed in precision agriculture. Information engineering: contributions to digital agriculture. DIPN: a dictionary of the internal protein nanoenvironments and their potential for transformation into agricultural assets. Applications of bioinformatics in agriculture. Genomics applied to climate change: biotechnology for digital agriculture. Innovation ecosystem in agriculture: Embrapa's evolution and contributions. The law related to the digitization of agriculture. Innovating communication in the age of digital agriculture. Driving forces for Brazilian agriculture in the next decade: implications for digital agriculture. Challenges, trends and opportunities in digital agriculture in Brazil. Translated by Beverly Victoria Young and Karl Stephan Mokross.

    New insights along the gut-liver axis in cardiometabolic disease

    In this thesis we targeted the human gut microbiome for the development of therapeutic strategies in metabolic disorders. In chapter 3 we performed a randomized placebo-controlled cross-over study in individuals with the metabolic syndrome, in which we showed that a single duodenal infusion of A. soehngenii improved peripheral glycemic control. In chapter 4 we studied the effect of a 2-week oral A. soehngenii treatment on glycemic control in individuals with type 2 diabetes (T2D) treated with metformin. The second part of the thesis focused on MASLD, currently the most common cause of chronic liver dysfunction worldwide. In chapter 5 we reviewed the gut microbial and gut microbial-derived metabolite signatures associated with the development and progression of MASLD. To dissect the causal role of the intestinal microbiota in MASLD, in chapter 6 we performed a single-center, double-blind, randomized controlled proof-of-principle pilot study comparing the effect of three 8-weekly lean vegan donor fecal microbiota transplantations (FMT) versus autologous FMT on the severity of MASLD, using liver biopsies in individuals with hepatic steatosis on ultrasound. Moreover, we aimed to identify and validate noninvasive diagnostic methods for disease progression in MASLD. Hence, in chapter 7 we examined the diagnostic performance of multiparametric MRI for the assessment of disease severity along the MASLD disease spectrum, with comparison to histological scores.

    Towards Authentication of IoMT Devices via RF Signal Classification

    The increasing reliance on the Internet of Medical Things (IoMT) raises great concern in terms of cybersecurity, whether at the device's physical level or at the communication and transmission level. This is particularly important as these systems process very sensitive and private data, including personal health data from multiple patients, such as real-time body measurements. Due to these concerns, cybersecurity mechanisms and strategies must be in place to protect these medical systems, defending them from compromising cyberattacks. Authentication is an essential cybersecurity technique for trustworthy IoMT communications. However, current authentication methods rely on upper-layer identity verification or key-based cryptography, which can be inadequate for heterogeneous Internet of Things (IoT) environments. This thesis proposes the development of a Machine Learning (ML) method that serves as a foundation for Radio Frequency Fingerprinting (RFF) in the authentication of IoMT devices in medical applications, improving the flexibility of such mechanisms. This technique allows medical devices to be authenticated by their physical-layer characteristics, i.e. by their emitted signal. The ML models developed serve as the foundation for RFF, evaluating and categorising the emitted signal to enable RFF authentication. Multiple features take part in the proposed decision-making process for classifying the device, which is then implemented in a medical gateway, resulting in a novel IoMT technology.
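The RFF idea described above can be sketched minimally: each device's physical-layer imperfections leave a measurable imprint on its emitted signal, and a classifier decides which enrolled device produced a new emission. The devices, distortions, features and nearest-centroid classifier below are all illustrative stand-ins for the thesis's real IoMT captures and ML models.

```python
import numpy as np

rng = np.random.default_rng(42)

def emit(device_gain, device_dc, n=256):
    """Toy transmitter: a unit tone with a device-specific gain and DC offset plus noise."""
    t = np.arange(n)
    clean = np.sin(2 * np.pi * t / 16)
    return device_gain * clean + device_dc + 0.01 * rng.standard_normal(n)

def features(sig):
    """Simple physical-layer features: RMS power and mean (DC) level."""
    return np.array([np.sqrt(np.mean(sig ** 2)), np.mean(sig)])

# Hypothetical devices, each with its own hardware "fingerprint" (gain, DC offset).
devices = {"dev_a": (1.00, 0.00), "dev_b": (1.10, 0.05)}

# Enrollment: average features over several captures per device.
centroids = {d: np.mean([features(emit(*p)) for _ in range(20)], axis=0)
             for d, p in devices.items()}

# Authentication: classify a fresh emission by nearest centroid.
probe = features(emit(*devices["dev_b"]))
pred = min(centroids, key=lambda d: np.linalg.norm(centroids[d] - probe))
print(pred)   # dev_b
```

Real RFF systems use far richer features (spectral, transient, I/Q impairments) and stronger classifiers, but the enroll-then-match structure is the same.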

    Database System Acceleration on FPGAs

    Relational database systems provide various services and applications with an efficient means for storing, processing, and retrieving their data. The performance of these systems has a direct impact on the quality of service of the applications that rely on them. Therefore, it is crucial that database systems are able to adapt and grow in tandem with the demands of these applications, ensuring that their performance scales accordingly. In the past, Moore's law and algorithmic advancements have been sufficient to meet these demands. However, with the slowdown of Moore's law, researchers have begun exploring alternative methods, such as application-specific technologies, to satisfy the more challenging performance requirements. One such technology is field-programmable gate arrays (FPGAs), which provide ideal platforms for developing and running custom architectures for accelerating database systems. The goal of this thesis is to develop a domain-specific architecture that can enhance the performance of in-memory database systems when executing analytical queries. Our research is guided by a combination of academic and industrial requirements that seek to strike a balance between generality and performance. The former ensures that our platform can be used to process a diverse range of workloads, while the latter makes it an attractive solution for high-performance use cases. Throughout this thesis, we present the development of a system-on-chip for database system acceleration that meets our requirements. The resulting architecture, called CbMSMK, is capable of processing the projection, sort, aggregation, and equi-join database operators and can also run some complex TPC-H queries. CbMSMK employs a shared sort-merge pipeline for executing all these operators, which results in an efficient use of FPGA resources. This approach enables the instantiation of multiple acceleration cores on the FPGA, allowing it to serve multiple clients simultaneously. 
    CbMSMK can process both arbitrarily deep and arbitrarily wide tables efficiently. The former is achieved through the use of the sort-merge algorithm, which uses FPGA RAM to buffer intermediate sort results. The latter is achieved through KeRRaS, a novel variant of the forward radix sort algorithm introduced in this thesis. KeRRaS allows CbMSMK to process a table a few columns at a time, incrementally generating the final result over multiple iterations. Given that acceleration is a key objective of our work, CbMSMK benefits from many performance optimizations. For instance, multi-way merging is employed to reduce the number of merge passes required by the sort-merge algorithm, improving the performance of all our pipeline-breaking operators. Another example is our in-depth analysis of early aggregation, which led to the development of a novel cache-based algorithm that significantly enhances aggregation performance. Our experiments demonstrate that CbMSMK performs on average 5 times faster than the state-of-the-art CPU-based database management system MonetDB.

    I Database Systems & FPGAs
    1 INTRODUCTION 1.1 Databases & the Importance of Performance 1.2 Accelerators & FPGAs 1.3 Requirements 1.4 Outline & Summary of Contributions
    2 BACKGROUND ON DATABASE SYSTEMS 2.1 Databases 2.1.1 Storage Model 2.1.2 Storage Medium 2.2 Database Operators 2.2.1 Projection 2.2.2 Filter 2.2.3 Sort 2.2.4 Aggregation 2.2.5 Join 2.2.6 Operator Classification 2.3 Database Queries 2.4 Impact of Acceleration
    3 BACKGROUND ON FPGAS 3.1 FPGA 3.1.1 Logic Element 3.1.2 Block RAM (BRAM) 3.1.3 Digital Signal Processor (DSP) 3.1.4 IO Element 3.1.5 Programmable Interconnect 3.2 FPGA Design Flow 3.2.1 Specifications 3.2.2 RTL Description 3.2.3 Verification 3.2.4 Synthesis, Mapping, Placement, and Routing 3.2.5 Timing Analysis 3.2.6 Bitstream Generation and FPGA Programming 3.3 Implementation Quality Metrics 3.4 FPGA Cards 3.5 Benefits of Using FPGAs 3.6 Challenges of Using FPGAs
    4 RELATED WORK 4.1 Summary of Related Work 4.2 Platform Type 4.2.1 Accelerator Card 4.2.2 Coprocessor 4.2.3 Smart Storage 4.2.4 Network Processor 4.3 Implementation 4.3.1 Loop-based Implementation 4.3.2 Sort-based Implementation 4.3.3 Hash-based Implementation 4.3.4 Mixed Implementation 4.4 A Note on Quantitative Performance Comparisons
    II Cache-Based Morphing Sort-Merge with KeRRaS (CbMSMK)
    5 OBJECTIVES AND ARCHITECTURE OVERVIEW 5.1 From Requirements to Objectives 5.2 Architecture Overview 5.3 Outline of Part II
    6 COMPARATIVE ANALYSIS OF OPENCL AND RTL FOR SORT-MERGE PRIMITIVES ON FPGAS 6.1 Programming FPGAs 6.2 Related Work 6.3 Architecture 6.3.1 Global Architecture 6.3.2 Sorter Architecture 6.3.3 Merger Architecture 6.3.4 Scalability and Resource Adaptability 6.4 Experiments 6.4.1 OpenCL Sort-Merge Implementation 6.4.2 RTL Sorters 6.4.3 RTL Mergers 6.4.4 Hybrid OpenCL-RTL Sort-Merge Implementation 6.5 Summary & Discussion
    7 RESOURCE-EFFICIENT ACCELERATION OF PIPELINE-BREAKING DATABASE OPERATORS ON FPGAS 7.1 The Case for Resource Efficiency 7.2 Related Work 7.3 Architecture 7.3.1 Sorters 7.3.2 Sort-Network 7.3.3 X:Y Mergers 7.3.4 Merge-Network 7.3.5 Join Materialiser (JoinMat) 7.4 Experiments 7.4.1 Experimental Setup 7.4.2 Implementation Description & Tuning 7.4.3 Sort Benchmarks 7.4.4 Aggregation Benchmarks 7.4.5 Join Benchmarks 7.5 Summary
    8 KERRAS: COLUMN-ORIENTED WIDE TABLE PROCESSING ON FPGAS 8.1 The Scope of Database System Accelerators 8.2 Related Work 8.3 Key-Reduce Radix Sort (KeRRaS) 8.3.1 Time Complexity 8.3.2 Space Complexity (Memory Utilization) 8.3.3 Discussion and Optimizations 8.4 Architecture 8.4.1 MSM 8.4.2 MSMK: Extending MSM with KeRRaS 8.4.3 Payload, Aggregation and Join Processing 8.4.4 Limitations 8.5 Experiments 8.5.1 Experimental Setup 8.5.2 Datasets 8.5.3 MSMK vs. MSM 8.5.4 Payload-Less Benchmarks 8.5.5 Payload-Based Benchmarks 8.5.6 Flexibility 8.6 Summary
    9 A STUDY OF EARLY AGGREGATION IN DATABASE QUERY PROCESSING ON FPGAS 9.1 Early Aggregation 9.2 Background & Related Work 9.2.1 Sort-Based Early Aggregation 9.2.2 Cache-Based Early Aggregation 9.3 Simulations 9.3.1 Datasets 9.3.2 Metrics 9.3.3 Sort-Based Versus Cache-Based Early Aggregation 9.3.4 Comparison of Set-Associative Caches 9.3.5 Comparison of Cache Structures 9.3.6 Comparison of Replacement Policies 9.3.7 Cache Selection Methodology 9.4 Cache System Architecture 9.4.1 Window Aggregator 9.4.2 Compressor & Hasher 9.4.3 Collision Detector 9.4.4 Collision Resolver 9.4.5 Cache 9.5 Experiments 9.5.1 Experimental Setup 9.5.2 Resource Utilization and Parameter Tuning 9.5.3 Datasets 9.5.4 Benchmarks on Synthetic Data 9.5.5 Benchmarks on Real Data 9.6 Summary
    10 THE FULL PICTURE 10.1 System Architecture 10.2 Benchmarks 10.3 Meeting the Objectives
    III Conclusion
    11 SUMMARY AND OUTLOOK ON FUTURE RESEARCH 11.1 Summary 11.2 Future Work
    BIBLIOGRAPHY, LIST OF FIGURES, LIST OF TABLES
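The cache-based early aggregation idea studied in Chapter 9 can be sketched in a few lines: keep a small fixed-size cache of group keys, aggregate cache hits in place, and evict partial aggregates on conflict, so a downstream stage only has to combine far fewer records. The cache size and FIFO eviction policy below are illustrative simplifications, not CbMSMK's actual cache design.

```python
from collections import OrderedDict

def early_aggregate(rows, cache_size=4):
    """Toy early aggregation of (key, value) rows into partial SUMs."""
    cache = OrderedDict()   # key -> partial sum, bounded to cache_size entries
    spilled = []            # partial aggregates evicted on capacity conflicts
    for key, value in rows:
        if key in cache:
            cache[key] += value                            # hit: aggregate in place
        else:
            if len(cache) == cache_size:
                spilled.append(cache.popitem(last=False))  # evict oldest entry (FIFO)
            cache[key] = value
    return spilled + list(cache.items())

rows = [(k, 1) for k in "aabbaacdceaa"]   # 12 input rows, 5 distinct keys
partials = early_aggregate(rows)
print(len(partials), "partial aggregates instead of", len(rows), "rows")
```

Because the same key may be evicted and re-inserted, the output contains *partial* aggregates; a final merge (in CbMSMK, the sort-merge pipeline) combines them into exact group totals. The win is that far fewer records cross the pipeline-breaking boundary.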