
    Efficient Hardware Architectures for Accelerating Deep Neural Networks: Survey

    In the modern era of technology, a paradigm shift has been witnessed in areas involving applications of Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL). In particular, Deep Neural Networks (DNNs) have emerged as a popular field of interest in most AI applications, such as computer vision, image and video processing, and robotics. Given the maturity of digital technologies and the availability of authentic data and data-handling infrastructure, DNNs have become a credible choice for solving complex real-life problems. In certain situations, the performance and accuracy of a DNN can far exceed human capability. However, DNNs are computationally demanding in terms of both the resources and the time needed to handle these computations, and general-purpose architectures such as CPUs struggle with such computationally intensive algorithms. Consequently, the research community has invested considerable effort in specialized hardware architectures such as the Graphics Processing Unit (GPU), Field Programmable Gate Array (FPGA), Application Specific Integrated Circuit (ASIC), and Coarse Grained Reconfigurable Array (CGRA) for the effective implementation of computationally intensive algorithms. This paper surveys the research carried out on the development and deployment of DNNs using these specialized hardware architectures and embedded AI accelerators. The review describes in detail the specialized hardware accelerators used in the training and/or inference of DNNs, and compares the accelerators discussed on factors such as power, area, and throughput. Finally, future research and development directions are discussed, including emerging trends in DNN implementation on specialized hardware accelerators. This review article is intended to serve as a guide to hardware architectures for accelerating and improving the effectiveness of deep learning research.
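
    A sense of the computational burden is easy to get from first principles: the multiply-accumulate (MAC) count of a single convolutional layer already runs into the billions. Below is a minimal sketch of that arithmetic; the layer dimensions are illustrative placeholders loosely modelled on an early VGG-16-style layer.

```python
def conv2d_macs(h_out: int, w_out: int, k: int, c_in: int, c_out: int) -> int:
    """Multiply-accumulates for one standard 2-D convolution layer: each of
    the h_out * w_out * c_out outputs needs k * k * c_in MACs."""
    return h_out * w_out * c_out * k * k * c_in

# Illustrative dimensions: 224x224 output map, 3x3 kernel, 64 -> 64 channels.
print(f"{conv2d_macs(224, 224, 3, 64, 64):.2e} MACs")  # ~1.85e9 for one layer
```

    Numbers of this magnitude, repeated over tens of layers and large input batches, are what motivate the GPU, FPGA, ASIC, and CGRA designs the survey compares.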

    A Framework for Optical Inspection Applications in Life-Science Automation

    This thesis presents possible applications for Computer Vision (CV) based systems in the field of Laboratory Automation, together with applicable camera-based, multi-camera-based, or flatbed-scanner-based imaging devices. A concept for a software framework for CV applications is developed with respect to hardware compatibility, data processing, and user interfaces. An application is implemented using the framework, aimed at the detection of low-volume liquids in microtiter plates, a labware standard. Using this algorithm, it is possible to cover a wide range of labware on different imaging hardware.
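
    As an illustration of the kind of detection step such a framework might host, the sketch below locates circular wells with OpenCV and scores each by mean interior intensity, on the assumption that filled and empty wells differ in brightness. This is a generic approach rather than the thesis' algorithm, and the radii and Hough thresholds are placeholder values that would need calibration for a real plate and imaging device.

```python
import cv2
import numpy as np

def score_wells(image_path: str, radius_range=(18, 26)):
    """Find circular wells and return [((x, y), mean_intensity), ...]."""
    gray = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2GRAY)
    gray = cv2.medianBlur(gray, 5)                     # suppress sensor noise
    circles = cv2.HoughCircles(
        gray, cv2.HOUGH_GRADIENT, dp=1.2,
        minDist=2 * radius_range[0],                   # wells do not overlap
        param1=120, param2=40,                         # edge / vote thresholds
        minRadius=radius_range[0], maxRadius=radius_range[1])
    scores = []
    if circles is not None:
        for x, y, r in np.round(circles[0]).astype(int):
            mask = np.zeros_like(gray)
            cv2.circle(mask, (x, y), r, 255, -1)       # filled disc mask
            scores.append(((x, y), cv2.mean(gray, mask=mask)[0]))
    return scores
```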

    Implementation and Optimization of Algorithms for the Analysis of Biomedical Big Data

    Big Data Analytics poses many challenges to the research community, which has to handle several computational problems related to the vast amount of data. There is increasing interest in biomedical data, aiming at so-called personalized medicine, in which therapy plans are designed around the specific genotype and phenotype of an individual patient; algorithm optimization plays a key role to this purpose. In this work we discuss several topics related to Biomedical Big Data Analytics, with special attention to numerical issues and the algorithmic solutions related to them. We introduce a novel feature selection algorithm tailored to omics datasets, proving its efficiency on synthetic and real high-throughput genomic datasets. We tested our algorithm against other state-of-the-art methods, obtaining better or comparable results. We also implemented and optimized different types of deep learning models, testing their efficiency on biomedical image processing tasks. Three novel frameworks for the development of deep learning neural network models are discussed and used to describe the numerical improvements proposed on various topics. In the first implementation we optimize two super-resolution models, showing their results on NMR images and proving their efficiency in generalization tasks without retraining. The second optimization involves a state-of-the-art object detection neural network architecture, obtaining a significant speedup in computational performance. In the third application we discuss the femur head segmentation problem on CT images using deep learning algorithms. The last section of this work involves the implementation of a novel biomedical database obtained by harmonizing multiple data sources, which provides network-like relationships between biomedical entities. Data related to diseases and other biological entities were mined using web-scraping methods, and a novel natural language processing pipeline was designed to maximize the overlap between the different data sources involved in this project.
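
    The thesis' feature selection algorithm itself is not reproduced here, but a generic filter-style baseline of the kind such methods are compared against can be sketched in a few lines. The mutual-information ranking and the toy high-dimensional, low-sample-size dataset below are illustrative assumptions, not the method from the thesis.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def rank_features(X, y, top_k=50):
    """Rank feature columns (e.g. gene expression values) by mutual
    information with the class label and keep the top_k."""
    mi = mutual_info_classif(X, y, random_state=0)
    order = np.argsort(mi)[::-1][:top_k]
    return order, mi[order]

# Toy omics-like setting: many features, few samples.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2000))          # 60 samples, 2000 "genes"
y = rng.integers(0, 2, size=60)
X[:, 10] += 2.0 * y                      # plant one informative feature
idx, scores = rank_features(X, y)
print(idx[0])                            # feature 10 should rank first
```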

    Integrating legacy mainframe systems: architectural issues and solutions

    For more than 30 years, mainframe computers have been the backbone of computing systems throughout the world. Even today, it is estimated that some 80% of the world's data is held on such machines. However, new business requirements and pressure from evolving technologies, such as the Internet, are pushing these existing systems to their limits, and they are reaching breaking point. The banking and financial sectors in particular have relied on mainframes the longest to do their business, and as a result it is they that feel these pressures the most. In recent years, various solutions have emerged for re-engineering these legacy systems. It quickly became clear that completely rewriting them was not feasible, so various integration strategies emerged. Among these, the CORBA standard by the Object Management Group emerged as the strongest, providing a standards-based solution that enabled mainframe applications to become peers in a distributed computing environment. However, the requirements did not stop there. The mainframe systems were reliable, secure, scalable, and fast, so any integration strategy had to ensure that the new distributed systems did not lose any of these benefits. Various patterns, or general solutions, for meeting these requirements have arisen, and this research looks at applying some of these patterns to mainframe-based CORBA applications. The purpose of this research is to examine some of the issues involved in making mainframe-based legacy applications inter-operate with newer Object Oriented technologies.
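
    The integration strategies discussed typically follow a wrapper (adapter) pattern: a modern object interface delegates to the legacy transaction's fixed-width record exchange. The hypothetical sketch below illustrates that pattern only; the transaction code, record layout, and `transport` object are invented placeholders, not an actual CORBA or CICS API.

```python
class AccountServiceAdapter:
    """Expose a modern call interface over a legacy fixed-width
    (COMMAREA-style) request/reply exchange with a host transaction."""

    def __init__(self, transport):
        self.transport = transport            # e.g. a CORBA stub or gateway

    def get_balance(self, account_id: str) -> float:
        # Hypothetical layout: 4-char transaction code + 10-char account id,
        # encoded as EBCDIC (code page 037) for the host.
        request = f"BALQ{account_id:>10}".encode("cp037")
        reply = self.transport.send(request)
        status = reply[:2].decode("cp037")    # assumed 2-char status field
        if status != "00":
            raise RuntimeError(f"host transaction failed: status {status}")
        # Assumed 12-digit amount in pennies follows the status field.
        return int(reply[2:14].decode("cp037")) / 100.0
```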

    Extreme scale parallel NBody algorithm with event driven constraint based execution model

    Traditional scientific applications, such as Computational Fluid Dynamics and numerical methods based on Partial Differential Equations (e.g., Finite Difference and Finite Element Methods), achieve sufficient efficiency on state-of-the-art high-performance computing systems and have been widely studied and implemented using conventional programming models. For emerging application domains such as graph applications, scalability and efficiency are significantly constrained by conventional systems and their supporting programming models. Furthermore, technology trends such as multicore, manycore, and heterogeneous system architectures are introducing new challenges and possibilities, requiring a rethinking of approaches to more effectively expose the underlying parallelism to applications and end-users. This thesis explores the space of effective parallel execution of ephemeral graphs that are dynamically generated. A standard particle-based simulation, solved using the Barnes-Hut algorithm, is chosen to exemplify such dynamic workloads. In this thesis the workloads are expressed using sequential execution semantics; a conventional parallel programming model, shared-memory semantics; and the semantics of ParalleX, an innovative execution model designed for efficient, scalable performance towards exascale computing. The main outcomes of this research are the parallel processing of dynamic ephemeral workloads, dynamic load balancing at runtime, and the use of advanced semantics to expose parallelism in scaling-constrained applications.
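
    For reference, the sequential core of the Barnes-Hut algorithm that the thesis parallelizes can be sketched compactly. The 2-D quadtree version below is a minimal illustration in natural units, not the thesis' ParalleX implementation; bodies at identical positions and time integration are ignored for brevity.

```python
import numpy as np

THETA = 0.5   # opening angle: smaller is more accurate but slower
G, EPS = 1.0, 1e-9

class Node:
    """Quadtree cell centred at (cx, cy) with half-width `half`."""
    def __init__(self, cx, cy, half):
        self.cx, self.cy, self.half = cx, cy, half
        self.mass, self.com = 0.0, np.zeros(2)   # aggregate mass / centre
        self.children = None                     # None while a leaf
        self.body = None                         # single body in a leaf

    def _child_for(self, pos):
        return self.children[2 * (pos[0] >= self.cx) + (pos[1] >= self.cy)]

    def insert(self, pos, m):
        if self.children is None and self.body is None:   # empty leaf
            self.body, self.mass, self.com = pos.copy(), m, pos.copy()
            return
        if self.children is None:          # occupied leaf: subdivide first
            h = self.half / 2
            self.children = [Node(self.cx + dx, self.cy + dy, h)
                             for dx in (-h, h) for dy in (-h, h)]
            self._child_for(self.body).insert(self.body, self.mass)
            self.body = None
        # fold the new body into this cell's aggregate, then recurse
        self.com = (self.com * self.mass + pos * m) / (self.mass + m)
        self.mass += m
        self._child_for(pos).insert(pos, m)

def force(node, pos):
    """Approximate gravitational force on a unit-mass body at `pos`."""
    if node is None or node.mass == 0.0:
        return np.zeros(2)
    d = node.com - pos
    r = np.linalg.norm(d) + EPS
    # treat the cell as a point mass if it is a single body or subtends
    # a small enough angle (cell width / distance < THETA)
    if node.children is None or 2 * node.half / r < THETA:
        if node.body is not None and np.array_equal(node.body, pos):
            return np.zeros(2)                   # skip self-interaction
        return G * node.mass * d / r**3
    return sum(force(c, pos) for c in node.children)

rng = np.random.default_rng(0)
bodies = rng.uniform(-1.0, 1.0, size=(200, 2))
root = Node(0.0, 0.0, 1.0)                       # domain [-1, 1]^2
for p in bodies:
    root.insert(p, 1.0)
print(force(root, bodies[0]))                    # net force on one body
```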

    Characterisation of a Novel Nuclear Receptor-like Protein

    A genome threading algorithm was employed by Inpharmatica to identify a number of proteins with a predicted structure similar to that of the ligand binding domain (LBD) of nuclear receptors (NRs), an approach that has been successfully used to annotate the yeast transcription factor Oaf1 (Phelps et al, 2006). This work focuses on one such protein, termed NR3, which is identical to TRPC4AP or TRUSS, a protein proposed to function as a scaffold in cell signalling processes and as a cell cycle regulator. It is conceivable that NR3, in contrast to bona fide NRs, does not function as a transcription factor, and that the putative LBD may instead function as an allosteric switch controlling functional activity. To investigate the idea that NR3 may possess a fold similar to the LBD of NRs, preliminary structural work was undertaken, which suggested that the putative LBD folds into an autonomous domain, as this region is resistant to proteolysis. In addition, the potential role of the putative LBD fold as a molecular switch was examined using constitutively active fusion proteins in reporter gene assays. It was determined that the putative NR3 LBD acts in a repressive manner, potentially due to the alteration in subcellular localisation that it exerts on the fusion protein. To further assess the role of the putative LBD, a ligand screen was undertaken to identify compounds that might reduce its repressive activity; however, no ligand was identified, and it is conceivable that NR3 acts in a ligand-independent manner, similar to some orphan receptors. Initial analysis of NR3 function indicates that its expression may have a positive effect on cell proliferation. To further assess the role of NR3, protein interaction assays were established to screen for binding partners. These identified the E3 ubiquitin ligase component DNA damage-binding protein 1 (DDB1), a protein involved in the regulation of cell cycle progression and DNA repair, as an interacting partner. Mapping studies suggest that NR3 binds to the substrate docking site of DDB1, and further analysis showed NR3 to be ubiquitinated, affecting the stability of the protein. It has been reported that the aryl hydrocarbon receptor binds to a DDB1 complex, which then acts as a ligand-regulated E3 ubiquitin ligase complex (Ohtake et al, 2007), raising the possibility that NR3 may act in a functionally analogous manner. To address NR3 function within the whole organism, a targeting vector designed to inactivate the NR3 gene has been generated, and a conditional knockout mouse line is currently being bred.

    Inferring the clonal identity of single cells from RNA-seq data with Unique Molecular Identifiers

    Cancer is an evolutionary disease in which heterogeneous populations of tumor cells can emerge, proliferate, and disappear depending on selective and neutral processes. This principle has been observed in many studies of acute myeloid leukemia (AML), the most common blood cancer in adults, where clonal heterogeneity and evolution have been proposed to play a role in the high relapse rate. In order to understand this feature, it is crucial to have adequate clinical and experimental models that can provide enough data to elucidate the evolutionary history of a tumor, such as patient-derived xenografts (PDX). These models can be combined with high-resolution sequencing technologies, such as single-cell RNA-seq, to provide a detailed view of the heterogeneity and molecular features of the tumor. However, adequate analytical tools have to be applied and developed in order to fully exploit such datasets. Here I present the analysis of the clonal heterogeneity of an AML patient and the corresponding PDX model, which was treated with multiple rounds of chemotherapy. This model allowed us to study the response of the tumor populations to the pressure induced by the therapy and the possible evolutionary forces behind it. Datasets for these AML samples were generated with multiple types of sequencing methods, one of which was single-cell RNA sequencing. To enable the analysis of somatic mutations and clonal populations in this kind of data, I developed a software package capable of extracting and proofreading variant sequences by making use of Unique Molecular Identifiers (UMIs), sequence barcodes that make it possible to distinguish reads arising from PCR amplification duplicates. The benefits of employing this proofreading approach for variant calling, and for inferring the clonal identity of single cells, were demonstrated. Finally, I applied the package to the analysis of the single-cell data of the AML PDX samples treated with chemotherapy, as well as to other datasets with UMI-based sequencing.
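
    The core of the UMI-based proofreading idea can be sketched as a grouping-and-consensus step: reads sharing a UMI (and mapping position) are PCR copies of a single molecule, so a per-position majority vote removes amplification and sequencing errors, while true variants, carried by all copies, survive. In the minimal sketch below, the tuple format and the pre-aligned, equal-length sequences are simplifying assumptions, not the actual interface of the package.

```python
from collections import Counter, defaultdict

def umi_consensus(reads):
    """Collapse PCR duplicates: group reads by (UMI, locus) and take a
    per-position majority vote over the aligned sequences."""
    groups = defaultdict(list)
    for umi, locus, seq in reads:
        groups[(umi, locus)].append(seq)
    return {key: "".join(Counter(col).most_common(1)[0][0]
                         for col in zip(*seqs))
            for key, seqs in groups.items()}

reads = [
    ("AACGT", "chr1:1000", "ACGTACGT"),
    ("AACGT", "chr1:1000", "ACGTACGT"),
    ("AACGT", "chr1:1000", "ACGAACGT"),  # one copy carries a PCR error
    ("GGTCA", "chr1:1000", "ACGTTCGT"),  # different molecule: true variant
]
print(umi_consensus(reads))
# {('AACGT', 'chr1:1000'): 'ACGTACGT', ('GGTCA', 'chr1:1000'): 'ACGTTCGT'}
```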

    Cloud-based homomorphic encryption for privacy-preserving machine learning in clinical decision support

    While privacy and security concerns dominate public cloud services, Homomorphic Encryption (HE) is seen as an emerging solution that ensures secure processing of sensitive data via untrusted networks in the public cloud or by third-party cloud vendors. It relies on the fact that some encryption algorithms display the property of homomorphism, which allows them to manipulate data meaningfully while still in encrypted form, although there are major stumbling blocks to overcome before the technology is considered mature for production cloud environments. Such a framework would find particular relevance in Clinical Decision Support (CDS) applications deployed in the public cloud. CDS applications have an important computational and analytical role over confidential healthcare information, with the aim of supporting decision-making in clinical practice. Machine Learning (ML) is employed in CDS applications, which typically learn and can personalise actions based on individual behaviour. A relatively simple-to-implement, common, and consistent framework is sought that can overcome most limitations of Fully Homomorphic Encryption (FHE) in order to offer an expanded and flexible set of HE capabilities. In the absence of a significant breakthrough in FHE efficiency and practical use, a solution relying on client interactions appears to be the best available approach for meeting the requirements of private CDS-based computation, so long as security is not significantly compromised. A hybrid solution is introduced that intersperses limited two-party interactions amongst the main homomorphic computations, allowing exchange of both numerical and logical cryptographic contexts in addition to resolving other major FHE limitations. The interactions involve client-based ciphertext decryptions blinded by data obfuscation techniques to maintain privacy. This thesis explores the middle ground whereby HE schemes can provide improved and efficient arbitrary computational functionality over a significantly reduced two-party network interaction model involving data obfuscation techniques. This compromise allows the powerful capabilities of HE to be leveraged, providing a more uniform, flexible, and general approach to privacy-preserving system integration that is suitable for cloud deployment. The proposed platform is uniquely designed to make HE more practical for mainstream clinical application use, equipped with a rich set of capabilities and a potentially very complex depth of HE operations. Such a solution would be suitable for the long-term privacy-preserving processing requirements of a cloud-based CDS system, which would typically require complex combinatorial logic, workflow, and ML capabilities.
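
    The interspersed two-party interaction can be illustrated with an additively homomorphic scheme such as Paillier: the server blinds intermediate ciphertexts with random masks, the client decrypts the masked values and performs the one operation the scheme cannot (here, a multiplication), and the server strips the masks homomorphically. The sketch below assumes the open-source python-paillier (`phe`) package and illustrates the blinding idea only, not the thesis' actual protocol.

```python
import secrets
from phe import paillier   # pip install phe

pub, priv = paillier.generate_paillier_keypair(n_length=2048)

# --- server: holds only ciphertexts, wants Enc(x * y) ---
enc_x, enc_y = pub.encrypt(7), pub.encrypt(6)
r1, r2 = secrets.randbelow(2**64), secrets.randbelow(2**64)
blinded_x = enc_x + r1                    # additively mask each operand
blinded_y = enc_y + r2

# --- client: one interaction, sees only the masked plaintexts ---
bx, by = priv.decrypt(blinded_x), priv.decrypt(blinded_y)
enc_blinded_prod = pub.encrypt(bx * by)   # (x+r1)(y+r2), re-encrypted

# --- server: strip the blinding without ever decrypting ---
# (x+r1)(y+r2) = xy + r2*x + r1*y + r1*r2, so subtract the mask terms:
enc_xy = enc_blinded_prod + enc_x * (-r2) + enc_y * (-r1) - r1 * r2

assert priv.decrypt(enc_xy) == 42         # decrypted here only to check
```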