33 research outputs found

    Full Stack Optimization of Transformer Inference: a Survey

    Recent advances in state-of-the-art DNN architecture design have been moving toward Transformer models. These models achieve superior accuracy across a wide range of applications. This trend has been consistent over the past several years since Transformer models were originally introduced. However, the amount of compute and bandwidth required for inference of recent Transformer models is growing at a significant rate, and this has made their deployment in latency-sensitive applications challenging. As such, there has been an increased focus on making Transformer models more efficient, with methods that range from changing the architecture design, all the way to developing dedicated domain-specific accelerators. In this work, we survey different approaches for efficient Transformer inference, including: (i) analysis and profiling of the bottlenecks in existing Transformer architectures and their similarities and differences with previous convolutional models; (ii) implications of Transformer architecture on hardware, including the impact of non-linear operations such as Layer Normalization, Softmax, and GELU, as well as linear operations, on hardware design; (iii) approaches for optimizing a fixed Transformer architecture; (iv) challenges in finding the right mapping and scheduling of operations for Transformer models; and (v) approaches for optimizing Transformer models by adapting the architecture using neural architecture search. Finally, we perform a case study by applying the surveyed optimizations on Gemmini, the open-source, full-stack DNN accelerator generator, and we show how each of these approaches can yield improvements, compared to previous benchmark results on Gemmini. Among other things, we find that a full-stack co-design approach with the aforementioned methods can result in up to 88.7x speedup with minimal performance degradation for Transformer inference.

    Advanced Signal Processing in Wearable Sensors for Health Monitoring

    Miniature smart wearable devices are becoming increasingly available, typically in the form of smart watches and other connected devices. Consequently, devices to assist in measurements such as electroencephalography (EEG), electrocardiogram (ECG), electromyography (EMG), blood pressure (BP), photoplethysmography (PPG), heart rhythm, respiration rate, apnoea, and motion detection are becoming more available, and play a significant role in healthcare monitoring. The industry is placing great emphasis on making these devices and technologies available on smart devices such as phones and watches. Such measurements are clinically and scientifically useful for real-time monitoring, long-term care, and diagnostic and therapeutic techniques. However, a persistent issue is that recorded data are usually noisy, contain many artefacts, and are affected by external factors such as movements and physical conditions. To obtain accurate and meaningful indicators, the signal has to be processed and conditioned so that the measurements are accurate and free from noise and disturbances. In this context, many researchers have utilized recent technological advances in wearable sensors and signal processing to develop smart and accurate wearable devices for clinical applications. The processing and analysis of physiological signals is a key issue for these smart wearable devices. Consequently, ongoing work in this field of study includes research on filtration, quality checking, signal transformation and decomposition, feature extraction and, most recently, machine learning-based methods.

    Modelado robusto para la extracción de información en entornos biofísicos y críticos (Robust modeling for information extraction in biophysical and critical environments)

    Unpublished thesis of the Universidad Complutense de Madrid, Facultad de Informática, Departamento de Arquitectura de Computadores y Automática, defended on 12/07/2018. The era of information and Big Data is an environment where multiple devices, always connected, generate huge volumes of information (the Internet of Things paradigm). This paradigm is present in different areas: Smart Cities, sport tracking, lifestyle, and health. The goal of this thesis is the development and implementation of a robust predictive modeling methodology using low-cost wearable devices in biophysical and critical scenarios. In this manuscript we present a multilevel architecture that ranges from on-node data processing up to data management in Data Centers. The methodology applies energy-aware optimization techniques at each level of the network, and the decision system makes use of data from different sources, leading to an expert decision system...

    CONNECTIONIST SPEECH RECOGNITION - A Hybrid Approach

    Improved 3D MR Image Acquisition and Processing in Congenital Heart Disease

    Congenital heart disease (CHD) is the most common type of birth defect, affecting about 1% of the population. MRI is an essential tool in the assessment of CHD, including diagnosis, intervention planning and follow-up. Three-dimensional MRI can provide particularly rich visualization and information. However, it is often complicated by long scan times, cardiorespiratory motion, injection of contrast agents, and complex and time-consuming postprocessing. This thesis comprises four pieces of work that attempt to respond to some of these challenges. The first piece of work aims to enable fast acquisition of 3D time-resolved cardiac imaging during free breathing. Rapid imaging was achieved using an efficient spiral sequence and a sparse parallel imaging reconstruction. The feasibility of this approach was demonstrated on a population of 10 patients with CHD, and areas of improvement were identified. The second piece of work is an integrated software tool designed to simplify and accelerate the development of machine learning (ML) applications in MRI research. It also exploits the strengths of recently developed ML libraries for efficient MR image reconstruction and processing. The third piece of work aims to reduce contrast dose in contrast-enhanced MR angiography (MRA). This would reduce risks and costs associated with contrast agents. A deep learning-based contrast enhancement technique was developed and shown to improve image quality in real low-dose MRA in a population of 40 children and adults with CHD. The fourth and final piece of work aims to simplify the creation of computational models for hemodynamic assessment of the great arteries. A deep learning technique for 3D segmentation of the aorta and the pulmonary arteries was developed and shown to enable accurate calculation of clinically relevant biomarkers in a population of 10 patients with CHD.

    On the automated analysis of preterm infant sleep states from electrocardiography

    Representation of Somatosensory Afferents in the Cortical Autonomic Network

    The relationship between somatosensory stimulation and the autonomic nervous system has been established with effects on heart rate (HR) and sympathetic tone. However, the involvement of the cortical autonomic network (CAN) during muscle sensory afferent stimulation has not been identified. The main objective of the research in this dissertation was to determine the representation of somatosensory afferents in the CAN and their physiologic impact on cardiovascular control. Somatosensory afferent activation was elicited by electrical stimulation of type I and II afferents (sub-motor threshold) and type III and IV afferents (motor threshold), and CAN patterns were assessed using blood-oxygenation level-dependent functional magnetic resonance imaging. Study 1 (Chapter 2) established CAN regions associated with sub-motor stimulation including the ventral medial prefrontal cortex (vMPFC), subgenual anterior cingulate cortex (sACC), and posterior insula, along with a trend towards increased heart rate variability (HRV). Motor threshold stimulation was associated with activation in the posterior insula. Having established the CAN regions affected by sensory afferent input, diffusion tensor imaging was used (Chapter 3) to establish structural connections between the cortical regions associated with functional cardiovascular control. We identified two discrete patterns of white matter connectivity between the anterior insula-sACC and posterior insula-posterior cingulate cortex, suggesting that a structural network may underlie functional roles in autonomic regulation and sensory processing. As somatosensory stimulation had modest impact on cardiovascular control under baseline conditions, Study 3 (Chapter 4) aimed to establish the effects of somatosensory stimulation during baroreceptor unloading (lower-body negative pressure, LBNP) on muscle sympathetic nerve activity (MSNA) and cortical activity. Sensory stimulation during LBNP led to an attenuated increase in MSNA burst frequency, as well as absent activity in the right insula and dorsal ACC, supporting the sympatho-excitatory role of these regions. No effect of somatosensory stimulation during chemoreflex-mediated sympatho-excitation was observed on MSNA, while right insular and dorsal ACC activities were maintained. Overall, the results of these studies provide evidence of somatosensory representation within the CAN regions that are anatomically linked, and highlight a role for type I and II sensory afferents in modulating autonomic outflow in a manner that depends upon baroreceptor loading.

    Learning Biosignals with Deep Learning

    The healthcare system, which is widely recognized as one of the most influential systems in society, has been facing new challenges since the start of the decade. The myriad of physiological data generated by individuals, namely in the healthcare system, is placing a burden on physicians and reducing the effectiveness of patient data collection. Information systems and, in particular, novel deep learning (DL) algorithms have been offering a way to tackle this problem. This thesis aims to have an impact on biosignal research and industry by presenting DL solutions that could empower this field. For this purpose, an extensive study of how to incorporate and implement Convolutional Neural Networks (CNN), Recursive Neural Networks (RNN) and Fully Connected Networks in biosignal studies is discussed. Different architecture configurations were explored for signal processing and decision making and were implemented in three different scenarios: (1) biosignal learning and synthesis; (2) electrocardiogram (ECG) biometric systems; and (3) ECG anomaly detection systems. In (1), an RNN-based architecture was able to replicate autonomously three types of biosignals with a high degree of confidence. In (2), three CNN-based architectures and an RNN-based architecture (the same one used in (1)) were used for both biometric identification, reaching values above 90% for electrode-based datasets (Fantasia, ECG-ID and MIT-BIH) and 75% for the off-person dataset (CYBHi), and biometric authentication, achieving Equal Error Rates (EER) of near 0% for Fantasia and MIT-BIH and below 4% for CYBHi. In (3), the abstraction of healthy, clean ECG signals and the detection of deviations from them was carried out and tested in two different scenarios: the presence of noise, using an autoencoder and a fully connected network (reaching 99% accuracy for binary classification and 71% for multi-class); and arrhythmia events, by adding an RNN to the previous architecture (57% accuracy and 61% sensitivity). In sum, these systems are shown to be capable of producing novel results. The incorporation of several AI systems into one could prove to be the next generation of preventive medicine: as the machines have access to different physiological and anatomical states, they could produce more informed solutions for issues that one may face in the future, increasing the performance of autonomous prevention systems that could be used in everyday life in remote places where access to medicine is limited. These systems will also help the study of signal behaviour and of how signals are produced in a real-life context, as explainable AI could trigger this perception and link the inner states of a network with biological traits.