
    Low-power Wearable Healthcare Sensors

    Advances in technology have produced a range of on-body sensors and smartwatches that can be used to monitor a wearer’s health with the objective of keeping the user healthy. However, the real potential of such devices lies not only in monitoring but also in interactive communication with expert-system-based cloud services that offer personalized, real-time healthcare advice, enabling users to manage their health and, over time, reducing expensive hospital admissions. To meet this goal, the research challenges for the next generation of wearable healthcare devices include the need to offer a wide range of sensing, computing, communication, and human–computer interaction methods, all within a tiny device with limited resources and electrical power. This Special Issue presents a collection of six papers on a wide range of research developments that highlight the specific challenges in creating the next generation of low-power wearable healthcare sensors.

    An efficient implementation of lattice-ladder multilayer perceptrons in field programmable gate arrays

    The implementation efficiency of electronic systems reflects a set of conflicting requirements: growing volumes of computation and accelerating data exchange, combined with rising energy consumption, force researchers not only to optimize their algorithms but also to implement them quickly in specialized hardware. This work therefore tackles the problem of efficient and straightforward implementation of real-time electronic intelligent systems on field-programmable gate arrays (FPGAs). The object of research is specialized FPGA intellectual property (IP) cores that operate in real time. The thesis investigates the following main aspects of the research object: implementation criteria and techniques. The aim of the thesis is to optimize the FPGA implementation process for a selected class of dynamic artificial neural networks. To solve the stated problem and reach this goal, the following main tasks are formulated: rationalize the selection of the Lattice-Ladder Multi-Layer Perceptron (LLMLP) class and of its electronic intelligent system test-bed, a speaker-dependent Lithuanian speech recognizer, to be created and investigated; develop a dedicated technique for implementing the LLMLP class on FPGAs, based on specialized efficiency criteria for circuit synthesis; and develop and experimentally confirm the efficiency of the optimized FPGA IP cores used in the Lithuanian speech recognizer. The dissertation consists of an introduction, four chapters, and general conclusions. The first chapter presents fundamental knowledge on computer-aided design, artificial neural networks, and speech recognition implementation on FPGAs. The second chapter proposes efficiency criteria and an implementation technique for LLMLP IP cores that enable multi-objective optimization of throughput, LLMLP complexity, and resource utilization. Data-flow graphs are applied to optimize LLMLP computations, and an optimized neuron processing element is proposed. The third chapter develops and analyzes IP cores for feature extraction and comparison in the Lithuanian speech recognizer. The fourth chapter is devoted to the experimental verification of the numerous developed LLMLP IP cores. Experiments measured isolated-word recognition accuracy and speed for different speakers, signal-to-noise ratios, feature-extraction methods, and accelerated comparison methods. The main results of the thesis were published in 12 scientific publications: eight in peer-reviewed scientific journals, four of which appear in the Thomson Reuters Web of Science database, and four in conference proceedings. The results were presented at 17 scientific conferences.
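    The lattice-ladder structure at the core of each LLMLP synapse is a standard IIR filter realization; a commonly cited advantage of this form is that stability holds whenever every reflection coefficient satisfies |k| < 1, which makes run-time checks cheap. Below is a minimal Python sketch of the order-M recursion under textbook conventions; the names k (reflection) and v (ladder) are generic, not the thesis's notation.

```python
import numpy as np

def lattice_ladder_filter(x, k, v):
    """Order-M IIR lattice-ladder filter (textbook form).

    k : reflection (lattice) coefficients k[0..M-1]; |k[m]| < 1
        guarantees stability.
    v : ladder (tap) coefficients v[0..M] that form the output.
    """
    k = np.asarray(k, dtype=float)
    v = np.asarray(v, dtype=float)
    M = len(k)
    g_prev = np.zeros(M + 1)           # g_m(n-1): delayed backward signals
    f = np.zeros(M + 1)
    y = np.zeros(len(x))
    for n, xn in enumerate(x):
        f[M] = xn
        for m in range(M, 0, -1):      # forward recursion, top stage down
            f[m - 1] = f[m] - k[m - 1] * g_prev[m - 1]
        g = np.empty(M + 1)
        g[0] = f[0]
        for m in range(1, M + 1):      # backward recursion, bottom stage up
            g[m] = k[m - 1] * f[m - 1] + g_prev[m - 1]
        y[n] = v @ g                   # ladder taps combine into the output
        g_prev = g
    return y

# An LLMLP neuron sums several such filtered inputs and applies a nonlinearity.
x = np.random.randn(64)
print(lattice_ladder_filter(x, k=[0.3, -0.2], v=[0.5, 0.25, 0.1])[:4])
```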

    CIRCUITS AND ARCHITECTURE FOR BIO-INSPIRED AI ACCELERATORS

    Technological advances in microelectronics envisioned through Moore’s law have led to powerful processors that can handle complex and computationally intensive tasks. Nonetheless, these advancements through technology scaling have come at the unfavorable cost of significantly larger power consumption, which has posed challenges for data processing centers and computers at scale. Moreover, with the emergence of mobile computing platforms constrained by power and bandwidth for distributed computing, the need for more energy-efficient, scalable local processing has become more pressing. Unconventional compute-in-memory architectures such as the analog winner-takes-all associative memory and the Charge-Injection Device processor have been proposed as alternatives. Unconventional charge-based computation has been employed for neural network accelerators in the past, where impressive energy efficiency per operation has been attained in 1-bit vector-vector multiplications and, in recent work, multi-bit vector-vector multiplications. In the latter, computation was carried out by counting quanta of charge at the thermal noise limit, using packets of about 1000 electrons. These systems are neither analog nor digital in the traditional sense but employ mixed-signal circuits to count the packets of charge, and hence we call them Quasi-Digital. By amortizing the energy costs of the mixed-signal encoding/decoding over compute vectors with many elements, high energy efficiencies can be achieved. In this dissertation, I present a design framework for AI accelerators using scalable compute-in-memory architectures. On the device level, two primitive elements are designed and characterized as target computational technologies: (i) a multilevel non-volatile cell and (ii) a pseudo Dynamic Random-Access Memory (pseudo-DRAM) bit-cell. At the circuit level, compute-in-memory crossbars and mixed-signal circuits were designed, allowing seamless connectivity to digital controllers. At the level of data representation, both binary and stochastic-unary coding are used to compute Vector-Vector Multiplications (VMMs) at the array level. Finally, on the architectural level, two AI accelerators, one for data-center processing and one for edge computing, are discussed. Both designs are scalable multi-core Systems-on-Chip (SoCs), where vector-processor arrays are tiled on a 2-layer Network-on-Chip (NoC), enabling neighbor communication and a flexible compute vs. memory trade-off. General-purpose Arm/RISC-V co-processors provide bootstrapping and system housekeeping, and a high-speed interface fabric facilitates input/output to main memory.
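    As a pure-software illustration of the stochastic-unary coding mentioned above (not the chip's charge-domain circuitry), the sketch below encodes values in [0, 1] as Bernoulli bit-streams: a multiplication becomes a bitwise AND, and a dot product becomes a population count, loosely analogous to counting packets of charge. The stream length and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def unary_vmm(x, W, n_steps=4096):
    """Estimate y = W @ x with stochastic-unary (Bernoulli) streams.

    Each value in [0, 1] becomes a length-n_steps bit-stream whose
    ones-density equals the value; AND-ing two independent streams
    and counting ones estimates the product.
    """
    xs = rng.random((n_steps, x.size)) < x        # (T, N) input streams
    ws = rng.random((n_steps,) + W.shape) < W     # (T, M, N) weight streams
    prods = ws & xs[:, None, :]                   # AND = multiply
    return prods.sum(axis=(0, 2)) / n_steps       # count -> estimate of W @ x

x = rng.random(8)
W = rng.random((4, 8))
print(unary_vmm(x, W))   # stochastic estimate
print(W @ x)             # exact reference
```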

    Towards Posture and Gait Evaluation through Wearable-Based Biofeedback Technologies

    In medicine and sport science, postural evaluation is an essential part of gait and posture correction. Various state-of-the-art instruments exist for quantifying the postural system’s efficiency and determining postural stability. However, such systems present many limitations related to accessibility, economic cost, size, intrusiveness, usability, and time-consuming set-up. To mitigate these limitations, this project aims to verify how wearable devices can be assembled and employed to provide feedback to human subjects for gait and posture improvement, which could be applied to sports performance or motor-impairment rehabilitation (from neurodegenerative diseases, aging, or injuries). The project is divided into three parts: the first part provides experimental protocols for studying action anticipation and related processes involved in controlling posture and gait, based on state-of-the-art instrumentation. The second part provides a biofeedback strategy for these measures in the design of a low-cost wearable system. Finally, the third part provides algorithmic processing of the biofeedback to customize the feedback based on performance conditions, including individual variability. Here, we provide a detailed experimental design that distinguishes significant postural indicators through a conjunct architecture integrating state-of-the-art postural and gait control instrumentation with a data collection and analysis framework based on low-cost devices and freely accessible machine learning techniques. Preliminary results on 12 subjects showed that the proposed methodology accurately recognized the phases of the defined motor tasks (i.e., rotate, in position, APAs, drop, and recover), with overall F1-scores of 89.6% and 92.4% for the subject-independent and subject-dependent testing setups, respectively.
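    For concreteness, the "overall F1-score" across the five phases can be read as an average of per-phase F1 values; the abstract does not spell out the averaging scheme, so the macro-average and the toy labels below are assumptions. A minimal scikit-learn sketch:

```python
import numpy as np
from sklearn.metrics import f1_score

PHASES = ["rotate", "in position", "APAs", "drop", "recover"]

# Toy stand-in labels; real data would be per-window phase annotations.
rng = np.random.default_rng(1)
y_true = rng.integers(0, len(PHASES), size=200)
noise = rng.integers(0, len(PHASES), size=200)
y_pred = np.where(rng.random(200) < 0.9, y_true, noise)  # ~90% agreement

per_phase = f1_score(y_true, y_pred, average=None)   # one F1 per phase
overall = f1_score(y_true, y_pred, average="macro")  # unweighted mean
for name, score in zip(PHASES, per_phase):
    print(f"{name:12s} F1 = {score:.3f}")
print(f"overall (macro) F1 = {overall:.3f}")
```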

    Can my chip behave like my brain?

    Many decades ago, Carver Mead established the foundations of neuromorphic systems. Neuromorphic systems are analog circuits that emulate biology, utilizing the subthreshold dynamics of CMOS transistors to mimic the behavior of neurons. The objective is not only to simulate the human brain but also to build useful applications with these bio-inspired circuits, such as ultra-low-power speech processing, image processing, and robotics. This can be achieved using reconfigurable hardware, like field-programmable analog arrays (FPAAs), which enable configuring different applications on a cross-platform system. As digital systems saturate in terms of power efficiency, this alternate approach has the potential to improve computational efficiency by approximately eight orders of magnitude. These systems combine analog, digital, and neuromorphic elements into a very powerful reconfigurable processing machine.
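    The subthreshold (weak-inversion) regime Mead exploited gives an exponential voltage-to-current law, the same functional form as the Boltzmann statistics governing ion channels, which is why a handful of transistors can mimic a neuron's conductances. A minimal numeric sketch with typical, illustrative parameter values:

```python
import numpy as np

# Subthreshold saturation current of an nMOS transistor:
#   I = I0 * exp(kappa * Vgs / Ut)
# The exponential voltage-to-current relation is what lets a
# transistor stand in for an ion channel's activation curve.
I0 = 1e-15      # pre-factor (A); device-dependent, illustrative
kappa = 0.7     # gate coupling coefficient, typically ~0.6-0.8
Ut = 0.0258     # thermal voltage kT/q at ~300 K (V)

for Vgs in np.linspace(0.0, 0.4, 5):
    I = I0 * np.exp(kappa * Vgs / Ut)
    print(f"Vgs = {Vgs:.2f} V -> I = {I:.3e} A")
```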

    Neuromorphic Models of the Amygdala with Applications to Spike Based Computing and Robotics

    Computational neural simulations do not match the functionality and operation of the brain processes they attempt to model. This gap exists due to both our incomplete understanding of brain function and the technological limitations of computers. Moreover, given that the shrinking of transistors has reached its physical limit, fundamentally different computing paradigms are needed to help bridge this gap. Neuromorphic hardware technologies abstract the form of brain function to provide a computational solution post-Moore’s law, and neuromorphic algorithms provide software frameworks to increase biological plausibility within neural models. This dissertation focuses on utilizing neuromorphic frameworks to better understand how the brain processes social and emotional stimuli. It describes the creation of a spiking-neuron computational model of the amygdala, the brain region behind our social interactions, the simulation of the model using brain-inspired computer hardware, and implementations of other spike-based computations on such hardware. Although scientists agree that the amygdala is the main component of the social brain, few models exist to explain amygdala function beyond “fight or flight”. This model incorporates neuroscientists’ more nuanced understanding of the amygdala and is validated by comparing the neural responses measured from the model to responses measured in primate amygdalae under the same experimental conditions. The model will inform future physiological experiments, which will generate deeper neuroscientific insights, which will in turn allow for better neural models. Repeated iteratively, this positive feedback loop, in which better models beget better understanding of biology and vice versa, will help close the gap between the computer and the brain. The computer networks and hardware that emerge from this process have the potential to achieve higher computing efficiency, approaching or perhaps surpassing the efficiency of the human brain; to provide the foundation for new approaches to artificial intelligence and machine learning within a spike-based computing paradigm; and to widen our understanding of brain function.
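    The dissertation builds its amygdala model from spiking neurons; as a generic stand-in (not the model's actual equations or parameters), here is a minimal leaky integrate-and-fire neuron, the common building block of such spike-based simulations:

```python
import numpy as np

def lif_neuron(I, dt=1e-3, tau=0.02, v_rest=-0.065,
               v_thresh=-0.050, v_reset=-0.065, R=1e7):
    """Leaky integrate-and-fire: tau * dV/dt = -(V - v_rest) + R * I.

    Returns spike times (s) for an input current trace I (A),
    sampled every dt seconds. All parameter values are illustrative.
    """
    v = v_rest
    spikes = []
    for t, i_t in enumerate(I):
        v += (dt / tau) * (-(v - v_rest) + R * i_t)  # Euler integration
        if v >= v_thresh:                            # threshold crossing
            spikes.append(t * dt)
            v = v_reset                              # reset after the spike
    return spikes

I = np.full(1000, 2e-9)                 # 2 nA step current for 1 s
print(len(lif_neuron(I)), "spikes")     # steady firing above threshold
```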

    Computer Science & Technology Series : XIX Argentine Congress of Computer Science. Selected papers

    CACIC’13 was the nineteenth Congress in the CACIC series. It was organized by the Department of Computer Systems at CAECE University in Mar del Plata. The Congress included 13 workshops with 165 accepted papers, 5 conferences, 3 invited tutorials, various meetings related to computer science education (professors, PhD students, curricula), and an international school with 5 courses. CACIC 2013 followed the traditional Congress format, with 13 workshops covering a diversity of dimensions of computer science research. Each topic was supervised by a committee of 3 to 5 chairs from different universities. The call for papers attracted a total of 247 submissions. An average of 2.5 review reports were collected for each paper, for a grand total of 676 review reports involving about 210 different reviewers. A total of 165 full papers, involving 489 authors and 80 universities, were accepted, and 25 of them were selected for this book. Red de Universidades con Carreras en Informática (RedUNCI).

    Artificial Intelligence Technology

    This open access book aims to give readers a basic outline of today’s research and technology developments in artificial intelligence (AI), to help them gain a general understanding of this trend, and to familiarize them with the current research hotspots as well as some of the fundamental and common theories and methodologies that are widely accepted in AI research and application. The book is written in comprehensible, plain language, featuring clearly explained theories and concepts and extensive analysis and examples. Some traditional findings are omitted from the narrative while still giving a relatively comprehensive introduction to the evolution of artificial intelligence technology. The book provides a detailed elaboration of the basic concepts of AI and machine learning, as well as other relevant topics, including deep learning, deep learning frameworks, the Huawei MindSpore AI development framework, the Huawei Atlas computing platform, the Huawei AI open platform for smart terminals, and the Huawei CLOUD Enterprise Intelligence application platform. As the world’s leading provider of ICT (information and communication technology) infrastructure and smart terminals, Huawei offers products ranging from digital data communication, cyber security, wireless technology, data storage, cloud computing, and smart computing to artificial intelligence.

    Design-Space Exploration of Tightly Coupled Parallel Computer Architectures

    Sievers G. Entwurfsraumexploration eng gekoppelter paralleler Rechnerarchitekturen. Bielefeld: Universität Bielefeld; 2016. Embedded microelectronic systems are used in many areas of everyday life. Integrating an ever-larger number of processor cores on a single microchip (on-chip multiprocessor, MPSoC) allows the computing performance and resource efficiency of these systems to be increased. The Cognitronics and Sensor Systems group (AG Kognitronik und Sensorik) at Bielefeld University is developing the CoreVA-MPSoC, which couples resource-efficient VLIW processor cores via a hierarchical interconnect. Tightly coupling several processor cores within a cluster enables high-bandwidth, low-latency communication. The main contribution of this work is the development and design-space exploration of a resource-efficient CPU cluster for use in the CoreVA-MPSoC. Abstract modeling of the hardware and software components of the CPU cluster, together with a highly automated design flow, enables rapid analysis of a large design space. The design-space exploration investigates different topologies, bus standards, and memory architectures. In particular, the interplay between the hardware architecture, the programming model, and synchronization is crucial for high resource efficiency and for good exploitation of the available computing performance by the application developer. To this end, a block-based synchronization method tailored to the hardware architecture is presented. This method is used by compilers for the languages StreamIt, C, and OpenCL to map applications onto different configurations of the CPU cluster. Nine representative streaming applications mapped onto a cluster with 16 CPUs show an average speedup of 13.3 compared with execution on a single CPU. In addition, a tightly coupled shared L1 data memory with multiple memory banks is integrated into the CPU cluster, allowing all CPUs low-latency access. Furthermore, the use of different instruction memories and instruction caches is evaluated, and the energy requirements of communication and synchronization within the CPU cluster are examined. This work shows that a cluster with 16 CPU cores represents a good compromise between the area requirements of the cluster interconnect and the performance of the cluster. A CPU cluster with 16 two-slot VLIW CPUs and a total of 512 kB of memory occupies 2.63 mm² in a prototype implementation in a 28-nm FD-SOI standard-cell library. At a clock frequency of 760 MHz, the average power consumption is 440 mW. FPGA-based emulation on a Xilinx Virtex-7 FPGA allows the evaluation of a CoreVA-MPSoC with up to 24 CPUs at a maximum clock frequency of up to 124 MHz. As a further application scenario, a CoreVA-MPSoC with up to four CPUs is mapped onto the FPGA of the autonomous mini-robot AMiRo.
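    To put the reported figures in perspective: a speedup of 13.3 on 16 CPUs is a parallel efficiency of about 83%, and under an Amdahl's-law reading (an assumption; the thesis reports measured speedups, not a serial-fraction model) it is consistent with roughly 1.4% non-parallelizable work:

```python
# Parallel efficiency and implied serial fraction for the reported
# average speedup of 13.3x on a 16-CPU cluster.
p = 16
speedup = 13.3

efficiency = speedup / p                  # ~0.83
# Amdahl's law: speedup = 1 / (s + (1 - s) / p); solve for s.
s = (p / speedup - 1) / (p - 1)           # ~0.014
print(f"efficiency = {efficiency:.1%}, implied serial fraction = {s:.1%}")
```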
