
    A Streaming High-Throughput Linear Sorter System with Contention Buffering

    Popular sorting algorithms do not translate well into hardware implementations. Instead, hardware-based solutions such as sorting networks, systolic sorters, and linear sorters exploit parallelism to increase sorting efficiency. Linear sorters, built from identical nodes with simple control, have lower area and latency than sorting networks, but they are limited in throughput. We present a system composed of multiple linear sorters acting in parallel to increase overall throughput. Interleaving is used to increase bandwidth and allow sorting of multiple values per clock cycle, and the amount of interleaving and the depth of the linear sorters can be adapted to suit specific applications. Contention for available linear sorters in the system is resolved with buffers that accumulate conflicting requests and dispatch them in bulk to reduce latency penalties. Implementing the system on a field-programmable gate array (FPGA) yields a speedup of 68 over a MicroBlaze processor running quicksort.
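
    The authors' design is FPGA hardware, but the interleaving and contention-buffering ideas can be illustrated with a small behavioral model in software. The sketch below is not the paper's RTL: the class names, the modulo interleaving function, and the one-value-per-cycle buffer drain are illustrative simplifications.

```python
from collections import deque

class LinearSorter:
    """Behavioral model of one linear sorter: a fixed-depth sorted list
    built from identical insertion nodes (one comparison per node)."""
    def __init__(self, depth):
        self.depth = depth
        self.cells = []          # kept in sorted order, like the hardware cells

    def insert(self, value):
        # Each hardware node compares the incoming value with its cell and
        # shifts if needed; here we model the end result of that pass.
        pos = next((i for i, v in enumerate(self.cells) if value < v), len(self.cells))
        self.cells.insert(pos, value)
        if len(self.cells) > self.depth:
            self.cells.pop()     # overflow falls off the end of the chain

class InterleavedSorterSystem:
    """Several linear sorters in parallel; an interleaving function picks the
    target sorter, and per-sorter buffers absorb colliding requests."""
    def __init__(self, num_sorters=4, depth=16):
        self.sorters = [LinearSorter(depth) for _ in range(num_sorters)]
        self.buffers = [deque() for _ in range(num_sorters)]

    def accept(self, values):
        # One "clock cycle": route every incoming value to its sorter's
        # buffer; contention (two values for the same sorter) simply queues.
        for v in values:
            self.buffers[v % len(self.sorters)].append(v)
        # Each sorter drains at most one buffered value per cycle.
        for buf, srt in zip(self.buffers, self.sorters):
            if buf:
                srt.insert(buf.popleft())

# Example: feed two values per cycle into a 4-way interleaved system.
system = InterleavedSorterSystem(num_sorters=4, depth=8)
for cycle in range(0, 20, 2):
    system.accept([cycle * 7 % 31, cycle * 5 % 29])
print([s.cells for s in system.sorters])
```

    In the paper the buffered requests are dispatched in bulk to reduce latency penalties; the model above drains one value per cycle purely to keep the example short.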

    Compact GF(2) systemizer and optimized constant-time hardware sorters for Key Generation in Classic McEliece

    Classic McEliece is a code-based quantum-resistant public-key scheme characterized by relatively high encapsulation/decapsulation speed and small ciphertexts, with an in-depth analysis of its security. However, slow key generation and a large public key make it hard to deploy in wider applications. Based on this observation, a high-throughput hardware key generator is proposed to accelerate key generation in Classic McEliece through algorithm-hardware co-design, while the storage overhead caused by the large keys is also minimized. First, a compact large-size GF(2) Gaussian elimination is presented, adopting a naive processing array, early abort based on singular-matrix detection, and a memory-friendly scheduling strategy. Second, an optimized constant-time hardware sorter is proposed to support regular memory accesses with fewer comparators and less storage. Third, an algorithm-level pipeline enables high-throughput processing, allowing concurrent key generation by decoupling data access from computation.
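
    The early-abort idea in the GF(2) systemizer can be illustrated with a plain software sketch of Gaussian elimination over bit-packed rows: elimination stops as soon as a column has no available pivot, which is exactly the singular case in which key generation must restart with a new candidate. This is only an algorithmic illustration, not the paper's processing-array hardware; the row encoding and function name are assumptions made for the example.

```python
def systemize_gf2(rows, ncols):
    """Reduce a GF(2) matrix to reduced row-echelon (systematic) form.
    Each row is packed into a Python int, with bit i holding column i.
    Returns the reduced rows, or None when the left block is singular."""
    rows = list(rows)
    nrows = len(rows)
    for col in range(min(nrows, ncols)):
        mask = 1 << col
        # Look for a pivot row that has a 1 in this column.
        pivot = next((r for r in range(col, nrows) if rows[r] & mask), None)
        if pivot is None:
            return None            # singular matrix: abort early, caller retries
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # Clear this column in every other row (XOR is addition in GF(2)).
        for r in range(nrows):
            if r != col and rows[r] & mask:
                rows[r] ^= rows[col]
    return rows

# Tiny 3x6 example; a real McEliece systemizer works on far larger matrices.
m = [0b101001, 0b110010, 0b011100]
reduced = systemize_gf2(m, ncols=6)
# format() prints the most-significant bit first, i.e. column 5 on the left.
print([format(r, "06b") for r in reduced] if reduced else "singular, retry with a new candidate")
```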

    Embedded Machine Learning: Emphasis on Hardware Accelerators and Approximate Computing for Tactile Data Processing

    Machine Learning (ML), a subset of Artificial Intelligence (AI), is driving the industrial and technological revolution of the present and future. We envision a world with smart devices that are able to mimic human behavior (sense, process, and act) and perform tasks that we once thought could only be carried out by humans. The vision is to achieve such a level of intelligence with affordable, power-efficient, and fast hardware platforms. However, embedding machine learning algorithms in many application domains such as the internet of things (IoT), prostheses, robotics, and wearable devices is an ongoing challenge, one governed by the computational complexity of ML algorithms, the performance and availability of hardware platforms, and the application's budget (power constraints, real-time operation, etc.). In this dissertation, we focus on the design and implementation of efficient ML algorithms to handle these challenges. First, we apply Approximate Computing Techniques (ACTs) to reduce the computational complexity of ML algorithms. Then, we design custom hardware accelerators to improve the performance of the implementation within a specified budget. Finally, a tactile data processing application is adopted to validate the proposed exact and approximate embedded machine learning accelerators. The dissertation starts with an introduction to the various ML algorithms used for tactile data processing. These algorithms are assessed in terms of their computational complexity and the hardware platforms available for implementation. Afterward, a survey of existing approximate computing techniques and hardware accelerator design methodologies is presented. Based on the findings of the survey, an approach for applying algorithmic-level ACTs to machine learning algorithms is provided. Then three novel hardware accelerators are proposed: (1) k-Nearest Neighbor (kNN) based on a selection-based sorter, (2) Tensorial Support Vector Machine (TSVM) based on Shallow Neural Networks, and (3) Hybrid Precision Binary Convolution Neural Network (BCNN). The three accelerators offer real-time classification with monumental reductions in hardware resources and power consumption compared to existing implementations targeting the same tactile data processing application on FPGA. Moreover, the approximate accelerators maintain high classification accuracy with a loss of at most 5%.
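
    The kNN accelerator is built around a selection-based sorter, i.e. the classifier only needs the k smallest distances, not a fully sorted list. A minimal software sketch of that idea is shown below; it is not the dissertation's FPGA architecture, and the function name, the bounded-heap selection, and the touch/no-touch labels are illustrative assumptions.

```python
import heapq

def knn_predict(train_x, train_y, query, k=3):
    """kNN classification using selection instead of a full sort:
    keep only the k best distances in a bounded max-heap."""
    best = []                                  # heap of (-distance, label), size <= k
    for x, y in zip(train_x, train_y):
        d = sum((a - b) ** 2 for a, b in zip(x, query))    # squared Euclidean distance
        if len(best) < k:
            heapq.heappush(best, (-d, y))
        elif -d > best[0][0]:                  # closer than the worst retained neighbor
            heapq.heapreplace(best, (-d, y))
    labels = [y for _, y in best]              # majority vote over the k survivors
    return max(set(labels), key=labels.count)

# Tiny 2-D example standing in for tactile feature vectors.
train_x = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
train_y = ["touch", "touch", "touch", "no-touch", "no-touch", "no-touch"]
print(knn_predict(train_x, train_y, query=(0.5, 0.5), k=3))   # -> "touch"
```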

    USSR Space Life Sciences Digest, issue 6

    This is the sixth issue of NASA's USSR Space Life Sciences Digest. It contains abstracts of 54 papers recently published in Russian-language periodicals and bound collections and of 10 new Soviet monographs. Selected abstracts are illustrated with figures and tables from the original. Additional features include a table of Soviet EVAs and information about English translations of Soviet materials available to readers. The topics covered in this issue have been identified as relevant to 26 areas of aerospace medicine and space biology. These areas are adaptation, biospherics, body fluids, botany, cardiovascular and respiratory systems, developmental biology, endocrinology, enzymology, exobiology, genetics, habitability and environment effects, health and medical treatment, hematology, human performance, immunology, life support systems, mathematical modeling, metabolism, microbiology, morphology and cytology, musculoskeletal system, neurophysiology, nutrition, perception, personnel selection, psychology, radiobiology, reproductive biology, and space medicine.

    Co-Benefits of Largescale Organic farming On huMan health (BLOOM): Protocol for a cluster-randomised controlled evaluation of the Andhra Pradesh Community-managed Natural Farming programme in India

    The BLOOM study (co-Benefits of Largescale Organic farming On huMan health) aims to determine whether a government-implemented agroecology programme reduces pesticide exposure and improves dietary diversity in agricultural households. To achieve this aim, a community-based, cluster-randomised controlled evaluation of the Andhra Pradesh Community-managed Natural Farming (APCNF) programme will be conducted in 80 clusters (40 intervention and 40 control) across four districts of Andhra Pradesh state in south India. Approximately 34 households per cluster will be randomly selected for screening and enrolment into the evaluation at baseline. The two primary outcomes, measured 12 months after the baseline assessment, are urinary pesticide metabolites in a 15% random subsample of participants and dietary diversity in all participants. Both primary outcomes will be measured in (1) adult men ≥18 years old, (2) adult women ≥18 years old, and (3) children <38 months old at enrolment. Secondary outcomes measured in the same households include crop yields, household income, adult anthropometry, anaemia, glycaemia, kidney function, musculoskeletal pain, clinical symptoms, depressive symptoms, women's empowerment, and child growth and development. Analysis will be on an intention-to-treat basis, with an a priori secondary analysis to estimate the per-protocol effect of APCNF on the outcomes. The BLOOM study will provide robust evidence of the impact of a large-scale, transformational, government-implemented agroecology programme on pesticide exposure and dietary diversity in agricultural households. It will also provide the first evidence of the nutritional, developmental, and health co-benefits of adopting agroecology, inclusive of malnourishment as well as common chronic diseases.

    A UNIFIED HARDWARE/SOFTWARE PRIORITY SCHEDULING MODEL FOR GENERAL PURPOSE SYSTEMS

    Migrating functionality from software to hardware has historically held the promise of enhancing performance by exploiting the inherent parallel nature of hardware. Many early exploratory efforts in repartitioning traditional software-based services into hardware were hampered by expensive ASIC development costs. Recent advancements in FPGA technology have made it more economically feasible to explore migrating functionality across the hardware/software boundary. The flexibility of the FPGA fabric and the availability of configurable soft IP components have opened the potential to rapidly and economically investigate different hardware/software partitions. Within the real-time operating systems community, there has been continued interest in applying hardware/software co-design approaches to address scheduling issues such as latency and jitter. Many hardware-based approaches have been reported to reduce the latency of computing the scheduling decision function itself. However, continued adherence to classic scheduler invocation mechanisms can still allow variable latencies to creep into the time taken to make the scheduling decision, and ultimately into application timelines. This dissertation explores how hardware/software co-design can be applied beyond the scheduling decision itself to also reduce the non-predictable delays associated with interrupts and timers. By expanding the window of hardware/software co-design to these invocation mechanisms, we seek to understand whether the jitter introduced by classical hardware/software partitionings can be removed from the timelines of critical real-time user processes. This dissertation makes a case for resetting the classic boundaries of software thread-level scheduling, software timers, hardware timers, and interrupts. We show that reworking the boundaries of the scheduling invocation mechanisms helps to rectify the current imbalance between traditional hardware invocation mechanisms (timers and interrupts) and software scheduling policy (the operating system scheduler). We re-factor these mechanisms into a unified hardware/software priority scheduling model to facilitate improvements in performance, timeliness, and determinism in all domains of computing. This dissertation demonstrates and prototypes a new framework that effects this basic policy change. The advantage of this approach lies in its ability to unify, simplify, and allow for more control within the operating system's scheduling policy.
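
    The unified model is prototyped in FPGA hardware, but its core policy change can be pictured with a small software analogy: timer expirations and interrupts are folded into one priority queue, so the dispatch decision depends only on priority rather than on when the software scheduler happens to be invoked. The sketch below is that analogy only, with hypothetical names, and is not the dissertation's framework.

```python
import heapq
import itertools

class UnifiedPriorityScheduler:
    """Toy unified scheduling model: timers and interrupts both feed one
    priority queue; dispatch always returns the highest-priority ready
    thread (lower number = higher priority)."""
    def __init__(self):
        self.ready = []                  # heap of (priority, seq, thread)
        self.timers = []                 # heap of (deadline, priority, thread)
        self.seq = itertools.count()     # FIFO tie-breaker for equal priorities

    def post_interrupt(self, priority, thread):
        heapq.heappush(self.ready, (priority, next(self.seq), thread))

    def arm_timer(self, deadline, priority, thread):
        heapq.heappush(self.timers, (deadline, priority, thread))

    def tick(self, now):
        # Expired timers become ready events at their own priority, not at
        # the priority of whatever happened to be running when they fired.
        while self.timers and self.timers[0][0] <= now:
            _, prio, thread = heapq.heappop(self.timers)
            self.post_interrupt(prio, thread)
        return heapq.heappop(self.ready)[2] if self.ready else None

# A low-priority interrupt arrives first, then a high-priority timer expires.
sched = UnifiedPriorityScheduler()
sched.post_interrupt(priority=5, thread="logger")
sched.arm_timer(deadline=10, priority=1, thread="control-loop")
print(sched.tick(now=10))   # -> control-loop (priority decides, not arrival order)
print(sched.tick(now=10))   # -> logger
```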

    Realization of a Self-triggered Detector for the Radio Emission of Cosmic Rays

    When an ultra-high-energy cosmic ray (UHECR) enters the atmosphere, it emits a radio pulse which can be used for detection. The main problem of this new detection method is the poor signal-to-noise ratio. This thesis describes the development of an adaptive digital filter to remove radio-frequency interference (RFI) in real time, followed by a self-trigger to reject transient noise. The system was implemented in FPGA hardware, tested, and commissioned as part of the AERA experiment at the Pierre Auger Observatory.
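
    The filter and trigger live in FPGA logic in the thesis; purely as an illustration of the two stages, the sketch below uses an LMS adaptive line enhancer to suppress narrowband RFI and a simple RMS threshold as the self-trigger. The specific filter structure, parameters, and threshold rule are assumptions for the example, not the thesis's design.

```python
import numpy as np

def lms_rfi_suppress(x, taps=32, mu=5e-3, delay=16):
    """Adaptive line enhancer: predict the narrowband RFI from delayed
    samples with an LMS filter and subtract it; the broadband radio
    pulse is unpredictable at this delay and survives in the output."""
    w = np.zeros(taps)
    clean = np.zeros_like(x)
    for n in range(taps + delay, len(x)):
        ref = x[n - delay - taps:n - delay][::-1]   # delayed tap vector
        e = x[n] - w @ ref                          # prediction error = cleaned sample
        w += 2 * mu * e * ref                       # LMS weight update
        clean[n] = e
    return clean

def self_trigger(clean, k=6.0):
    """Flag samples exceeding k times the RMS of the cleaned trace."""
    rms = np.sqrt(np.mean(clean ** 2)) + 1e-12
    return np.flatnonzero(np.abs(clean) > k * rms)

# Sinusoidal RFI plus noise, with one short "radio pulse" at sample 1500.
rng = np.random.default_rng(0)
t = np.arange(4000)
x = 0.8 * np.sin(2 * np.pi * 0.05 * t) + 0.05 * rng.standard_normal(t.size)
x[1500:1504] += 1.0
clean = lms_rfi_suppress(x)
hits = 500 + self_trigger(clean[500:])   # skip the initial adaptation transient
print(hits)                              # indices clustered around sample 1500
```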

    Cooperative Partial Detection for MIMO Relay Networks

    This paper was submitted by the author prior to the final official version. For the official version, please see http://hdl.handle.net/1911/64372
    Cooperative communication has recently re-emerged as a possible paradigm shift to realize the promises of the ever-increasing wireless communication market; however, there have been few, if any, studies translating theoretical results into feasible schemes with their particular practical challenges. The multiple-input multiple-output (MIMO) technique is another method that has recently been employed in different standards and protocols, often as an optional scenario, to further improve the reliability and data rate of different wireless communication applications. In this work, we look into possible methods and algorithms for combining these two techniques to take advantage of the benefits of both. In this thesis, we will consider methods that account for the limitations of practical solutions, which, to the best of our knowledge, have not previously been considered in this context. We will present complexity reduction techniques for MIMO systems in cooperative systems. Furthermore, we will present architectures for flexible and configurable MIMO detectors. These architectures can support a range of data rates, modulation orders, and numbers of antennas, and are therefore crucial in the different nodes of cooperative systems. The breadth-first search employed in our realization presents a large opportunity to exploit the parallelism of the FPGA in order to achieve high data rates. Algorithmic modifications to address potential sequential bottlenecks in the traditional breadth-first-search-based sphere decoder (SD) are highlighted in the thesis. We will present a novel Cooperative Partial Detection (CPD) approach for MIMO relay channels, where instead of applying conventional full detection at the relay, the relay performs a partial detection and forwards the detected parts of the message to the destination. We will demonstrate how this approach allows the relay to control its complexity and to choose how much it is willing to cooperate based on its available resources. We will discuss the complexity implications of this method and, more importantly, present hardware verification and over-the-air experimentation of CPD using the Wireless Open-access Research Platform (WARP).
    Funding: NSF grants EIA-0321266, CCF-0541363, CNS-0551692, CNS-0619767, EECS-0925942, and CNS-0923479; Nokia; Xilinx; Nokia Siemens Networks; Texas Instruments; and Azimuth Systems.
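
    Partial detection at the relay can be pictured as running only the first few levels of a breadth-first (K-best) tree search over the transmitted symbol vector and forwarding the surviving partial candidates to the destination, which finishes the remaining levels. The sketch below is a generic textbook-style K-best detector over a toy real-valued channel, not the thesis's WARP implementation; the function name, the BPSK constellation, and the K and level choices are assumptions for the example.

```python
import numpy as np

def kbest_partial_detect(y, H, constellation, K=4, levels=None):
    """Breadth-first (K-best) detection of y = H s + n.  'levels' caps how
    many antennas are detected; a relay doing partial detection would stop
    early and forward its K surviving partial candidates."""
    Q, R = np.linalg.qr(H)            # y = H s + n  ->  Q^T y = R s + Q^T n
    z = Q.T @ y
    n = H.shape[1]
    levels = n if levels is None else levels
    candidates = [(0.0, [])]          # (accumulated metric, detected symbols)
    for depth in range(levels):       # detect antenna n-1, then n-2, ...
        i = n - 1 - depth
        expanded = []
        for metric, syms in candidates:
            for s in constellation:
                interference = sum(R[i, i + 1 + j] * syms[j] for j in range(len(syms)))
                m = metric + (z[i] - R[i, i] * s - interference) ** 2
                expanded.append((m, [s] + syms))
        candidates = sorted(expanded)[:K]     # breadth-first: keep the K best branches
    return candidates

# Toy 4x4 real-valued channel with BPSK symbols; the relay detects 2 of 4 levels.
rng = np.random.default_rng(1)
H = rng.standard_normal((4, 4))
s_true = np.array([1, -1, -1, 1])
y = H @ s_true + 0.05 * rng.standard_normal(4)
survivors = kbest_partial_detect(y, H, constellation=(-1, 1), K=4, levels=2)
print(survivors[0])   # lowest-metric partial candidate for the last two antennas
```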

    Characterization and Evaluation of a Novel Tissue Engineered Aortic Heart Valve Construct

    Tissue engineering holds great promise for the treatment of valvular diseases. Scaffolds for engineered heart valves must function immediately after implantation, but must also permit repopulation with autologous host cells and facilitate gradual remodeling. Native aortic heart valves are composed of three layers: two strong external fibrous layers (the ventricularis and fibrosa) separated by a central, highly hydrated spongiosa. The fibrous layers provide strength and resilience, while the spongiosa layer facilitates shearing of the external layers. Our working hypothesis is that partially cross-linked collagen scaffolds that closely mimic the layered histo-architecture of the native valve would fulfill these requirements. To test this hypothesis we have developed heart valve-shaped tri-layered constructs based on collagen, the major structural component of natural heart valves. We describe here the development and characterization of two types of scaffolds, namely fibrous scaffolds prepared from decellularized porcine pericardium and spongiosa scaffolds from elastase-treated decellularized pulmonary arteries. Fibrous scaffolds were cross-linked with penta-galloyl glucose (PGG) to control remodeling. In order to assemble the scaffolds into a 3D valve structure and form the tri-layered leaflets, we developed a bio-adhesive consisting of mixtures of bovine serum albumin and glutaraldehyde (BTglue) and an efficient method to reduce aldehyde toxicity. Glued fibrous scaffolds were tested in vitro for biocompatibility (cell culture) and degradation (collagenase and proteinase K digestion). Tri-layered constructs were also tested for in vivo biocompatibility, cell repopulation, and calcification. In current studies, we have confirmed that scaffolds glued with BTglue were non-cytotoxic, with living cells spread across the entire surface of the BTglue test area and cells growing directly onto the glued surfaces. With the long-term aim of our studies being to create anatomically correct scaffolds to be used as personalized constructs for heart valve tissue engineering, we created silicone molds from porcine aortic heart valves and then modeled decellularized porcine pericardium into anatomically correct scaffolds. After drying in their molds, the scaffolds acquired the shape of the aortic valve, which could then be preserved by exposure to PGG. After inserting decellularized pulmonary artery between the fibrous scaffolds to mimic the spongiosa layer, functionality testing of the heart valve-shaped scaffolds in a custom-made bioreactor showed good leaflet coaptation upon closure and good opening characteristics. Stem cell-seeded scaffolds also showed cellular differentiation into valvular interstitial-like cells (VICs) in similar bioreactor studies. Future studies are needed to perfect the assembly process of the tri-layered construct. Additionally, further evaluation of stem cell differentiation is needed to confirm the presence of VICs in the aortic valve. If successful, this approach of layering collagenous scaffolds into tri-layered constructs that mimic the structure of the native aortic heart valve holds promise for the future of heart valve tissue engineering.