112 research outputs found

    Real Time Signal Processing Using Systolic Arrays

    Get PDF
    This thesis discusses and presents the design of systolic arrays used in modern real time signal processing. A methodology to map a given algorithm into a systolized VLSI implementation is described. The architectural alternatives for a given signal processing algorithm are discussed and investigated at a function level using a simulation package that has been developed using the “C” programming language. The similarities and differences between wavefront array processors and systolic array processors are presented

    Hardware Accelerator for MIMO Wireless Systems

    Get PDF
    Ever increasing demand for higher data rates and better Quality of Service (QoS) for a growing number of users requires new transceiver algorithms and architectures to better exploit the available spectrum and to efficiently counter the impairments of the radio channel. Multiple-Input Multiple-Output (MIMO) communication systems employ multiple antennas at both transmitter and at the receiver to meet the requirements of next-generation wireless systems. It is a promising technology to provide increased data rates while not involving an equivalent increase in the spectral requirements. However, practical implementation of MIMO detectors poses a significant challenge and has been consistently identified as the major bottleneck for realizing the full potential that multiple antenna systems promise. Furthermore, in order to make judicious use of the available bandwidth, the baseband units have to dynamically adapt to different modes (modulation schemes, code rates etc) of operations. Flexibility and high throughput requirements often place conflicting demands on the Very Large Scale Integration (VLSI) system designer. The major focus of this dissertation is to present efficient VLSI architectures for configurable MIMO detectors that can serve as accelerators to enable the realization of next generation wireless devices feasible. Both, hard output and soft output detector architectures are considered

    A Comprehensive Methodology for Algorithm Characterization, Regularization and Mapping Into Optimal VLSI Arrays.

    Get PDF
    This dissertation provides a fairly comprehensive treatment of a broad class of algorithms as it pertains to systolic implementation. We describe some formal algorithmic transformations that can be utilized to map regular and some irregular compute-bound algorithms into the best fit time-optimal systolic architectures. The resulted architectures can be one-dimensional, two-dimensional, three-dimensional or nonplanar. The methodology detailed in the dissertation employs, like other methods, the concept of dependence vector to order, in space and time, the index points representing the algorithm. However, by differentiating between two types of dependence vectors, the ordering procedure is allowed to be flexible and time optimal. Furthermore, unlike other methodologies, the approach reported here does not put constraints on the topology or dimensionality of the target architecture. The ordered index points are represented by nodes in a diagram called Systolic Precedence Diagram (SPD). The SPD is a form of precedence graph that takes into account the systolic operation requirements of strictly local communications and regular data flow. Therefore, any algorithm with variable dependence vectors has to be transformed into a regular indexed set of computations with local dependencies. This can be done by replacing variable dependence vectors with sets of fixed dependence vectors. The SPD is transformed into an acyclic, labeled, directed graph called the Systolic Directed Graph (SDG). The SDG models the data flow as well as the timing for the execution of the given algorithm on a time-optimal array. The target architectures are obtained by projecting the SDG along defined directions. If more than one valid projection direction exists, different designs are obtained. The resulting architectures are then evaluated to determine if an improvement in the performance can be achieved by increasing PE fan-out. If so, the methodology provides the corresponding systolic implementation. By employing a new graph transformation, the SDG is manipulated so that it can be mapped into fixed-size and fixed-depth multi-linear arrays. The latter is a new concept of systolic arrays that is adaptable to changes in the state of technology. It promises a bonded clock skew, higher throughput and better performance than the linear implementation

    Mixed integer programming on transputers

    Get PDF
    Mixed Integer Programming (MIP) problems occur in many industries and their practical solution can be challenging in terms of both time and effort. Although faster computer hardware has allowed the solution of more MIP problems in reasonable times, there will come a point when the hardware cannot be speeded up any more. One way of improving the solution times of MIP problems without further speeding up the hardware is to improve the effectiveness of the solution algorithm used. The advent of accessible parallel processing technology and techniques provides the opportunity to exploit any parallelism within MIP solving algorithms in order to accelerate the solution of MIP problems. Many of the MIP problem solving algorithms in the literature contain a degree of exploitable parallelism. Several algorithms were considered as candidates for parallelisation within the constraints imposed by the currently available parallel hardware and techniques. A parallel Branch and Bound algorithm was designed for and implemented on an array of transputers hosted by a PC. The parallel algorithm was designed to operate as a process farm, with a master passing work to various slave processors. A message-passing harness was developed to allow full control of the slaves and the work sent to them. The effects of using various node selection techniques were studied and a default node selection strategy decided upon for the parallel algorithm. The parallel algorithm was also designed to take full advantage of the structure of MIP problems formulated using global entities such as general integers and special ordered sets. The presence of parallel processors makes practicable the idea of performing more than two branches on an unsatisfied global entity. Experiments were carried out using multiway branching strategies and a default branching strategy decided upon for appropriate types of MIP problem

    Comparison of Potential Salmonella Portals of Entry and Tissue Distribution Following Challenge of Poultry

    Get PDF
    The following studies evaluated our hypothesis that transmission by the fecal-respiratory route may be a viable portal of entry for Salmonella and could explain some clinical impressions of relatively low-dose infectivity under field conditions in relation to the requisite high oral challenge dose that is typically required for infection of poultry through the oral route in laboratory studies. Initial field reports indicating tracheal sampling to be a sensitive tool for monitoring Salmonella infection in commercial flocks, suggested that tracheal contamination could be a good indicator of Salmonella infection under commercial conditions. Further, a usual assumption regarding airborne Salmonella reaching the upper respiratory tract, would ultimately involve oral ingestion, due to the presence of the mucociliary clearance was evaluated. Suspension in 1% mucin failed to increase the infectivity at any dose of Salmonella when compared to OR administration without mucin and intratracheal (IT) challenge, which was also recovered from lung tissue. IT administration was more effective or at least as effective at colonizing the ceca of 7d chickens, suggesting that the respiratory tract may be an overlooked potential portal of entry for Salmonellae. Finally, the hypothesis was evaluated through IT administration of Salmonella, in comparison with oral administration. A significantly higher or equivalent cecal recovery of Salmonella, with a clear dose response curve, with the IT groups as compared to groups challenged OR, added further support to the hypothesis. Both the cecal CFU recovery data and organ invasion incidence data from these experiments provided evidence for the subsequent fate of Salmonella exposed to the respiratory system, potentially involving a systemic route. Overall, our data suggests that the respiratory route might be a viable portal of entry for Salmonella in poultry. Clarification of the potential importance of the respiratory tract for Salmonella transmission under field conditions may be of critical importance as efforts to develop intervention strategies to reduce transmission of these pathogens in poultry continue

    Biosensors for Diagnosis and Monitoring

    Get PDF
    Biosensor technologies have received a great amount of interest in recent decades, and this has especially been the case in recent years due to the health alert caused by the COVID-19 pandemic. The sensor platform market has grown in recent decades, and the COVID-19 outbreak has led to an increase in the demand for home diagnostics and point-of-care systems. With the evolution of biosensor technology towards portable platforms with a lower cost on-site analysis and a rapid selective and sensitive response, a larger market has opened up for this technology. The evolution of biosensor systems has the opportunity to change classic analysis towards real-time and in situ detection systems, with platforms such as point-of-care and wearables as well as implantable sensors to decentralize chemical and biological analysis, thus reducing industrial and medical costs. This book is dedicated to all the research related to biosensor technologies. Reviews, perspective articles, and research articles in different biosensing areas such as wearable sensors, point-of-care platforms, and pathogen detection for biomedical applications as well as environmental monitoring will introduce the reader to these relevant topics. This book is aimed at scientists and professionals working in the field of biosensors and also provides essential knowledge for students who want to enter the field

    State-of-the-art Assessment For Simulated Forces

    Get PDF
    Summary of the review of the state of the art in simulated forces conducted to support the research objectives of Research and Development for Intelligent Simulated Forces

    Towards Closing the Programmability-Efficiency Gap using Software-Defined Hardware

    Full text link
    The past decade has seen the breakdown of two important trends in the computing industry: Moore’s law, an observation that the number of transistors in a chip roughly doubles every eighteen months, and Dennard scaling, that enabled the use of these transistors within a constant power budget. This has caused a surge in domain-specific accelerators, i.e. specialized hardware that deliver significantly better energy efficiency than general-purpose processors, such as CPUs. While the performance and efficiency of such accelerators are highly desirable, the fast pace of algorithmic innovation and non-recurring engineering costs have deterred their widespread use, since they are only programmable across a narrow set of applications. This has engendered a programmability-efficiency gap across contemporary platforms. A practical solution that can close this gap is thus lucrative and is likely to engender broad impact in both academic research and the industry. This dissertation proposes such a solution with a reconfigurable Software-Defined Hardware (SDH) system that morphs parts of the hardware on-the-fly to tailor to the requirements of each application phase. This system is designed to deliver near-accelerator-level efficiency across a broad set of applications, while retaining CPU-like programmability. The dissertation first presents a fixed-function solution to accelerate sparse matrix multiplication, which forms the basis of many applications in graph analytics and scientific computing. The solution consists of a tiled hardware architecture, co-designed with the outer product algorithm for Sparse Matrix-Matrix multiplication (SpMM), that uses on-chip memory reconfiguration to accelerate each phase of the algorithm. A proof-of-concept is then presented in the form of a prototyped 40 nm Complimentary Metal-Oxide Semiconductor (CMOS) chip that demonstrates energy efficiency and performance per die area improvements of 12.6x and 17.1x over a high-end CPU, and serves as a stepping stone towards a full SDH system. The next piece of the dissertation enhances the proposed hardware with reconfigurability of the dataflow and resource sharing modes, in order to extend acceleration support to a set of common parallelizable workloads. This reconfigurability lends the system the ability to cater to discrete data access and compute patterns, such as workloads with extensive data sharing and reuse, workloads with limited reuse and streaming access patterns, among others. Moreover, this system incorporates commercial cores and a prototyped software stack for CPU-level programmability. The proposed system is evaluated on a diverse set of compute-bound and memory-bound kernels that compose applications in the domains of graph analytics, machine learning, image and language processing. The evaluation shows average performance and energy-efficiency gains of 5.0x and 18.4x over the CPU. The final part of the dissertation proposes a runtime control framework that uses low-cost monitoring of hardware performance counters to predict the next best configuration and reconfigure the hardware, upon detecting a change in phase or nature of data within the application. In comparison to prior work, this contribution targets multicore CGRAs, uses low-overhead decision tree based predictive models, and incorporates reconfiguration cost-awareness into its policies. Compared to the best-average static (non-reconfiguring) configuration, the dynamically reconfigurable system achieves a 1.6x improvement in performance-per-Watt in the Energy-Efficient mode of operation, or the same performance with 23% lower energy in the Power-Performance mode, for SpMM across a suite of real-world inputs. The proposed reconfiguration mechanism itself outperforms the state-of-the-art approach for dynamic runtime control by up to 2.9x in terms of energy-efficiency.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/169859/1/subh_1.pd
    • …
    corecore