26 research outputs found

    Design of a Low-Power Automatic Wireless Multi-Logger Networking Device

    Get PDF
    Virtually every industry and discipline (e.g., mining, pharmaceutical, construction, agriculture, reclamation, etc.) is finding applications for wireless data acquisition for monitoring and managing processes and resources. Two sectors, namely agriculture and environmental research, are seeking ways to obtain distributed soil and plant measurements over larger areas like a watershed or large fields rather than a single site of intensive instrumentation (i.e., a weather station). Wireless sensor networks and remote sensing have been explored as a means to satisfy this need. Commercial products are readily available that have remote wireless options to support distributed senor networking. However, these systems have been designed with a field engineer or technician as the target end-user. Equipment and operating costs, device specific programming languages, and complex wireless configuration schemes have impeded the adoption of large-scale, multi-node wireless systems in these fields. This report details the development of an easy-to-use, ultra-low power wireless datalogger incorporating a scalable, intelligent data collection and transmission topology. The final product can interface to various sensor types including SDI-12 and uses an LCD display to help simplify device setup

    Solution of partial differential equations on vector and parallel computers

    Get PDF
    The present status of numerical methods for partial differential equations on vector and parallel computers was reviewed. The relevant aspects of these computers are discussed and a brief review of their development is included, with particular attention paid to those characteristics that influence algorithm selection. Both direct and iterative methods are given for elliptic equations as well as explicit and implicit methods for initial boundary value problems. The intent is to point out attractive methods as well as areas where this class of computer architecture cannot be fully utilized because of either hardware restrictions or the lack of adequate algorithms. Application areas utilizing these computers are briefly discussed

    ACCELERATION OF SPIKING NEURAL NETWORKS ON SINGLE-GPU AND MULTI-GPU SYSTEMS

    Get PDF
    There has been a strong interest in modeling a mammalian brain in order to study the architectural and functional principles of the brain and offer tools to neuroscientists and medical researchers for related studies. Artificial Neural Networks (ANNs) are compute models that try to simulate the structure and/or the functional behavior of neurons and process information using the connectionist approach to computation. Hence, the ANNs are the viable options for such studies. Of many classes of ANNs, Spiking Neuron Network models (SNNs) have been employed to simulate mammalian brain, capturing its functionality and inference capabilities. In this class of neuron models, some of the biologically accurate models are the Hodgkin Huxley (HH) model, Morris Lecar (ML) model, Wilson model, and the Izhikevich model. The HH model is the oldest, most biologically accurate and the most compute intensive of the listed models. The Izhikevich model, a more recent development, is sufficiently accurate and involves the least computations. Accurate modeling of the neurons calls for compute intensive models and hence single core processors are not suitable for large scale SNN simulations due to their serial computation and low memory bandwidth. Graphical Processing Units have been used for general purpose computing as they offer raw computing power, with a majority of logic solely dedicated for computing purpose. The work presented in this thesis implements two-level character recognition networks using the four previously mentioned SNN models in Nvidia\u27s Tesla C870 card and investigates performance improvements over the equivalent software implementation on a 2.66 GHz Intel Core 2 Quad. The work probes some of the important parameters such as the kernel time, memory transfer time and flops offered by the GPU device for the implementations. In this work, we report speed-ups as high as 576x on a single GPU device for the most compute-intensive, highly biologically realistic Hodgkin Huxley model. These results demonstrate the potential of GPUs for large-scale, accurate modeling of the mammalian brain. The research in this thesis also presents several optimization techniques and strategies, and discusses the major bottlenecks that must be avoided in order to achieve maximum performance benefits for applications involving complex computations. The research also investigates an initial multi-GPU implementation to study the problem partitioning for simulating biological-scale neuron networks on a cluster of GPU devices

    The impact of design techniques in the reduction of power consumption of SoCs Multimedia

    Get PDF
    Orientador: Guido Costa Souza de AraújoDissertação (mestrado) - Universidade Estadual de Campinas, Instituto de ComputaçãoResumo: A indústria de semicondutores sempre enfrentou fortes demandas em resolver problema de dissipação de calor e reduzir o consumo de energia em dispositivos. Esta tendência tem sido intensificada nos últimos anos com o movimento de sustentabilidade ambiental. A concepção correta de um sistema eletrônico de baixo consumo de energia é um problema de vários níveis de complexidade e exige estratégias sistemáticas na sua construção. Fora disso, a adoção de qualquer técnica de redução de energia sempre está vinculada com objetivos especiais e provoca alguns impactos no projeto. Apesar dos projetistas conheçam bem os impactos de forma qualitativa, as detalhes quantitativas ainda são incógnitas ou apenas mantidas dentro do 'know-how' das empresas. Neste trabalho, de acordo com resultados experimentais baseado num plataforma de SoC1 industrial, tentamos quantificar os impactos derivados do uso de técnicas de redução de consumo de energia. Nos concentramos em relacionar o fator de redução de energia de cada técnica aos impactos em termo de área, desempenho, esforço de implementação e verificação. Na ausência desse tipo de dados, que relacionam o esforço de engenharia com as metas de consumo de energia, incertezas e atrasos serão frequentes no cronograma de projeto. Esperamos que este tipo de orientações possam ajudar/guiar os arquitetos de projeto em selecionar as técnicas adequadas para reduzir o consumo de energia dentro do alcance de orçamento e cronograma de projetoAbstract: The semiconductor industry has always faced strong demands to solve the problem of heat dissipation and reduce the power consumption in electronic devices. This trend has been increased in recent years with the action of environmental sustainability. The correct conception of an electronic system for low power consumption is an issue with multiple levels of complexities and requires systematic approaches in its construction. However, the adoption of any technique for reducing the power consumption is always linked with some specific goals and causes some impacts on the project. Although the designers know well that these impacts can affect the design in a quality aspect, the quantitative details are still unkown or just be kept inside the company's know-how. In this work, according to the experimental results based on an industrial SoC2 platform, we try to quantify the impacts of the use of low power techniques. We will relate the power reduction factor of each technique to the impact in terms of area, performance, implementation and verification effort. In the absence of such data, which relates the engineering effort to the goals of power consumption, uncertainties and delays are frequent. We hope that such guidelines can help/guide the project architects in selecting the appropriate techniques to reduce the power consumption within the limit of budget and project scheduleMestradoCiência da ComputaçãoMestre em Ciência da Computaçã

    An Extensible, scalable microprocessor architecture

    Get PDF
    An extensible, scalable stack-based microprocessor architecture is developed and discussed. Several unique features of the architecture, including its non-memory oriented interface, and its use of a stack for holding and executing code, are detailed. A programmed model is used to verify the architecture, and a hardware implementation of a small-scale version of the architecture is constructed and tested. Notes for future implementations are provides. Possible applications based on the latest technological trends are discussed, and topics for further research into the architecture are listed

    VLSI design concepts for iterative algorithms

    Get PDF
    Circuit design becomes more and more complicated, especially when the Very Large Scale Integration (VLSI) manufacturing technology node keeps shrinking down to nanoscale level. New challenges come up such as an increasing gap between the design productivity and the Moore’s Law. Leakage power becomes a major factor of the power consumption and traditional shared bus transmission is the critical bottleneck in the billion transistors Multi-Processor System–on–Chip (MPSoC) designs. These issues lead us to discuss the impact on the design of iterative algorithms. This thesis presents several strategies that satisfy various design con- straints, which can be used to explore superior solutions for the circuit design of iterative algorithms. Four selected examples of iterative al- gorithms are elaborated in this respect: hardware implementation of COordinate Rotation DIgital Computer (CORDIC) processor for sig- nal processing, configurable DCT and integer transformations based CORDIC algorithm for image/video compression, parallel Jacobi Eigen- value Decomposition (EVD) method with arbitrary iterations for com- munication, and acceleration of parallel Sparse Matrix–Vector Multipli- cation (SMVM) operations based Network–on–Chip (NoC) for solving systems of linear equations. These four applications of iterative meth- ods have been chosen since they cover a wide area of current signal processing tasks. Each method has its own unique design criteria when it comes to the direct implementation on the circuit level. Therefore, a balanced solution between various design tradeoffs is elaborated for each method. These tradeoffs are between throughput and power consumption, com- putational complexity and transformation accuracy, the number of in- ner/outer iterations and energy consumption, data structure and net- work topology. It is shown that all of these algorithms can be imple- mented on FPGA devices or as ASICs efficiently

    CMOS Sensors for Time-Resolved Active Imaging

    Full text link
    In the past decades, time-resolved imaging such as fluorescence lifetime or time-of-flight depth imaging has been extensively explored in biomedical and industrial fields because of its non-invasive characterization of material properties and remote sensing capability. Many studies have shown its potential and effectiveness in applications such as cancer detection and tissue diagnoses from fluorescence lifetime imaging, and gesture/motion sensing and geometry sensing from time-of-flight imaging. Nonetheless, time-resolved imaging has not been widely adopted due to the high cost of the system and performance limits. The research presented in this thesis focuses on the implementation of low-cost real-time time-resolved imaging systems. Two image sensing schemes are proposed and implemented to address the major limitations. First, we propose a single-shot fluorescence lifetime image sensors for high speed and high accuracy imaging. To achieve high accuracy, previous approaches repeat the measurement for multiple sampling, resulting in long measurement time. On the other hand, the proposed method achieves both high speed and accuracy at the same time by employing a pixel-level processor that takes and compresses the multiple samples within a single measurement time. The pixels in the sensor take multiple samples from the fluorescent optical signal in sub-nanosecond resolution and compute the average photon arrival time of the optical signal. Thanks to the multiple sampling of the signal, the measurement is insensitive to the shape or the pulse-width of excitation, providing better accuracy and pixel uniformity than conventional rapid lifetime determination (RLD) methods. The proposed single-shot image sensor also improves the imaging speed by orders of magnitude compared to other conventional center-of-mass methods (CMM). Second, we propose a 3-D camera with a background light suppression scheme which is adaptable to various lighting conditions. Previous 3-D cameras are not operable in outdoor conditions because they suffer from measurement errors and saturation problems under high background light illumination. We propose a reconfigurable architecture with column-parallel discrete-time background light cancellation circuit. Implementing the processor at the column level allows an order of magnitude reduction in pixel size as compared to existing pixel-level processors. The column-level approach also provides reconfigurable operation modes for optimal performance in all lighting conditions. For example, the sensor can operate at the best frame-rate and resolution without the presence of background light. If the background light saturates the sensor or increases the shot noise, the sensor can adjust the resolution and frame-rate by pixel binning and superresolution techniques. This effectively enhances the well capacity of the pixel to compensate for the increase shot noise, and speeds up the frame processing to handle the excessive background light. A fabricated prototype sensor can suppress the background light more than 100-klx while achieving a very small pixel size of 5.9μm.PHDElectrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/136950/1/eecho_1.pd

    Solving graph coloring and SAT problems using field programmable gate arrays.

    Get PDF
    Chu-Keung Chung.Thesis (M.Phil.)--Chinese University of Hong Kong, 1999.Includes bibliographical references (leaves 88-92).Abstracts in English and Chinese.Abstract --- p.iAcknowledgments --- p.iiiChapter 1 --- Introduction --- p.1Chapter 1.1 --- Motivation and Aims --- p.1Chapter 1.2 --- Contributions --- p.3Chapter 1.3 --- Structure of the Thesis --- p.4Chapter 2 --- Literature Review --- p.6Chapter 2.1 --- Introduction --- p.6Chapter 2.2 --- Complete Algorithms --- p.7Chapter 2.2.1 --- Parallel Checking --- p.7Chapter 2.2.2 --- Mom's --- p.8Chapter 2.2.3 --- Davis-Putnam --- p.9Chapter 2.2.4 --- Nonchronological Backtracking --- p.9Chapter 2.2.5 --- Iterative Logic Array (ILA) --- p.10Chapter 2.3 --- Incomplete Algorithms --- p.11Chapter 2.3.1 --- GENET --- p.11Chapter 2.3.2 --- GSAT --- p.12Chapter 2.4 --- Summary --- p.13Chapter 3 --- Algorithms --- p.14Chapter 3.1 --- Introduction --- p.14Chapter 3.2 --- Tree Search Techniques --- p.14Chapter 3.2.1 --- Depth First Search --- p.15Chapter 3.2.2 --- Forward Checking --- p.16Chapter 3.2.3 --- Davis-Putnam --- p.17Chapter 3.2.4 --- GRASP --- p.19Chapter 3.3 --- Incomplete Algorithms --- p.20Chapter 3.3.1 --- GENET --- p.20Chapter 3.3.2 --- GSAT Algorithm --- p.22Chapter 3.4 --- Summary --- p.23Chapter 4 --- Field Programmable Gate Arrays --- p.24Chapter 4.1 --- Introduction --- p.24Chapter 4.2 --- FPGA --- p.24Chapter 4.2.1 --- Xilinx 4000 series FPGAs --- p.26Chapter 4.2.2 --- Bitstream --- p.31Chapter 4.3 --- Giga Operations Reconfigurable Computing Platform --- p.32Chapter 4.4 --- Annapolis Wildforce PCI board --- p.33Chapter 4.5 --- Summary --- p.35Chapter 5 --- Implementation --- p.36Chapter 5.1 --- Parallel Graph Coloring Machine --- p.36Chapter 5.1.1 --- System Architecture --- p.38Chapter 5.1.2 --- Evaluator --- p.39Chapter 5.1.3 --- Finite State Machine (FSM) --- p.42Chapter 5.1.4 --- Memory --- p.43Chapter 5.1.5 --- Hardware Resources --- p.43Chapter 5.2 --- Serial Graph Coloring Machine --- p.44Chapter 5.2.1 --- System Architecture --- p.44Chapter 5.2.2 --- Input Memory --- p.46Chapter 5.2.3 --- Solution Store --- p.46Chapter 5.2.4 --- Constraint Memory --- p.47Chapter 5.2.5 --- Evaluator --- p.48Chapter 5.2.6 --- Input Mapper --- p.49Chapter 5.2.7 --- Output Memory --- p.49Chapter 5.2.8 --- Backtrack Checker --- p.50Chapter 5.2.9 --- Word Generator --- p.51Chapter 5.2.10 --- State Machine --- p.51Chapter 5.2.11 --- Hardware Resources --- p.54Chapter 5.3 --- Serial Boolean Satisfiability Solver --- p.56Chapter 5.3.1 --- System Architecture --- p.58Chapter 5.3.2 --- Solutions --- p.59Chapter 5.3.3 --- Solution Generator --- p.59Chapter 5.3.4 --- Evaluator --- p.60Chapter 5.3.5 --- AND/OR --- p.62Chapter 5.3.6 --- State Machine --- p.62Chapter 5.3.7 --- Hardware Resources --- p.64Chapter 5.4 --- GSAT Solver --- p.65Chapter 5.4.1 --- System Architecture --- p.65Chapter 5.4.2 --- Variable Memory --- p.65Chapter 5.4.3 --- Flip-Bit Vector --- p.66Chapter 5.4.4 --- Clause Evaluator --- p.67Chapter 5.4.5 --- Adder --- p.70Chapter 5.4.6 --- Random Bit Generator --- p.71Chapter 5.4.7 --- Comparator --- p.71Chapter 5.4.8 --- Sum Register --- p.71Chapter 5.5 --- Summary --- p.71Chapter 6 --- Results --- p.73Chapter 6.1 --- Introduction --- p.73Chapter 6.2 --- Parallel Graph Coloring Machine --- p.73Chapter 6.3 --- Serial Graph Coloring Machine --- p.74Chapter 6.4 --- Serial SAT Solver --- p.74Chapter 6.5 --- GSAT Solver --- p.75Chapter 6.6 --- Summary --- p.76Chapter 7 --- Conclusion --- p.77Chapter 7.1 --- Future Work --- p.78Chapter A --- Software Implementation of Graph Coloring in CHIP --- p.79Chapter B --- Density Improvements Using Xilinx RAM --- p.81Chapter C --- Bit stream Configuration --- p.83Bibliography --- p.88Publications --- p.9
    corecore