400,628 research outputs found

    Development of Variant of Software Architecture Implementation for Low-power General Purpose Microcontrollers by Finite State Machines

    Get PDF
    As a result of the research, two directions for development of software architecture for low-power general purpose microcontrollers (LPGPM) are identified. The first, classical approach is the development using standard State patterns. The second is the development of programs, algorithms and structures based on mathematical analysis.The first direction is chosen in the work. The variant of the implementation of a typical pattern for development of software architecture (SA) in the form of a finite state machine (FSM) is proposed to discussion. This pattern allows to divide the development of the architectural part of the program for LPGPM and programming the LPGPM hardware. This approach makes it possible to divide the work of the software architect and the work of LPGPM hardware specialists. Advantage of the solution in comparison with the real time operating system (RTOS) is the saving of LPGPM hardware resources. In addition, it improves the readability of code and good testing prospects. The resulting architecture makes it possible to easily accompany the software and switch to other types of microcontroller. The disadvantage is an increase in the required amount of RAM with an increase in the number of states. It is this disadvantage that requires the application not only of experimental and engineering-intuitive methods, but also to continue research in the second direction

    A software-hardware hybrid steering mechanism for clustered microarchitectures

    Get PDF
    Clustered microarchitectures provide a promising paradigm to solve or alleviate the problems of increasing microprocessor complexity and wire delays. High- performance out-of-order processors rely on hardware-only steering mechanisms to achieve balanced workload distribution among clusters. However, the additional steering logic results in a significant increase on complexity, which actually decreases the benefits of the clustered design. In this paper, we address this complexity issue and present a novel software-hardware hybrid steering mechanism for out-of-order processors. The proposed software- hardware cooperative scheme makes use of the concept of virtual clusters. Instructions are distributed to virtual clusters at compile time using static properties of the program such as data dependences. Then, at runtime, virtual clusters are mapped into physical clusters by considering workload information. Experiments using SPEC CPU2000 benchmarks show that our hybrid approach can achieve almost the same performance as a state-of-the-art hardware-only steering scheme, while requiring low hardware complexity. In addition, the proposed mechanism outperforms state-of-the-art software-only steering mechanisms by 5% and 10% on average for 2-cluster and 4-cluster machines, respectively.Peer ReviewedPostprint (published version

    Using fast and accurate simulation to explore hardware/software trade-offs in the multi-core era

    Get PDF
    Writing well-performing parallel programs is challenging in the multi-core processor era. In addition to achieving good per-thread performance, which in itself is a balancing act between instruction-level parallelism, pipeline effects and good memory performance, multi-threaded programs complicate matters even further. These programs require synchronization, and are affected by the interactions between threads through sharing of both processor resources and the cache hierarchy. At the Intel Exascience Lab, we are developing an architectural simulator called Sniper for simulating future exascale-era multi-core processors. Its goal is twofold: Sniper should assist hardware designers to make design decisions, while simultaneously providing software designers with a tool to gain insight into the behavior of their algorithms and allow for optimization. By taking architectural features into account, our simulator can provide more insight into parallel programs than what can be obtained from existing performance analysis tools. This unique combination of hardware simulator and software performance analysis tool makes Sniper a useful tool for a simultaneous exploration of the hardware and software design space for future high-performance multi-core systems

    A FPGA system for QRS complex detection based on Integer Wavelet Transform

    Get PDF
    Due to complexity of their mathematical computation, many QRS detectors are implemented in software and cannot operate in real time. The paper presents a real-time hardware based solution for this task. To filter ECG signal and to extract QRS complex it employs the Integer Wavelet Transform. The system includes several components and is incorporated in a single FPGA chip what makes it suitable for direct embedding in medical instruments or wearable health care devices. It has sufficient accuracy (about 95%), showing remarkable noise immunity and low cost. Additionally, each system component is composed of several identical blocks/cells what makes the design highly generic. The capacity of today existing FPGAs allows even dozens of detectors to be placed in a single chip. After the theoretical introduction of wavelets and the review of their application in QRS detection, it will be shown how some basic wavelets can be optimized for easy hardware implementation. For this purpose the migration to the integer arithmetic and additional simplifications in calculations has to be done. Further, the system architecture will be presented with the demonstrations in both, software simulation and real testing. At the end, the working performances and preliminary results will be outlined and discussed. The same principle can be applied with other signals where the hardware implementation of wavelet transform can be of benefit

    A multiarchitecture parallel-processing development environment

    Get PDF
    A description is given of the hardware and software of a multiprocessor test bed - the second generation Hypercluster system. The Hypercluster architecture consists of a standard hypercube distributed-memory topology, with multiprocessor shared-memory nodes. By using standard, off-the-shelf hardware, the system can be upgraded to use rapidly improving computer technology. The Hypercluster's multiarchitecture nature makes it suitable for researching parallel algorithms in computational field simulation applications (e.g., computational fluid dynamics). The dedicated test-bed environment of the Hypercluster and its custom-built software allows experiments with various parallel-processing concepts such as message passing algorithms, debugging tools, and computational 'steering'. Such research would be difficult, if not impossible, to achieve on shared, commercial systems

    Evaluating Rapid Application Development with Python for Heterogeneous Processor-based FPGAs

    Full text link
    As modern FPGAs evolve to include more het- erogeneous processing elements, such as ARM cores, it makes sense to consider these devices as processors first and FPGA accelerators second. As such, the conventional FPGA develop- ment environment must also adapt to support more software- like programming functionality. While high-level synthesis tools can help reduce FPGA development time, there still remains a large expertise gap in order to realize highly performing implementations. At a system-level the skill set necessary to integrate multiple custom IP hardware cores, interconnects, memory interfaces, and now heterogeneous processing elements is complex. Rather than drive FPGA development from the hardware up, we consider the impact of leveraging Python to ac- celerate application development. Python offers highly optimized libraries from an incredibly large developer community, yet is limited to the performance of the hardware system. In this work we evaluate the impact of using PYNQ, a Python development environment for application development on the Xilinx Zynq devices, the performance implications, and bottlenecks associated with it. We compare our results against existing C-based and hand-coded implementations to better understand if Python can be the glue that binds together software and hardware developers.Comment: To appear in 2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM'17
    • …
    corecore