141 research outputs found

    Exploiting Hardware Abstraction for Parallel Programming Framework: Platform and Multitasking

    With the help of the parallelism provided by the fine-grained architecture, hardware accelerators on Field Programmable Gate Arrays (FPGAs) can significantly improve the performance of many applications. However, designers are required to have excellent hardware programming skills and unique optimization techniques to explore the potential of FPGA resources fully. Intermediate frameworks above hardware circuits are proposed to improve either performance or productivity by leveraging parallel programming models beyond the multi-core era. In this work, we propose the PolyPC (Polymorphic Parallel Computing) framework, which targets enhancing productivity without losing performance. It helps designers develop parallelized applications and implement them on FPGAs. The PolyPC framework implements a custom hardware platform, on which programs written in an OpenCL-like programming model can launch. Additionally, the PolyPC framework extends vendor-provided tools to provide a complete development environment, including an intermediate software framework and automatic system builders. Designers' programs can be either synthesized as hardware processing elements (PEs) or compiled to executable files running on software PEs. Benefiting from the nontrivial features of re-loadable PEs and independent group-level schedulers, multitasking is enabled for both software and hardware PEs to improve the efficiency of utilizing hardware resources. The PolyPC framework is evaluated regarding performance, area efficiency, and multitasking. The results show a maximum 66 times speedup over a dual-core ARM processor and 1043 times speedup over a high-performance MicroBlaze, with 125 times the area efficiency. It delivers a significant improvement in response time to high-priority tasks with priority-aware scheduling. Overheads of multitasking are evaluated to analyze trade-offs. With the help of the design flow, the OpenCL application programs are converted into executables through the front-end source-to-source transformation and back-end synthesis/compilation to run on PEs, and the framework is generated from users' specifications
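    The response-time benefit of priority-aware scheduling mentioned in this abstract can be illustrated with a minimal sketch. It is not the PolyPC scheduler: it assumes a single non-preemptive PE, and the function name `dispatch`, the task tuple layout, and the rule that a lower number means higher priority are all hypothetical choices made for illustration.

```python
import heapq


def dispatch(tasks, priority_aware):
    """Run tasks on one non-preemptive PE and return {name: response_time}.

    tasks: list of (name, arrival, duration, priority) tuples.
    Assumption for this sketch: a lower priority number is more urgent.
    """
    pending = sorted(tasks, key=lambda t: t[1])  # order by arrival time
    ready = []       # heap of tasks that have arrived but not yet run
    response = {}
    now = 0
    i = 0            # index of the next task that has not arrived yet
    seq = 0          # arrival sequence number, used as FIFO tie-breaker
    while i < len(pending) or ready:
        # Move every task that has arrived by `now` into the ready heap.
        while i < len(pending) and pending[i][1] <= now:
            name, arrival, duration, prio = pending[i]
            key = (prio, seq) if priority_aware else (seq,)
            heapq.heappush(ready, (key, name, arrival, duration))
            seq += 1
            i += 1
        if not ready:
            # PE is idle: jump to the next arrival.
            now = pending[i][1]
            continue
        _, name, arrival, duration = heapq.heappop(ready)
        now += duration                  # run the task to completion
        response[name] = now - arrival   # completion time minus arrival
    return response


# Hypothetical workload: three long background tasks and one urgent task.
workload = [
    ("bg1", 0, 10, 5),
    ("bg2", 1, 10, 5),
    ("bg3", 2, 10, 5),
    ("urgent", 3, 2, 0),
]
fifo = dispatch(workload, priority_aware=False)
prio = dispatch(workload, priority_aware=True)
```

    In this toy workload the urgent task still waits for the task already running, but with priority-aware dispatch it no longer queues behind the other background tasks.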

    Dynamic Systolization for Developing Multiprocessor Supercomputers

    A dynamic network approach is introduced for developing reconfigurable systolic arrays or wavefront processors. This allows one to design very powerful and flexible processors to be used in a general-purpose, reconfigurable, and fault-tolerant multiprocessor computer system. The concepts of macro-dataflow and multitasking can be integrated to handle variable-resolution granularities in computationally intensive algorithms. A multiprocessor architecture, Remps, is proposed based on these design methodologies. The Remps architecture is generalized from the Cedar, HEP, Cray X-MP, Trac, NYU Ultracomputer, S-1, Pumps, Chip, and SAM projects. Our goal is to provide a multiprocessor research model for developing design methodologies, multiprocessing and multitasking supports, dynamic systolic/wavefront array processors, interconnection networks, reconfiguration techniques, and performance analysis tools. These system design and operational techniques should be useful to those who are developing or evaluating multiprocessor supercomputers

    Dynamic Scheduling, Allocation, and Compaction Scheme for Real-Time Tasks on FPGAs

    Run-time reconfiguration (RTR) is a method of computing on reconfigurable logic, typically FPGAs, changing hardware configurations from phase to phase of a computation at run-time. Recent research has expanded from a focus on a single application at a time to encompass a view of the reconfigurable logic as a resource shared among multiple applications or users. In real-time system design, task deadlines play an important role. Real-time multitasking systems not only need to support sharing of the resources in space, but also need to guarantee execution of the tasks. At the operating system level, the sharing of logic gates, wires, and I/O pins among multiple tasks needs to be managed. From the high-level standpoint, access to the resources needs to be scheduled according to task deadlines. This thesis describes a task allocator for scheduling, placing, and compacting tasks on a shared FPGA under real-time constraints. Our consideration of task deadlines is novel in the setting of handling multiple simultaneous tasks in RTR. Software simulations have been conducted to evaluate the performance of the proposed scheme. The results indicate a significant improvement, measured as a decrease in the number of tasks rejected
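    The idea of scheduling, placing, and compacting deadline-constrained tasks on a shared FPGA can be sketched with a toy simulation. This is not the allocator from the thesis: it assumes a 1-D device made of equal columns, first-fit placement, and compaction that relocates running tasks at zero cost. The names `Task`, `Fpga1D`, and `simulate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    width: int     # columns of reconfigurable logic required
    arrival: int   # time at which the request arrives
    duration: int  # execution time once placed
    deadline: int  # absolute deadline


class Fpga1D:
    """Toy 1-D model: the device is a row of equal-width columns."""

    def __init__(self, columns, compaction=True):
        self.columns = columns
        self.compaction = compaction
        self.running = []  # (start_col, width, finish_time) per placed task

    def release_finished(self, now):
        self.running = [r for r in self.running if r[2] > now]

    def first_fit(self, width):
        """Leftmost start column with `width` free columns, or None."""
        cursor = 0
        for start, w, _ in sorted(self.running):
            if start - cursor >= width:
                return cursor
            cursor = start + w
        return cursor if self.columns - cursor >= width else None

    def compact(self):
        """Slide running tasks left so free fragments merge (assumed free)."""
        cursor, packed = 0, []
        for _, w, finish in sorted(self.running):
            packed.append((cursor, w, finish))
            cursor += w
        self.running = packed

    def try_admit(self, task):
        now = task.arrival
        self.release_finished(now)
        if now + task.duration > task.deadline:
            return False  # deadline cannot be met even with an immediate start
        col = self.first_fit(task.width)
        if col is None and self.compaction:
            self.compact()
            col = self.first_fit(task.width)
        if col is None:
            return False  # no contiguous region is large enough
        self.running.append((col, task.width, now + task.duration))
        return True


def simulate(tasks, columns, compaction):
    """Return the names of rejected tasks, processing requests by arrival."""
    fpga = Fpga1D(columns, compaction)
    ordered = sorted(tasks, key=lambda t: t.arrival)
    return [t.name for t in ordered if not fpga.try_admit(t)]


# Hypothetical request stream on a 10-column device.
requests = [
    Task("A", 4, 0, 10, 20),
    Task("B", 2, 0, 2, 5),
    Task("C", 2, 0, 10, 20),
    Task("D", 4, 3, 5, 10),  # fits only after fragments are merged
    Task("E", 3, 4, 3, 6),   # infeasible: 4 + 3 exceeds its deadline
]
rejected_with = simulate(requests, 10, compaction=True)
rejected_without = simulate(requests, 10, compaction=False)
```

    In this example task D finds only two separate 2-column gaps until compaction merges them, so enabling compaction lowers the rejection count, which is the metric the abstract reports.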

    Parallel and Distributed Computing

    The 14 chapters presented in this book cover a wide variety of representative works ranging from hardware design to application development. Particularly, the topics that are addressed are programmable and reconfigurable devices and systems, dependability of GPUs (General Purpose Units), network topologies, cache coherence protocols, resource allocation, scheduling algorithms, peer-to-peer networks, large-scale network simulation, and parallel routines and algorithms. In this way, the articles included in this book constitute an excellent reference for engineers and researchers who have particular interests in each of these topics in parallel and distributed computing

    Reconfiguration of field programmable logic in embedded systems

    A Field Programmable Gate Array Architecture for Two-Dimensional Partial Reconfiguration

    Reconfigurable machines can accelerate many applications by adapting to their needs through hardware reconfiguration. Partial reconfiguration allows the reconfiguration of a portion of a chip while the rest of the chip is busy working on tasks. Operating system models have been proposed for partially reconfigurable machines to handle the scheduling and placement of tasks. They are called OS4RC in this dissertation. The main goal of this research is to address some problems that come from the gap between OS4RC and existing chip architectures and the gap between OS4RC models and practical applications. Some existing OS4RC models are based on an impractical assumption that there is no data exchange channel between IP (Intellectual Property) circuits residing on a Field Programmable Gate Array (FPGA) chip and between an IP circuit and FPGA I/O pins. For models that do not have such an assumption, their inter-IP communication channels have severe drawbacks. Those channels do not work well with 2-D partial reconfiguration, they are not suitable for intensive data stream processing, and they are frequently very complicated to design and very expensive. To address these problems, a new chip architecture that can better support inter-IP and IP-I/O communication is proposed and a corresponding OS4RC kernel is then specified. The proposed FPGA architecture is based on an array of clusters of configurable logic blocks, with each cluster serving as a partial reconfiguration unit, and a mesh of segmented buses that provides inter-IP and IP-I/O communication channels. The proposed OS4RC kernel takes care of the scheduling, placement, and routing of circuits under the constraints of the proposed architecture. Features of the new architecture in turn reduce the kernel execution times and enable runtime scheduling, placement, and routing. The area cost and the configuration memory size of the new chip architecture are calculated and analyzed. The efficiency of the OS4RC kernel is evaluated via simulation using three different task models

    Multipurpose self-configuration of programmable photonic circuits

    Programmable integrated photonic circuits have been called upon to lead a new revolution in information systems by teaming up with high-speed digital electronics and, in this way, adding unique complementary features supported by their ability to provide bandwidth-unconstrained analog signal processing. Relying on a common hardware implemented by two-dimensional integrated photonic waveguide meshes, they can provide multiple functionalities by suitable programming of their control signals. Scalability, which is essential for increasing functional complexity and integration density, is currently limited by the need to precisely control and configure several hundreds of variables and simultaneously manage multiple configuration actions. Here we propose and experimentally demonstrate two different approaches towards management automation in programmable integrated photonic circuits. These enable the simultaneous handling of circuit self-characterization, auto-routing, self-configuration and optimization. By combining computational optimization and photonics, this work takes an important step towards the realization of high-density and complex integrated programmable photonics.
    D.P.L. acknowledges funding through the Spanish MINECO Juan de la Cierva program. J.C. acknowledges funding from the ERC Advanced Grant ERC-ADG-2016-741415 UMWP-Chip and ERC-2019-POC-859927. Authors also acknowledge funding from Future MWP technologies and applications PROMETEO/2017/103, Advanced Instrumentation for World Class Microwave Photonics Research IDIFEDER/2018/031, EUIMWP CA16220, Infraestructura para caracterizacion de Chips Fotonicos EQC2018-004683-P.
    Pérez-López, D.; López-Hernández, A.; Dasmahapatra, P.; Capmany Francoy, J. (2020). Multipurpose self-configuration of programmable photonic circuits. Nature Communications. 11(1):1-11. https://doi.org/10.1038/s41467-020-19608-w

    Software support for dynamic partial reconfigurable FPGAs on heterogeneous platforms

    This thesis addresses the design and implementation of software support for real-time systems developed on heterogeneous platforms that include a processor and an FPGA with dynamic partial reconfiguration capabilities. The software support enables tasks to request the execution of accelerated functions on the FPGA in parallel with other tasks running on the processor. Accelerated functions are dynamically allocated on the FPGA depending on the availability of area and the online requests issued by the processor, thus extending the concept of multitasking to the FPGA resource domain. The performance of the allocation mechanism has been evaluated in terms of speed-up and response times. The results show that the system is able to guarantee bounded delays and an acceptable overhead, both of which can be taken into account in a future schedulability analysis of real-time applications