Solving the global atmospheric equations through heterogeneous reconfigurable platforms

Author: Fu H
Gan L
Huang X
Luk W
Xue W
Yang C
Yang G
Zhang Y
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/03/2014

Spiral - Imperial College Digital Repository

Proceedings of the 3rd Workshop on Domain-Specific Language Design and Implementation (DSLDI 2015)

Author: Erdweg Sebastian
van der Storm Tijs
Publication date: 01/08/2015

The goal of the DSLDI workshop is to bring together researchers and practitioners interested in sharing ideas on how DSLs should be designed, implemented, supported by tools, and applied in realistic application contexts. We are interested both in discovering how already known domains such as graph processing or machine learning can be best supported by DSLs, and in exploring new domains that could be targeted by DSLs. More generally, we are interested in building a community that can drive forward the development of modern DSLs. These informal post-proceedings contain the submitted talk abstracts to the 3rd DSLDI workshop (DSLDI'15), and a summary of the panel discussion on Language Composition.

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Assessing the Suitability of King Topologies for Interconnection Networks

Author: Beivide Palacio Ramón
Bosque Orero José Luis
Camarero Coterillo Cristobal
Castillo Villar Emilio
Martínez Fernández María del Carmen
Stafford Fernández Esteban
Vallejo Alonso Fernando
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/03/2016

In recent years, many different interconnection networks have been used, following two main tendencies. One is characterized by the use of high-degree routers with long wires, while the other uses routers of much smaller degree. The latter relies on two-dimensional mesh and torus topologies with shorter local links. This paper focuses on doubling the degree of common 2D meshes and tori while still preserving an attractive layout for VLSI design. By adding a set of diagonal links in one direction, diagonal networks are obtained. By adding a second set of links, networks of degree eight are built, named king networks. This research presents a comprehensive study of these networks, which includes a topological analysis, the proposal of appropriate routing procedures and an empirical evaluation. King networks exhibit a number of attractive characteristics which translate to reduced execution times of parallel applications. For example, the execution times of the NPB suite are reduced by up to 30 percent. In addition, this work reveals other properties of king networks, such as perfect partitioning, that deserve further attention for their convenient exploitation in forthcoming high-performance parallel systems.
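
The degree-eight construction described above can be sketched in a few lines. This is an illustrative sketch only, assuming a square n x n torus with wraparound in both dimensions; the function names are hypothetical and the paper's own definitions and routing procedures may differ. It lists the eight neighbours of a node and compares hop distances with and without the diagonal links.

```python
from itertools import product


def king_neighbors(x, y, n):
    """Neighbours of node (x, y) in an n x n king torus (assumed layout).

    The four axis-aligned offsets are the ordinary torus links; the four
    diagonal offsets are the two added sets of diagonal links.
    """
    offsets = [(dx, dy) for dx, dy in product((-1, 0, 1), repeat=2)
               if (dx, dy) != (0, 0)]
    return [((x + dx) % n, (y + dy) % n) for dx, dy in offsets]


def ring_distance(a, b, n):
    """Shortest hop count between positions a and b on a ring of n nodes."""
    d = abs(a - b) % n
    return min(d, n - d)


def torus_distance(src, dst, n):
    """Hop distance in a plain 2D torus: the two ring distances add up."""
    return ring_distance(src[0], dst[0], n) + ring_distance(src[1], dst[1], n)


def king_distance(src, dst, n):
    """Hop distance in a king torus: a diagonal hop advances both
    dimensions at once, so the larger ring distance dominates."""
    return max(ring_distance(src[0], dst[0], n),
               ring_distance(src[1], dst[1], n))


if __name__ == "__main__":
    n = 8
    print(sorted(king_neighbors(0, 0, n)))
    print(torus_distance((0, 0), (4, 4), n), king_distance((0, 0), (4, 4), n))
```

Shorter hop counts are only one of the characteristics the abstract mentions; the sketch says nothing about layout, routing or measured execution times.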

UCrea

Center for Aeronautics and Space Information Sciences

Author: Flynn Michael J.

This report summarizes the research done during 1991/92 under the Center for Aeronautics and Space Information Science (CASIS) program. The topics covered are computer architecture, networking, and neural nets

NASA Technical Reports Server

Proceedings of the 3rd Workshop on Domain-Specific Language Design and Implementation (DSLDI'15)

Publication date: 01/08/2015

CWI's Institutional Repository

Proceedings of the 5th International Workshop on Reconfigurable Communication-centric Systems on Chip 2010 - ReCoSoC'10 - May 17-19, 2010 Karlsruhe, Germany. (KIT Scientific Reports ; 7551)

Author: Becker Jürgen
Hübner Michael
Lagadec Loïc
Sander Oliver
Publication venue: KIT Scientific Publishing, Karlsruhe
Publication date: 01/01/2010

ReCoSoC is intended to be a periodic annual meeting to expose and discuss gathered expertise as well as state of the art research around SoC related topics through plenary invited papers and posters. The workshop aims to provide a prospective view of tomorrow's challenges in the multibillion transistor era, taking into account the emerging techniques and architectures exploring the synergy between flexible on-chip communication and system reconfigurability

KITopen

Predicting power scalability in a reconfigurable platform

Author: Beckett P
Publication venue: RMIT University
Publication date: 01/01/2007

This thesis focuses on the evolution of digital hardware systems. A reconfigurable platform is proposed and analysed based on thin-body, fully-depleted silicon-on-insulator Schottky-barrier transistors with metal gates and silicide source/drain (TBFDSBSOI). These offer the potential for simplified processing that will allow them to reach ultimate nanoscale gate dimensions. Technology CAD was used to show that the threshold voltage in TBFDSBSOI devices will be controllable by gate potentials that scale down with the channel dimensions while remaining within appropriate gate reliability limits. SPICE simulations determined that the magnitude of the threshold shift predicted by TCAD software would be sufficient to control the logic configuration of a simple, regular array of these TBFDSBSOI transistors as well as to constrain its overall subthreshold power growth. Using these devices, a reconfigurable platform is proposed based on a regular 6-input, 6-output NOR LUT block in which the logic and configuration functions of the array are mapped onto separate gates of the double-gate device. A new analytic model of the relationship between power (P), area (A) and performance (T) has been developed based on a simple VLSI complexity metric of the form AT^σ = constant. As σ defines the performance “return” gained as a result of an increase in area, it also represents a bound on the architectural options available in power-scalable digital systems. This analytic model was used to determine that simple computing functions mapped to the reconfigurable platform will exhibit continuous power-area-performance scaling behavior. A number of simple arithmetic circuits were mapped to the array and their delay and subthreshold leakage analysed over a representative range of supply and threshold voltages, thus determining a worst-case range for the device/circuit-level parameters of the model. Finally, an architectural simulation was built in VHDL-AMS. The frequency scaling described by σ, combined with the device/circuit-level parameters, predicts the overall power and performance scaling of parallel architectures mapped to the array.
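
As a short worked reading of the complexity metric quoted in the abstract: the constant c and the area scaling factor k below are symbols introduced here for illustration, T is read as delay, and the power term of the thesis's model is not reproduced because the abstract does not state it.

```latex
A\,T^{\sigma} = c
\;\Longrightarrow\;
T = \left(\frac{c}{A}\right)^{1/\sigma},
\qquad
A \to kA
\;\Longrightarrow\;
T \to k^{-1/\sigma}\,T .
```

Under this reading, doubling the area (k = 2) halves T when σ = 1 but reduces it only by a factor of about 1.41 when σ = 2, which is one way to interpret σ as the performance "return" on added area.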

RMIT Research Repository

Parallel Architectures and Parallel Algorithms for Integrated Vision Systems

Author: Choudhary Alok Nidhi

Computer vision is regarded as one of the most complex and computationally intensive problems. An integrated vision system (IVS) is a system that uses vision algorithms from all levels of processing to perform a high-level application (e.g., object recognition). An IVS normally involves algorithms from low-level, intermediate-level, and high-level vision. Designing parallel architectures for vision systems is of tremendous interest to researchers. Several issues are addressed in parallel architectures and parallel algorithms for integrated vision systems.

NASA Technical Reports Server

Symmetry-based decomposition for optimised parallelisation in 3D printing processes

Author: Hatton Hayley
Khalid Muhammad
Manzoor Umar
Murray John
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 09/06/2023

Current research in 3D printing focuses on improving printing performance through various techniques, including decomposition, but targets only single printers. With improved hardware costs increasing printer availability, more situations can arise involving a multitude of printers, which in combination offer substantially more throughput that current decomposition approaches may not utilise well. A novel approach to 3D printing is introduced that attempts to exploit this as a means of significantly increasing the speed of printing models. This was approached as a problem akin to the parallel delegation of computation tasks in a multi-core environment, where optimal performance involves computation load being distributed as evenly as possible. To achieve this, a decomposition framework was designed that combines recursive symmetric slicing with a hybrid tree-based analytical and greedy strategy to minimise the maximum volume of subparts assigned to the set of printers. Experimental evaluation of the algorithm was performed to compare our approach to printing models normally (“in serial”) as a control. The algorithm was subjected to a range of models and a varying quantity of printers in parallel, with printer parameters held constant, and yielded mixed results. Larger, simpler, and more symmetric objects exhibited more significant and reliable improvements in fabrication duration at larger amounts of parallelisation than smaller, more complex, or more asymmetric objects.
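
The load-balancing objective mentioned above (keeping the largest volume assigned to any one printer as small as possible) can be illustrated with a standard greedy heuristic. The sketch below is not the paper's hybrid tree-based strategy and performs no slicing; it assumes the subpart volumes are already known, and the function name and inputs are hypothetical. It uses the classic "largest item first, least-loaded machine next" rule from multiprocessor scheduling.

```python
import heapq


def assign_subparts(volumes, num_printers):
    """Greedily assign subpart volumes to printers.

    Subparts are taken in decreasing order of volume and each one goes to
    the printer with the smallest total so far. Returns the per-printer
    lists of volumes and the largest per-printer total.
    """
    # Min-heap of (current load, printer index).
    heap = [(0, p) for p in range(num_printers)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_printers)]

    for volume in sorted(volumes, reverse=True):
        load, printer = heapq.heappop(heap)
        assignment[printer].append(volume)
        heapq.heappush(heap, (load + volume, printer))

    max_load = max(sum(parts) for parts in assignment)
    return assignment, max_load


if __name__ == "__main__":
    parts, worst = assign_subparts([4, 3, 3, 2], num_printers=2)
    print(parts, worst)
```

This heuristic is fast but not always optimal, which is one reason a method might pair a greedy step with an analytical one, as the abstract describes.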

Aston Publications Explorer

High-performance and hardware-aware computing: proceedings of the second International Workshop on New Frontiers in High-performance and Hardware-aware Computing (HipHaC'11), San Antonio, Texas, USA, February 2011 ; (in conjunction with HPCA-17)

Author: Buchty Rainer
Weiß Jan-Philipp
Publication venue: KIT Scientific Publishing, Karlsruhe
Publication date: 01/01/2011

High-performance system architectures are increasingly exploiting heterogeneity. The HipHaC workshop aims at combining new aspects of parallel, heterogeneous, and reconfigurable microprocessor technologies with concepts of high-performance computing and, particularly, numerical solution methods. Compute- and memory-intensive applications can only benefit from the full hardware potential if all features on all levels are taken into account in a holistic approach

KITopen