
    Internationalisation of Innovation: Why is Chip Design Moving to Asia

    This paper will appear in International Journal of Innovation Management, special issue in honor of Keith Pavitt (Peter Augsdoerfer, Jonathan Sapsed, and James Utterback, guest editors), forthcoming. Among Keith Pavitt's many contributions to the study of innovation is the proposition that physical proximity is advantageous for innovative activities that involve highly complex technological knowledge. Yet chip design, a process that creates the greatest value in the electronics industry and that requires highly complex knowledge, is experiencing a massive dispersion to leading Asian electronics-exporting countries. To explain why chip design is moving to Asia, the paper draws on interviews with 60 companies and 15 research institutions that are doing leading-edge chip design in Asia. I demonstrate that "pull" and "policy" factors explain what attracts design to particular locations. But to get to the root causes that shift the balance in favor of geographical decentralization, I examine "push" factors, i.e. changes in design methodology ("system-on-chip design") and organization ("vertical specialization" within global design networks). The resultant increase in knowledge mobility explains why chip design, which in Pavitt's framework is not supposed to move, is moving from the traditional centers to a few new specialized design clusters in Asia. A completely revised and updated version has been published as: "Complexity and Internationalisation of Innovation: Why is Chip Design Moving to Asia?," International Journal of Innovation Management, special issue in honour of Keith Pavitt, Vol. 9, No. 1: 47-73.

    Toward fast and accurate architecture exploration in a hardware/software codesign flow


    How general-purpose can a GPU be?

    The use of graphics processing units (GPUs) in general-purpose computation (GPGPU) is a growing field. GPU instruction sets, while implementing a graphics pipeline, draw from a range of single instruction, multiple data stream (SIMD) architectures characteristic of the heyday of supercomputers. Yet only one of these SIMD instruction-set styles proved applicable to a wide enough range of problems to survive the era when the full range of supercomputer design variants was being explored: vector instructions. Supercomputers covered a range of exotic designs such as hypercubes and the Connection Machine (Fox, 1989). The latter is likely the source of the snide comment by Cray (preferring two strong oxen to 1,024 chickens for plowing a field): it had thousands of relatively low-speed CPUs (Tucker & Robertson, 1988). Since Cray won, why are we not basing our ideas on his designs (Cray Inc., 2004), rather than those of the losers? The Top 500 supercomputer list is dominated by general-purpose CPUs, and nothing like the Connection Machine that headed the list in 1993 still exists.
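    As an illustrative aside (not part of the paper), the scalar-versus-vector contrast behind the SIMD/vector instruction sets discussed above can be sketched in a few lines of Python, where a single NumPy array expression stands in for a vector instruction applying the same operation to many data elements at once.

    # Illustrative sketch only (not from the paper): scalar versus vector-style
    # (SIMD) execution, emulated with NumPy.
    import numpy as np

    a = np.random.rand(100_000)
    b = np.random.rand(100_000)

    # Scalar style: one element per loop iteration, as in a conventional CPU loop.
    c_scalar = np.empty_like(a)
    for i in range(a.size):
        c_scalar[i] = a[i] * b[i] + 1.0

    # Vector/SIMD style: one expression operates on all elements at once, the
    # programming model that GPUs and vector supercomputers share.
    c_vector = a * b + 1.0

    assert np.allclose(c_scalar, c_vector)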

    Ultra-Low-Power Processors

    Society's increasing use of connected sensing and wearable computing has created robust demand for ultra-low-power (ULP) edge computing devices and associated system-on-chip (SoC) architectures. In fact, the ubiquity of ULP processing has already made such embedded devices the highest-volume processor part in production, with an even greater dominance expected in the near future. The Internet of Everything calls for an embedded processor in every object, necessitating billions or trillions of processors. At the same time, the explosion of data generated from these devices, in conjunction with the traditional model of using cloud-based services to process the data, will place tremendous demands on limited wireless spectrum and energy-hungry wireless networks. Smart, ULP edge devices are the only viable option that can meet these demands.

    CMOS + stochastic nanomagnets: heterogeneous computers for probabilistic inference and learning

    Extending Moore's law by augmenting complementary metal-oxide-semiconductor (CMOS) transistors with emerging nanotechnologies (X) has become increasingly important. Accelerating Monte Carlo algorithms that rely on random sampling with such CMOS+X technologies could have significant impact on a large number of fields, from probabilistic machine learning and optimization to quantum simulation. In this paper, we show the combination of stochastic magnetic tunnel junction (sMTJ)-based probabilistic bits (p-bits) with versatile field-programmable gate arrays (FPGAs) to design a CMOS + X (X = sMTJ) prototype. Our approach enables high-quality true randomness that is essential for Monte Carlo based probabilistic sampling and learning. Our heterogeneous computer successfully performs probabilistic inference and asynchronous Boltzmann learning, despite device-to-device variations in sMTJs. A comprehensive comparison using a CMOS predictive process design kit (PDK) reveals that compact sMTJ-based p-bits replace 10,000 transistors while dissipating two orders of magnitude less energy (2 fJ per random bit) compared to digital CMOS p-bits. Scaled and integrated versions of our CMOS + stochastic nanomagnet approach can significantly advance probabilistic computing and its applications in various domains by providing massively parallel and truly random numbers with extremely high throughput and energy efficiency.
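    For readers unfamiliar with p-bits, the following is a minimal software sketch of the standard p-bit update rule and its use for Gibbs-like sampling of a Boltzmann distribution; it is an illustration of the general technique only, not the authors' hardware, FPGA design, or code, and the couplings and biases are made-up values.

    # Software emulation of p-bit (probabilistic bit) sampling; illustrative only.
    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical 3-p-bit network: symmetric couplings J and biases h define a
    # Boltzmann (Ising-like) energy landscape. Values are chosen for illustration.
    J = np.array([[ 0.0,  1.0, -0.5],
                  [ 1.0,  0.0,  0.8],
                  [-0.5,  0.8,  0.0]])
    h = np.array([0.2, -0.1, 0.0])
    m = rng.choice([-1, 1], size=3)        # bipolar p-bit states

    def pbit_update(i, beta=1.0):
        """One asynchronous p-bit update: a random output biased by its input."""
        I = beta * (J[i] @ m + h[i])       # synaptic input to p-bit i
        # An sMTJ yields a tanh-shaped switching probability; emulated here in software.
        m[i] = 1 if rng.random() < 0.5 * (1.0 + np.tanh(I)) else -1

    # Repeated asynchronous updates approximate Gibbs sampling of the
    # Boltzmann distribution defined by J and h.
    for _ in range(10_000):
        pbit_update(rng.integers(3))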

    Multi-core devices for safety-critical systems: a survey

    Multi-core devices are envisioned to support the development of next-generation safety-critical systems, enabling the on-chip integration of functions of different criticality. This integration provides multiple potential system-level benefits such as cost, size, power, and weight reduction. However, safety certification becomes a challenge and several fundamental safety technical requirements must be addressed, such as temporal and spatial independence, reliability, and diagnostic coverage. This survey provides a categorization and overview, at different device abstraction levels (nanoscale, component, and device), of selected key research contributions that support compliance with these fundamental safety requirements. This work has been partially supported by the Spanish Ministry of Economy and Competitiveness under grant TIN2015-65316-P, the Basque Government under grant KK-2019-00035, and the HiPEAC Network of Excellence. The Spanish Ministry of Economy and Competitiveness has also partially supported Jaume Abella under a Ramon y Cajal postdoctoral fellowship (RYC-2013-14717).

    Performance Analysis of Packet Processing Systems

    This thesis investigates the use of measurement, simulation, and modeling methods for the performance analysis of packet processing systems, more precisely hardware-accelerated multiprocessor system-on-chip (MPSoC) devices running task-parallel applications. To meet tight latency and throughput requirements, the devices often incorporate complex hardware-accelerated packet scheduling mechanisms. At the same time, due to the complexity of these systems, different software abstractions, such as task-based programming models, are used to develop packet processing applications. These challenges, together with the dynamic characteristics of the packet streams, make the performance analysis of packet processing systems non-trivial. We demonstrate that, with extended queue disciplines and support for modeling parallelism, the resource network methodology is a viable approach for modeling complex MPSoC-based systems running task-based parallel applications on dynamic workloads. The main contributions of our work are threefold. First, we have extended the toolset of an existing in-house modeling and simulation tool, the Performance Simulation Environment. The extensions enable modeling of user-definable queue disciplines, which in turn enables flexible modeling of the complex hardware interactions of MPSoCs and the parallelism of task-based programming models. Second, we have studied, instrumented, and measured the characteristics of a packet processing system. Finally, we have modeled a multi-blade packet processing system with customizable workload and task-parallel application models, and run simulation experiments. In both experiments, the model behaves as expected. According to the experiment results, the resource network concept appears to be a viable tool for the performance analysis of packet processing systems. The chosen abstraction level provides the desired balance between functionality, ease of use, and simulation performance.
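    As a rough illustration of the modeling style described above (a sketch of my own, not the thesis's Performance Simulation Environment or its models), a shared processing resource with a pluggable queue discipline can be expressed as a small discrete-event simulation; swapping the discipline changes the latency distribution that packets experience.

    # Toy discrete-event model of one processing resource with a pluggable
    # queue discipline; illustrative only, not the thesis's tooling.
    import random

    def simulate(n=20_000, lam=0.8, mu=1.0, pick=lambda q: 0):
        """Single shared resource; `pick` chooses which waiting packet is served next."""
        random.seed(42)
        t, arrivals = 0.0, []
        for _ in range(n):                         # Poisson packet arrivals
            t += random.expovariate(lam)
            arrivals.append(t)
        i, server_free, waiting, lat = 0, 0.0, [], []
        while i < len(arrivals) or waiting:
            # Admit packets that arrive before the server next becomes free
            # (or the next packet, if the server would otherwise sit idle).
            while i < len(arrivals) and (not waiting or arrivals[i] <= server_free):
                waiting.append(arrivals[i])
                i += 1
            arrived = waiting.pop(pick(waiting))   # apply the queue discipline
            start = max(arrived, server_free)
            server_free = start + random.expovariate(mu)
            lat.append(server_free - arrived)
        lat.sort()
        return sum(lat) / len(lat), lat[int(0.99 * len(lat))]

    # FIFO versus LIFO discipline on the same arrival stream: mean latency is
    # similar, but the tail (99th percentile) differs noticeably.
    print("FIFO (mean, p99):", simulate(pick=lambda q: 0))
    print("LIFO (mean, p99):", simulate(pick=lambda q: len(q) - 1))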

    Research and Education in Computational Science and Engineering

    Over the past two decades the field of computational science and engineering (CSE) has penetrated both basic and applied research in academia, industry, and laboratories to advance discovery, optimize systems, support decision-makers, and educate the scientific and engineering workforce. Informed by centuries of theory and experiment, CSE performs computational experiments to answer questions that neither theory nor experiment alone is equipped to answer. CSE provides scientists and engineers of all persuasions with algorithmic inventions and software systems that transcend disciplines and scales. Carried on a wave of digital technology, CSE brings the power of parallelism to bear on troves of data. Mathematics-based advanced computing has become a prevalent means of discovery and innovation in essentially all areas of science, engineering, technology, and society; and the CSE community is at the core of this transformation. However, a combination of disruptive developments, including the architectural complexity of extreme-scale computing, the data revolution that engulfs the planet, and the specialization required to follow the applications to new frontiers, is redefining the scope and reach of the CSE endeavor. This report describes the rapid expansion of CSE and the challenges to sustaining its bold advances. The report also presents strategies and directions for CSE research and education for the next decade. Comment: Major revision, to appear in SIAM Review.