Agile SoC Development with Open ESP
ESP is an open-source research platform for heterogeneous SoC design. The
platform combines a modular tile-based architecture with a variety of
application-oriented flows for the design and optimization of accelerators. The
ESP architecture is highly scalable and strikes a balance between regularity
and specialization. The companion methodology raises the level of abstraction
to system-level design and enables an automated flow from software and hardware
development to full-system prototyping on FPGA. For application developers, ESP
offers domain-specific automated solutions to synthesize new accelerators for
their software and to map complex workloads onto the SoC architecture. For
hardware engineers, ESP offers automated solutions to integrate their
accelerator designs into the complete SoC. Conceived as a heterogeneous
integration platform and tested through years of teaching at Columbia
University, ESP supports the open-source hardware community by providing a
flexible platform for agile SoC development.
Comment: Invited Paper at the 2020 International Conference On Computer Aided
Design (ICCAD) - Special Session on Opensource Tools and Platforms for Agile
Development of Specialized Architecture
IoT and Digitization Will Reconnect System Engineering and Science
The fully connected world is quickly becoming a reality. Architects and developers of this new world must understand both the hardware and software basics of IoT and IIoT systems as well as the proven way to deal with the complexities of the integration of sensors, processors, wireless connectivity, edge to cloud networks, data partitioning and processing, AI, machine language, digital threads and twins, and much more. Such complexity can only be handled with a systems-of-systems (SoS) engineering approach.
But while systems engineering may hold many of the solutions to IoT challenges, systems engineering must evolve from its traditional role. Some have even suggested that the data requirements and digitization of the IoT and corresponding digital threads are putting the engineering back into systems engineering via model-based designs. This will also help reconnect system engineering to system science.
This presentation will show how IoT hardware and software technologies are changing the traditional systems engineering approach. Further, professionals who are prepared in both the basics of IoT and systems engineering will stand a better chance of competing in the IoT space.
The Intellectual Property Business Model (IP-BM) - Lessons from ARM Plc.
Under what conditions can technology ventures design and implement a sustainable Intellectual Property Business Model? Many new firms in technology-intensive domains seek to adopt such models. This enables them to focus limited resources on core areas of competence, but raises significant challenges. These include the reluctance of customers to license the technology until it is generating value in applications and the experimentation needed to identify appropriate applications and markets, especially for generic technologies. Complementary assets must be fostered to transfer the technology across the value chain. Protecting the IP during scale up is problematic. In this paper we offer a detailed study of a firm that succeeded in overcoming these challenges, ARM plc. We attempt to identify what aspects of their experience (of IP generated growth) are generalizable to other firms. We define a business model as the way the firm is organized to create and capture value. We explore the kind of business ecosystem that must be nurtured for a firm to sustain growth through value creation and capture based on an intellectual property business model and find that reciprocity of benefits across the business ecosystem is needed
NPS: A Framework for Accurate Program Sampling Using Graph Neural Network
With the end of Moore's Law, there is a growing demand for rapid
architectural innovations in modern processors, such as RISC-V custom
extensions, to continue performance scaling. Program sampling is a crucial step
in microprocessor design, as it selects representative simulation points for
workload simulation. While SimPoint has been the de-facto approach for decades,
its limited expressiveness with Basic Block Vector (BBV) requires
time-consuming human tuning, often taking months, which impedes fast innovation
and agile hardware development. This paper introduces Neural Program Sampling
(NPS), a novel framework that learns execution embeddings using dynamic
snapshots of a Graph Neural Network. NPS deploys AssemblyNet for embedding
generation, leveraging an application's code structures and runtime states.
AssemblyNet serves as NPS's graph model and neural architecture, capturing a
program's behavior in aspects such as data computation, code path, and data
flow. AssemblyNet is trained with a data prefetch task that predicts
consecutive memory addresses.
In the experiments, NPS outperforms SimPoint by up to 63%, reducing the
average error by 38%. Additionally, NPS demonstrates strong robustness with
increased accuracy, reducing the expensive accuracy tuning overhead.
Furthermore, NPS shows higher accuracy and generality than the state-of-the-art
GNN approach in code behavior learning, enabling the generation of high-quality
execution embeddings
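
For readers unfamiliar with the SimPoint baseline mentioned above, the following is a minimal sketch of BBV-based program sampling: each execution interval is summarized by a Basic Block Vector, the vectors are clustered with k-means, and the interval closest to each centroid is kept as a simulation point. It is an illustration only, not the SimPoint tool or NPS. The function names, the plain k-means loop, and the fixed iteration count are all assumptions, and the sketch omits steps the real tool performs, such as dimensionality reduction and automatic selection of k.

```python
import random

def normalize(bbv):
    """Scale a basic-block count vector so that its entries sum to 1."""
    total = sum(bbv)
    return [x / total for x in bbv] if total else list(bbv)

def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pick_simulation_points(bbvs, k, iters=20, seed=0):
    """Cluster per-interval BBVs with k-means (hypothetical helper).

    Returns a list of (interval_index, weight) pairs, one per non-empty
    cluster, where weight is the fraction of intervals in that cluster.
    """
    rng = random.Random(seed)
    vecs = [normalize(v) for v in bbvs]
    centroids = [list(v) for v in rng.sample(vecs, k)]

    def assign_all():
        return [min(range(k), key=lambda c: dist2(v, centroids[c])) for v in vecs]

    for _ in range(iters):
        assign = assign_all()
        for c in range(k):
            members = [v for v, a in zip(vecs, assign) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]

    assign = assign_all()
    points = []
    for c in range(k):
        idxs = [i for i, a in enumerate(assign) if a == c]
        if idxs:
            rep = min(idxs, key=lambda i: dist2(vecs[i], centroids[c]))
            points.append((rep, len(idxs) / len(vecs)))
    return points
```

A whole-program estimate would then be the weighted sum of the metric measured at each selected interval, using the returned weights.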
Power Profiling Model for RISC-V Core
The reduction of power consumption is considered to be a critical factor for efficient computation of microprocessors. Therefore, it is necessary to implement a power management system that is aware of the computational load of the CPU cores. To enable such power management, this project aims to develop a power profiling model for the RISC-V core. The SyDeKick verification environment was used to develop the power profiling models. Additionally, Python-controlled mixed-mode simulations of C programs compiled for A-Core were conducted to obtain the data needed for the power profiling of the digital circuitry. The proposed methodology can provide a time-varying power consumption profile for the A-Core RISC-V microprocessor core, which depends on software, voltage, and clock frequency. The results of this project allow for the creation of parameterized power profiles for the A-Core, which can contribute to more efficient and sustainable computing.
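
To make the dependence on software, voltage, and clock frequency concrete, here is a small sketch based on the standard first-order CMOS dynamic power relation, P = a * C * V^2 * f. This is a textbook approximation used for illustration, not the model developed in the project. The function names, the per-window activity trace, and the constant static-power term are assumptions.

```python
def dynamic_power(activity, c_eff, vdd, freq_hz):
    """First-order dynamic power in watts: activity * C_eff * Vdd^2 * f.

    activity: switching activity factor (0..1), driven by the running software
    c_eff:    effective switched capacitance in farads (assumed constant)
    vdd:      supply voltage in volts
    freq_hz:  clock frequency in hertz
    """
    return activity * c_eff * vdd ** 2 * freq_hz

def power_profile(activity_trace, c_eff, vdd, freq_hz, p_static=0.0):
    """Time-varying power: one value per time window of the activity trace."""
    return [dynamic_power(a, c_eff, vdd, freq_hz) + p_static
            for a in activity_trace]

# Example: a hypothetical trace where the program alternates idle and busy phases.
trace = [0.05, 0.40, 0.45, 0.10]
profile = power_profile(trace, c_eff=1e-9, vdd=1.0, freq_hz=100e6, p_static=1e-3)
```

Under this approximation, halving the supply voltage at a fixed frequency cuts dynamic power to a quarter, which is why a load-aware power manager benefits from a profile parameterized by voltage and frequency.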
Experimental Evaluation of On-Board Contact-Graph Routing Solutions for Future Nano-Satellite Constellations
Hardware processing performance and storage capability for nanosatellites have increased notably in recent years. Unfortunately, this progress is not observed at the same pace in transmission data rate, which is mostly limited by the available power in reduced and constrained platforms. Thus, space-to-ground data transfer becomes the operations bottleneck of most modern space applications. As channel rates are approaching the Shannon limit, alternative solutions to manage the data transmission are in the spotlight. Among these, networked nanosatellite constellations can cooperatively offload data to neighboring nodes via frequent inter-satellite link (ISL) opportunities in order to augment the overall volume and reduce the end-to-end data delivery delay. Nevertheless, the computation of efficient multi-hop routes needs to consider not only present satellite and ground segments as nodes, but also a non-trivial time evolution of the system dictated by orbital dynamics. Moreover, the process should properly model and rely on a considerable amount of available information from each node's configuration and the network status obtained from recent telemetry. Also, in most practical cases, the forwarding decision shall happen in orbit, where satellites can react in a timely manner to local or in-transit traffic demands. In this context, it is appealing to investigate the applicability of adequate algorithmic routing approaches running on state-of-the-art nanosatellite on-board computers. In this work, we present the first implementation of the Contact Graph Routing (CGR) algorithm developed by the Jet Propulsion Laboratory (JPL, NASA) for a nanosatellite on-board computer. We describe CGR, including a Dijkstra adaptation operating at its core as well as protocol aspects described in the CCSDS Schedule-Aware Bundle Routing (SABR) recommended standard. Based on JPL's Interplanetary Overlay Network (ION) software stack, we build a strong baseline to develop the first CGR implementation for nanosatellites.
We make our code available to the public and adapt it to the GomSpace toolchain in order to compile it for the NanoMind A712C on-board flight hardware based on a 32-bit ARM7 RISC CPU processor. Next, we evaluate its performance in terms of CPU execution time (Tick counts) and memory resources for increasingly complex satellite networks. Obtained metrics serve as compelling evidence of the polynomial scalability of the approach, matching the predicted theoretical behavior. Furthermore, we are able to determine that the evaluated hardware and implementation can cope with satellite networks of more than 120 nodes and 1200 contact opportunities
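
The idea of running Dijkstra over a schedule of contacts can be sketched as an earliest-arrival search. The code below is a simplified illustration, not the ION or SABR implementation: it searches over nodes rather than over contacts, and it ignores contact volume, queueing, and bundle expiry. The `Contact` record and its field names are assumptions made for this example.

```python
import heapq
from collections import namedtuple

# Hypothetical contact record: a link from `src` to `dst` that is usable
# during [start, end), with one-way light time `owlt` (all in seconds).
Contact = namedtuple("Contact", "src dst start end owlt")

def earliest_arrival(contacts, source, dest, t0=0.0):
    """Dijkstra-style search for the earliest arrival time at `dest`.

    Returns (arrival_time, route), where route is the list of contacts used,
    or (None, []) if `dest` cannot be reached with the given contact plan.
    """
    best = {source: t0}   # earliest known arrival time per node
    prev = {}             # node -> (previous node, contact used to reach it)
    heap = [(t0, source)]
    while heap:
        t, node = heapq.heappop(heap)
        if t > best.get(node, float("inf")):
            continue      # stale heap entry
        if node == dest:
            break
        for c in contacts:
            if c.src != node or t >= c.end:
                continue  # wrong sender, or the contact has already closed
            depart = max(t, c.start)   # wait for the contact to open if needed
            arrive = depart + c.owlt
            if arrive < best.get(c.dst, float("inf")):
                best[c.dst] = arrive
                prev[c.dst] = (node, c)
                heapq.heappush(heap, (arrive, c.dst))
    if dest not in best:
        return None, []
    route, node = [], dest
    while node != source:
        node, c = prev[node]
        route.append(c)
    route.reverse()
    return best[dest], route

# Example contact plan (made-up values): relaying via B beats the late direct link.
plan = [
    Contact("A", "B", 0, 10, 1),
    Contact("B", "C", 20, 30, 1),
    Contact("A", "C", 50, 60, 1),
]
arrival, route = earliest_arrival(plan, "A", "C")
```

Scanning the full contact list at every step keeps the sketch short; an on-board implementation would more likely index contacts by sending node to limit CPU time and memory use.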
A Modular Platform for Adaptive Heterogeneous Many-Core Architectures
Multi-/many-core heterogeneous architectures are shaping current and upcoming generations of compute-centric platforms, which are widely used in devices ranging from mobile and wearable devices to high-performance cloud computing servers. Heterogeneous many-core architectures seek to achieve an order of magnitude higher energy efficiency as well as computing performance scaling by replacing homogeneous and power-hungry general-purpose processors with multiple heterogeneous compute units supporting multiple core types and domain-specific accelerators. The shift from homogeneous architectures to complex heterogeneous systems has been heavily adopted by chip designers and the silicon industry for more than a decade. Recent silicon chips are based on a heterogeneous SoC which combines a scalable number of heterogeneous processing units of different types (e.g. CPU, GPU, custom accelerator).
This shift in computing paradigm is associated with several system-level design challenges related to the integration and communication between a highly scalable number of heterogeneous compute units as well as SoC peripherals and storage units. Moreover, the increasing design complexity makes the production of heterogeneous SoC chips a monopoly of big market players due to the increasing development and design costs. Accordingly, recent initiatives towards agile hardware development, open-source tools, and open microarchitectures aim to democratize silicon chip production for academic and commercial usage.
Agile hardware development aims to reduce development costs by providing an ecosystem for open-source hardware microarchitectures and hardware design processes. Therefore, heterogeneous many-core development and customization will be relatively less complex and less time-consuming than conventional design process methods.
In order to provide a modular and agile many-core development approach, this dissertation proposes a development platform for heterogeneous and self-adaptive many-core architectures consisting of a scalable number of heterogeneous tiles that maintain design regularity features while supporting heterogeneity. The proposed platform hides the integration complexities by supporting modular tile architectures for general-purpose processing cores supporting multi-instruction set architectures (multi-ISAs) and custom hardware accelerators. By leveraging field-programmable gate arrays (FPGAs), the self-adaptive feature of the many-core platform can be achieved by using dynamic and partial reconfiguration (DPR) techniques.
This dissertation realizes the proposed modular and adaptive heterogeneous many-core platform through three main contributions. The first contribution proposes and realizes a many-core architecture for heterogeneous ISAs. It provides a modular and reusable tile-based architecture for several heterogeneous ISAs based on the open-source RISC-V ISA. The modular tile-based architecture features a configurable number of processing cores with different RISC-V ISAs and different memory hierarchies.
To increase the level of heterogeneity to support the integration of custom hardware accelerators, a novel hybrid memory/accelerator tile architecture is developed and realized as the second contribution. The hybrid tile is a modular and reusable tile that can be configured at run-time to operate as a scratchpad shared memory between compute tiles or as an accelerator tile hosting a local hardware accelerator logic. The hybrid tile is designed and implemented to be seamlessly integrated into the proposed tile-based platform.
The third contribution deals with the self-adaptation features by providing a reconfiguration management approach to internally control the DPR process through processing cores (RISC-V based). The internal reconfiguration process relies on a novel DPR controller targeting FPGA design flow for RISC-V-based SoC to change the types and functionalities of compute tiles at run-time
Extending a modern RISC-V vector accelerator with direct access to the memory hierarchy through AMBA 5 CHI.
The BSC is developing a decoupled RISC-V-based vector accelerator. In the previous version of this project, the vector accelerator used the Open Vector Interface (OVI) to access the shared L2 cache through a scalar processor core. In turn, the processor core accessed the shared L2 cache via the NoC. This two-level memory access mechanism introduces a significant latency overhead to the system. Moreover, memory access time is critical to the performance of the accelerator.
To attack this problem, this project designs an AMBA 5 CHI interface IP that provides the accelerator with direct access to the NoC, thus reducing memory latency. This project aims to obtain a functional design with basic operations to facilitate the integration phase. Therefore, the interface IP has no specific area or power constraints. The IP design covers different AMBA 5 CHI protocol aspects, assembles the architecture, and proposes a set of tests to verify the functionality of the designed module. Final results show that this IP can successfully provide the accelerator with direct access to the shared L2 cache, replacing the OVI interface and improving performance. We also give pointers on how to further improve the AMBA 5 CHI interface IP in terms of performance and functionalities
The Virtual Block Interface: A Flexible Alternative to the Conventional Virtual Memory Framework
Computers continue to diversify with respect to system designs, emerging
memory technologies, and application memory demands. Unfortunately, continually
adapting the conventional virtual memory framework to each possible system
configuration is challenging, and often results in performance loss or requires
non-trivial workarounds. To address these challenges, we propose a new virtual
memory framework, the Virtual Block Interface (VBI). We design VBI based on the
key idea that delegating memory management duties to hardware can reduce the
overheads and software complexity associated with virtual memory. VBI
introduces a set of variable-sized virtual blocks (VBs) to applications. Each
VB is a contiguous region of the globally-visible VBI address space, and an
application can allocate each semantically meaningful unit of information
(e.g., a data structure) in a separate VB. VBI decouples access protection from
memory allocation and address translation. While the OS controls which programs
have access to which VBs, dedicated hardware in the memory controller manages
the physical memory allocation and address translation of the VBs. This
approach enables several architectural optimizations to (1) efficiently and
flexibly cater to different and increasingly diverse system configurations, and
(2) eliminate key inefficiencies of conventional virtual memory. We demonstrate
the benefits of VBI with two important use cases: (1) reducing the overheads of
address translation (for both native execution and virtual machine
environments), as VBI reduces the number of translation requests and associated
memory accesses; and (2) two heterogeneous main memory architectures, where VBI
increases the effectiveness of managing fast memory regions. For both cases,
VBI significantly improves performance over conventional virtual memory
Standardisation of Practices in Open Source Hardware
Standardisation is an important component in the maturation of any field of
technology. It contributes to the formation of a recognisable identity and
enables interactions with a wider community. This article reviews past and
current standardisation initiatives in the field of Open Source Hardware (OSH).
While early initiatives focused on aspects such as licencing, intellectual
property and documentation formats, recent efforts extend to ways for users to
exercise their rights under open licences and to keep OSH projects discoverable
and accessible online. We specifically introduce two standards that are
currently being released and call for early users and contributors, the DIN
SPEC 3105 and the Open Know How Manifest Specification. Finally, we reflect on
challenges around standardisation in the community and relevant areas for
future development such as an open tool chain, modularity and hardware specific
interface standards.
Comment: 9 Pages without abstract and references (else 13), no figure