Search CORE

1,624 research outputs found

A virtual workbench for electric Lego labkits

Author: Grabeklis Jay M. (Jay Michael)
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/1996
Field of study

Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1996.by Jay M. Grabeklis.M.Eng

DSpace@MIT

Clock routing for high performance microprocessor designs.

Author
Publication venue
Publication date: 01/01/2011
Field of study

Tian, Haitong.Chinese abstract is on unnumbered page.Thesis (M.Phil.)--Chinese University of Hong Kong, 2011.Includes bibliographical references (p. 65-74).Abstracts in English and Chinese.Abstract --- p.iAcknowledgement --- p.iiiChapter 1 --- Introduction --- p.1Chapter 1.1 --- Motivations --- p.1Chapter 1.2 --- Our Contributions --- p.2Chapter 1.3 --- Organization of the Thesis --- p.3Chapter 2 --- Background Study --- p.4Chapter 2.1 --- Traditional Clock Routing Problem --- p.4Chapter 2.2 --- Tree-Based Clock Routing Algorithms --- p.5Chapter 2.2.1 --- Clock Routing Using H-tree --- p.5Chapter 2.2.2 --- Method of Means and Medians(MMM) --- p.6Chapter 2.2.3 --- Geometric Matching Algorithm (GMA) --- p.8Chapter 2.2.4 --- Exact Zero-Skew Algorithm --- p.9Chapter 2.2.5 --- Deferred Merge Embedding (DME) --- p.10Chapter 2.2.6 --- Boundary Merging and Embedding (BME) Algorithm --- p.14Chapter 2.2.7 --- Planar Clock Routing Algorithm --- p.17Chapter 2.2.8 --- Useful-skew Tree Algorithm --- p.18Chapter 2.3 --- Non-Tree Clock Distribution Networks --- p.19Chapter 2.3.1 --- Grid (Mesh) Structure --- p.20Chapter 2.3.2 --- Spine Structure --- p.20Chapter 2.3.3 --- Hybrid Structure --- p.21Chapter 2.4 --- Post-grid Clock Routing Problem --- p.22Chapter 2.5 --- Limitations of the Previous Work --- p.24Chapter 3 --- Post-Grid Clock Routing Problem --- p.26Chapter 3.1 --- Introduction --- p.26Chapter 3.2 --- Problem Definition --- p.27Chapter 3.3 --- Our Approach --- p.30Chapter 3.3.1 --- Delay-driven Path Expansion Algorithm --- p.31Chapter 3.3.2 --- Pre-processing to Connect Critical ports --- p.34Chapter 3.3.3 --- Post-processing to Reduce Capacitance --- p.36Chapter 3.4 --- Experimental Results --- p.39Chapter 3.4.1 --- Experiment Setup --- p.39Chapter 3.4.2 --- Validations of the Delay and Slew Estimation --- p.39Chapter 3.4.3 --- Comparisons with the Tree Grow (TG) Approach --- p.41Chapter 3.4.4 --- Lowest Achievable Delays --- p.42Chapter 3.4.5 --- Simulation Results --- p.42Chapter 4 --- Non-tree Based Post-Grid Clock Routing Problem --- p.44Chapter 4.1 --- Introduction --- p.44Chapter 4.2 --- Handling Ports with Large Load Capacitances --- p.46Chapter 4.2.1 --- Problem Ports Identification --- p.47Chapter 4.2.2 --- Non-Tree Construction --- p.47Chapter 4.2.3 --- Wire Link Selection --- p.48Chapter 4.3 --- Path Expansion in Non-tree Algorithm --- p.51Chapter 4.4 --- Limitations of the Non-tree Algorithm --- p.51Chapter 4.5 --- Experimental Results --- p.51Chapter 4.5.1 --- Experiment Setup --- p.51Chapter 4.5.2 --- Validations of the Delay and Slew Estimation --- p.52Chapter 4.5.3 --- Lowest Achievable Delays --- p.53Chapter 4.5.4 --- Results on New Benchmarks --- p.53Chapter 4.5.5 --- Simulation Results --- p.55Chapter 5 --- Efficient Partitioning-based Extension --- p.57Chapter 5.1 --- Introduction --- p.57Chapter 5.2 --- Partition-based Extension --- p.58Chapter 5.3 --- Experimental Results --- p.61Chapter 5.3.1 --- Experiment Setup --- p.61Chapter 5.3.2 --- Running Time Improvement with Partitioning Technique --- p.61Chapter 6 --- Conclusion --- p.63Bibliography --- p.6

CUHK Digital Repository

Photonics design tool for advanced CMOS nodes

Author: Alloatti Luca
Popovic Milos
Ram Rajeev Jagga
Stojanovic Vladimir
Wade Mark
Publication venue: 'Institution of Engineering and Technology (IET)'
Publication date: 01/04/2015
Field of study

Recently, the authors have demonstrated large-scale integrated systems with several million transistors and hundreds of photonic elements. Yielding such large-scale integrated systems requires a design-for-manufacture rigour that is embodied in the 10 000 to 50 000 design rules that these designs must comply within advanced complementary metal-oxide semiconductor manufacturing. Here, the authors present a photonic design automation tool which allows automatic generation of layouts without design-rule violations. This tool is written in SKILL, the native language of the mainstream electric design automation software, Cadence. This allows seamless integration of photonic and electronic design in a single environment. The tool leverages intuitive photonic layer definitions, allowing the designer to focus on the physical properties rather than on technology-dependent details. For the first time the authors present an algorithm for removal of design-rule violations from photonic layouts based on Manhattan discretisation, Boolean and sizing operations. This algorithm is not limited to the implementation in SKILL, and can in principle be implemented in any scripting language. Connectivity is achieved with software-defined waveguide ports and low-level procedures that enable auto-routing of waveguide connections.Comment: 5 pages, 10 figure

arXiv.org e-Print Archive

DSpace@MIT

Crossref

Design methodology and productivity improvement in high speed VLSI circuits

Author: Hossain KM Mozammel
Publication venue: Colorado State University. Libraries
Publication date: 01/01/2017
Field of study

2017 Spring.Includes bibliographical references.To view the abstract, please see the full text of the document

Mountain Scholar (Digital Collections of Colorado and Wyoming)

Recommended from our members

Design and Implementation of a High Performance Network Processor with Dynamic Workload Management

Author: Duggisetty Padmaja
Publication venue: ScholarWorks@UMass Amherst
Publication date: 23/11/2015
Field of study

Internet plays a crucial part in today\u27s world. Be it personal communication, business transactions or social networking, internet is used everywhere and hence the speed of the communication infrastructure plays an important role. As the number of users increase the network usage increases i.e., the network data rates ramped up from a few Mb/s to Gb/s in less than a decade. Hence the network infrastructure needed a major upgrade to be able to support such high data rates. Technological advancements have enabled the communication links like optical fibres to support these high bandwidths, but the processing speed at the nodes remained constant. This created a need for specialised devices for packet processing in order to match the increasing line rates which led to emergence of network processors. Network processors were both programmable and flexible. To support the growing number of internet applications, a single core network processor has transformed into a multi/many core network processor with multiple cores on a single chip rather than just one core. This improved the packet processing speeds and hence the performance of a network node. Multi-core network processors catered to the needs of a high bandwidth networks by exploiting the inherent packet-level parallelism in a network. But these processors still had intrinsic challenges like load balancing. In order to maximise throughput of these multi-core network processors, it is important to distribute the traffic evenly across all the cores. This thesis describes a multi-core network processor with dynamic workload management. A multi-core network processor, which performs multiple applications is designed to act as a test bed for an effective workload management algorithm. An effective workload management algorithm is designed in order to distribute the workload evenly across all the available cores and hence maximise the performance of the network processor. Runtime statistics of all the cores were collected and updated at run time to aid in deciding the application to be performed on a core to to enable even distribution of workload among the cores. Hence, when an overloading of a core is detected, the applications to be performed on the cores are re-assigned. For testing purposes, we built a flexible and a reusable platform on NetFPGA 10G board which uses a FPGA-based approach to prototyping network devices. The performance of the designed workload management algorithm is tested by measuring the throughput of the system for varying workloads

ScholarWorks@UMass Amherst

SPROC: A multiple-processor DSP IC

Author: Davis R.
Publication venue
Publication date
Field of study

A large, single-chip, multiple-processor, digital signal processing (DSP) integrated circuit (IC) fabricated in HP-Cmos34 is presented. The innovative architecture is best suited for analog and real-time systems characterized by both parallel signal data flows and concurrent logic processing. The IC is supported by a powerful development system that transforms graphical signal flow graphs into production-ready systems in minutes. Automatic compiler partitioning of tasks among four on-chip processors gives the IC the signal processing power of several conventional DSP chips

NASA Technical Reports Server

Dynamically Reconfigurable Systolic Array Accelerators: A Case Study with Extended Kalman Filter and Discrete Wavelet Transform Algorithms

Author: Barnes Robert C
Publication venue: DigitalCommons@USU
Publication date: 01/05/2009
Field of study

Field programmable grid arrays (FPGA) are increasingly being adopted as the primary on-board computing system for autonomous deep space vehicles. There is a need to support several complex applications for navigation and image processing in a rapidly responsive on-board FPGA-based computer. This requires exploring and combining several design concepts such as systolic arrays, hardware-software partitioning, and partial dynamic reconfiguration. A microprocessor/co-processor design that can accelerate two single precision oating-point algorithms, extended Kalman lter and a discrete wavelet transform, is presented. This research makes three key contributions. (i) A polymorphic systolic array framework comprising of recofigurable partial region-based sockets to accelerate algorithms amenable to being mapped onto linear systolic arrays. When implemented on a low end Xilinx Virtex4 SX35 FPGA the design provides a speedup of at least 4.18x and 6.61x over a state of the art microprocessor used in spacecraft systems for the extended Kalman lter and discrete wavelet transform algorithms, respectively. (ii) Switchboxes to enable communication between static and partial reconfigurable regions and a simple protocol to enable schedule changes when a socket\u27s contents are dynamically reconfigured to alter the concurrency of the participating systolic arrays. (iii) A hybrid partial dynamic reconfiguration method that combines Xilinx early access partial reconfiguration, on-chip bitstream decompression, and bitstream relocation to enable fast scaling of systolic arrays on the PolySAF. This technique provided a 2.7x improvement in reconfiguration time compared to an o-chip partial reconfiguration technique that used a Flash card on the FPGA board, and a 44% improvement in BRAM usage compared to not using compression

DigitalCommons@USU

Globally asynchronous locally synchronous configurable array architecture for algorithm embeddings

Author: Gao Bo
Publication venue: The University of Edinburgh
Publication date: 01/01/1996
Field of study

Edinburgh Research Archive

Concepts of Communication and Synchronization in FPGA-Based Embedded Multiprocessor Systems

Author: David Antonio-Torres
Publication venue: 'IntechOpen'
Publication date: 16/03/2012
Field of study

IntechOpen