2,547 research outputs found

    Course grained low power design flow using UPF

    Increased system complexity has led to the replacement of the traditional bottom-up design flow by a systematic hierarchical design flow. The main motivation behind this approach is the increasing difficulty of realizing complex systems in hardware. With decreasing channel lengths, a few key problems arise in the design flow: timing closure, design sign-off, routing complexity, signal integrity, and power dissipation. Specifically, minimizing power dissipation is critical in several high-end processors. In high-end processors, the design complexity contributes to the overall dynamic power while the decreasing transistor size results in static power dissipation. This research aims at optimizing the design flow for power and timing using the Unified Power Format (UPF). UPF provides a strategic format to specify power-aware design information at every stage in the flow. The power reduction techniques enforced in this research are multi-voltage, multi-threshold voltage (Vth), and power gating with state retention. An inherent design challenge addressed in this research is the choice of power optimization techniques as the flow advances from synthesis to physical design. A top-down digital design flow for a 32-bit MIPS RISC processor has been implemented with and without the UPF synthesis flow in 65 nm technology. The UPF synthesis is implemented with two voltages, 1.08 V and 0.864 V (Multi-VDD). Area, power and timing metrics are analyzed for the flows developed. Power savings of about 20% are achieved in the design flow with the 'multi-threshold' power technique compared to the design flow with no low-power techniques employed. Similarly, 30% power savings are achieved in the design flow with UPF compared to the design flow with the 'multi-threshold' power technique. Thus, a cumulative power saving of 42% has been achieved in the complete power-efficient design flow (UPF) compared to the generic top-down standard flow with no power-saving techniques employed. This is substantiated by the low-voltage operation of modules in the design, the reduction in clock switching power by gating clocks, and the extensive use of HVT and LVT standard cells for implementation. The UPF synthesis flow had the worst timing slack and more area than the 'multi-threshold' and generic flows. The area increase with UPF is approximately 15%; a significant source of this increase is the additional power-controlling logic added.
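The percentages quoted in this abstract can be cross-checked by compounding the two reductions. The following is a minimal sketch, assuming each saving is a fractional reduction relative to the flow before it; the function name `cumulative_saving` is illustrative and does not come from the thesis.

```python
def cumulative_saving(*stage_savings):
    """Combine successive fractional power savings into one overall saving.

    Each saving is assumed to be relative to the flow before it, so the
    remaining power fractions multiply.
    """
    remaining = 1.0
    for saving in stage_savings:
        remaining *= (1.0 - saving)
    return 1.0 - remaining


# Rounded figures quoted in the abstract: about 20%, then 30%.
total = cumulative_saving(0.20, 0.30)
print(f"{total:.0%}")
```

With these rounded inputs the result is 44%, close to the reported 42%; the gap is consistent with the first figure being given only as "about 20%".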

    Design for testability of a latch-based design

    The purpose of this thesis was to decrease the area of digital logic in a power management integrated circuit (PMIC) by replacing selected flip-flops with latches. The thesis consists of a theory part that provides the background and a practical part that presents a latch register design and a design for testability (DFT) method for achieving an acceptable level of manufacturing fault coverage for it. The total area was decreased by replacing the flip-flops of read-write and one-time programmable registers with latches. One set of negative-level-active primary latches was shared with all the positive-level-active latch registers in the same register bank. Clock gating was used to select which latch register the write data was loaded into from the primary latches. The latches were made transparent during the shift operation of partial scan testing. The observability of the latch register clock gating logic was improved by leaving the first bit of each latch register as a flip-flop. The controllability was improved by inserting control points. The latch register design developed in this thesis resulted in a total area decrease of 5% and a register bank area decrease of 15% compared to a flip-flop-based reference design. The latch register design maintains the same stuck-at fault coverage as the reference design.

    Low Power Processor Architectures and Contemporary Techniques for Power Optimization – A Review

    Technological evolution has significantly increased the number of transistors for a given die area and raised switching speeds from a few MHz to the GHz range. This simultaneous decline in size and boost in performance demands a lower supply voltage and effective management of power dissipation in chips with millions of transistors. This has triggered a substantial amount of research into power reduction techniques in almost every aspect of the chip, particularly the processor cores it contains. This paper presents an overview of techniques for achieving power efficiency mainly at the processor core level, but also visits related domains such as buses and memories. Various processor parameters and features, such as supply voltage, clock frequency, cache and pipelining, can be optimized to reduce the power consumption of the processor. This paper discusses various ways in which these parameters can be optimized. Emerging power-efficient processor architectures are also reviewed, and research activities are discussed which should help the reader identify how these factors contribute to power consumption. Some of these concepts are already established, whereas others are still active research areas. © 2009 ACADEMY PUBLISHER

    Design-for-delay-testability techniques for high-speed digital circuits

    The importance of delay faults is heightened by the ever-increasing clock rates and decreasing geometry sizes of today's circuits. This thesis focuses on the development of Design-for-Delay-Testability (DfDT) techniques for high-speed circuits and embedded cores. The rising costs of IC testing, and in particular the costs of Automatic Test Equipment, are major concerns for the semiconductor industry. To reverse the trend of rising testing costs, DfDT is becoming increasingly important.

    Novel Multicarrier Memory Channel Architecture Using Microwave Interconnects: Alleviating the Memory Wall

    The increase in computing power has simultaneously increased the demand for input/output (I/O) bandwidth. Unfortunately, the speed of I/O and memory interconnects has not kept pace. Thus, processor-based systems are I/O and interconnect limited. The aggregated memory bandwidth is not scaling fast enough to keep up with increasing bandwidth demands. The term "memory wall" has been coined to describe this phenomenon. A new memory bus concept that has the potential to push double data rate (DDR) memory speed to 30 Gbit/s is presented. We propose to map the conventional DDR bus to a microwave link using a multicarrier frequency division multiplexing scheme. The memory bus is formed using a microwave signal carried within a waveguide. We call this approach multicarrier memory channel architecture (MCMCA). In MCMCA, each memory signal is modulated onto an RF carrier using 64-QAM format or higher. The carriers are then routed using substrate integrated waveguide (SIW) interconnects. At the receiver, the memory signals are demodulated and then delivered to SDRAM devices. We pioneered the usage of SIW as memory channel interconnects and demonstrated that it alleviates the memory bandwidth bottleneck. We demonstrated the superiority of SIW over conventional transmission lines in immunity to cross-talk and electromagnetic interference. We developed a methodology based on design of experiment (DOE) and response surface method techniques that optimizes the design of SIW interconnects and minimizes their performance fluctuations under material and manufacturing variations. Along with using SIW, we implemented a multicarrier architecture which enabled the aggregated DDR bandwidth to reach 30 Gbit/s. We developed an end-to-end system model in Simulink and demonstrated the MCMCA performance for an ultra-high-throughput memory channel. Experimental characterization of the new channel shows that, by using judicious frequency division multiplexing, as few as one SIW interconnect is sufficient to transmit the 64 DDR bits. The overall aggregated bus data rate achieves 240 GBytes/s data transfer with EVM not exceeding 2.26% and a phase error of 1.07 degrees or less.
    Dissertation/Thesis. Doctoral Dissertation. Electrical Engineering 201
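The relation between the 30 Gbit/s figure and the 240 GBytes/s aggregate can be checked with simple arithmetic. The following is a minimal sketch, assuming the 30 Gbit/s rate applies to each of the 64 DDR data bits (an interpretation that reconciles the two numbers; the abstract does not state it explicitly). The name `aggregate_gbytes_per_s` is illustrative.

```python
def aggregate_gbytes_per_s(num_bits, gbit_per_s_per_bit):
    """Aggregate bus throughput in GBytes/s for parallel data bits."""
    total_gbit_per_s = num_bits * gbit_per_s_per_bit
    return total_gbit_per_s / 8  # 8 bits per byte


# Assumed reading: 64 DDR bits, each carried at 30 Gbit/s.
print(aggregate_gbytes_per_s(64, 30))  # 240.0
```

Under that assumption, 64 bits at 30 Gbit/s each give 1920 Gbit/s, which is the 240 GBytes/s quoted above.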

    Timing Closure in Chip Design

    Achieving timing closure is a major challenge in the physical design of a computer chip. Its task is to find a physical realization fulfilling the speed specifications. In this thesis, we propose new algorithms for the key tasks of performance optimization, namely repeater tree construction, circuit sizing, clock skew scheduling, threshold voltage optimization and plane assignment. Furthermore, a new program flow for timing closure is developed that integrates these algorithms with placement and clocktree construction. For repeater tree construction, a new algorithm for computing topologies, which are later filled with repeaters, is presented. To this end, we propose a new delay model for topologies that accounts not only for the path lengths, as existing approaches do, but also for the number of bifurcations on a path, which introduce extra capacitance and thereby delay. In the extreme cases of pure power optimization and pure delay optimization, the optimum topologies regarding our delay model are minimum Steiner trees and alphabetic code trees with the shortest possible path lengths. We present a new, extremely fast algorithm that scales seamlessly between the two opposite objectives. For special cases, we prove the optimality of our algorithm. The efficiency and effectiveness in practice are demonstrated by comprehensive experimental results. The task of circuit sizing is to assign millions of small elementary logic circuits to elements from a discrete set of logically equivalent, predefined physical layouts such that power consumption is minimized and all signal paths are sufficiently fast. In this thesis we develop a fast heuristic approach for global circuit sizing, followed by a local search into a local optimum. Our algorithms use, in contrast to existing approaches, the available discrete layout choices and accurate delay models with slew propagation. The global approach iteratively assigns slew targets to all source pins of the chip and chooses a discrete layout of minimum size preserving the slew targets. In comprehensive experiments on real instances, we demonstrate that the worst path delay is within 7% of its lower bound on average after a few iterations. The subsequent local search reduces this gap to 2% on average. Combining global and local sizing, we are able to size more than 5.7 million circuits within 3 hours. For the clock skew scheduling problem we develop the first algorithm with a strongly polynomial running time for cycle time minimization in the presence of different cycle times and multi-cycle paths. In practice, an iterative local search method is much more efficient. We prove that this iterative method maximizes the worst slack, even when restricting the feasible schedule to certain time intervals. Furthermore, we enhance the iterative local approach to determine a lexicographically optimum slack distribution. The clock skew scheduling problem is then generalized to allow for simultaneous data path optimization. In fact, this is a time-cost tradeoff problem. We develop the first combinatorial algorithm for computing time-cost tradeoff curves in graphs that may contain cycles. Starting from the lowest-cost solution, the algorithm iteratively computes a descent direction by a minimum cost flow computation. The maximum feasible step length is then determined by a minimum ratio cycle computation. This approach can be used in chip design for several optimization tasks, e.g. threshold voltage optimization or plane assignment. Finally, the optimization routines are combined into a timing closure flow. Here, the global placement is alternated with global performance optimization. Netweights are used to penalize the length of critical nets during placement. After the global phase, the performance is improved further by applying more comprehensive optimization routines on the most critical paths. In the end, the clock schedule is optimized and clocktrees are inserted. Computational results of the design flow are obtained on real-world computer chips.
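The step of choosing a discrete layout of minimum size that preserves a slew target can be illustrated with a toy example. This is a sketch under simplifying assumptions, not the thesis's algorithm: the linear slew model, the `Cell` type, the function names, and the library values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Cell:
    name: str
    area: float        # layout area, arbitrary units
    resistance: float  # effective drive resistance, arbitrary units


def output_slew(cell: Cell, load_cap: float) -> float:
    # Hypothetical first-order model: slew grows with drive resistance * load.
    return cell.resistance * load_cap


def smallest_cell_meeting_slew(
    library: Sequence[Cell], load_cap: float, slew_target: float
) -> Optional[Cell]:
    """Return the minimum-area cell whose output slew meets the target."""
    feasible = [c for c in library if output_slew(c, load_cap) <= slew_target]
    return min(feasible, key=lambda c: c.area) if feasible else None


# Hypothetical library of logically equivalent inverters.
LIBRARY = [
    Cell("INV_X1", area=1.0, resistance=8.0),
    Cell("INV_X2", area=2.0, resistance=4.0),
    Cell("INV_X4", area=4.0, resistance=2.0),
]

choice = smallest_cell_meeting_slew(LIBRARY, load_cap=10.0, slew_target=50.0)
print(choice.name if choice else "no feasible cell")
```

Here the smallest cell misses the target (slew 80 against a target of 50), so the next size up, `INV_X2`, is selected. The abstract describes the slew targets themselves as being reassigned iteratively; this sketch covers only the selection step for one fixed target.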

    Multirate control with incomplete information over Profibus-DP network

    This is an Accepted Manuscript of an article published by Taylor & Francis in International Journal of Systems Science in 2014, available online: http://www.tandfonline.com/10.1080/00207721.2013.844286
    When a process field bus-decentralized peripherals (Profibus-DP) network is used in an industrial environment, a deterministic behaviour is usually claimed. However, due to some concerns such as bandwidth limitations, lack of synchronisation among different clocks and existence of time-varying delays, a more complex problem must be faced. This problem implies the transmission of irregular and even random sequences of incomplete information. The main consequence of this issue is the appearance of different sampling periods at different network devices. In this paper, this aspect is checked by means of a detailed Profibus-DP timescale study. In addition, in order to deal with the different periods, a delay-dependent dual-rate proportional-integral-derivative control is introduced. Stability for the proposed control system is analysed in terms of linear matrix inequalities.
    The authors are grateful for the financial support of the Spanish Ministry of Economy and Competitivity [Research Grant TEC2012-31506].
    Salt Llobregat, JJ.; Casanova Calvo, V.; Cuenca Lacruz, ÁM.; Pizá Fernández, R. (2014). Multirate control with incomplete information over Profibus-DP network. International Journal of Systems Science. 45(7):1589-1605. https://doi.org/10.1080/00207721.2013.844286