Hardware cryptographic support of IBM z Systems for OpenSSH in RHEL 7.2 and SLES 12 SP1
Abstract This article summarizes our experiences with the configuration and usage of OpenSSH using the hardware cryptographic support of IBM z Systems. We report our findings in the areas of performance and throughput improvement. Our positive experience indicates that you should make use of this capability when using OpenSSH.
Active timing margin management to improve microprocessor power efficiency
Improving power/performance efficiency is critical for today's microprocessors. From edge devices to datacenters, lower power or higher performance always produces better systems, measured by lower cost of ownership or longer battery life. This thesis studies improving microprocessor power/performance efficiency by optimizing the pipeline timing margin. In particular, it focuses on improving the efficacy of Active Timing Margin, a young technology that dynamically adjusts the margin.
Active timing margin trims down the pipeline timing margin with a control loop that adjusts voltage and frequency based on real-time chip environment monitoring. The key insight of this thesis is that to maximize active timing margin's efficiency benefits, synergistic management from processor architecture design and system software scheduling is needed. To that end, this thesis covers the major consumers of pipeline timing margin: temperature, voltage, and process variation. For temperature variation, the thesis proposes a table-lookup-based active timing margin mechanism and an associated temperature management scheme to minimize power consumption. For voltage variation, the thesis characterizes the limiting factors of adaptive clocking's power savings and proposes application scheduling to maximize total system power reduction. For process variation, the thesis proposes core-level adaptive clocking reconfiguration to automatically expose inter-core variation and discusses workload scheduling and throttling management to control critical application performance.
The author believes the optimizations presented in this thesis can potentially benefit a variety of processor architectures, as the conclusions are based on solid measurements of state-of-the-art processors, and the research objective, active timing margin, already has wide applicability in the latest microprocessors at the time of writing.
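The table-lookup mechanism described above can be sketched as follows. This is a minimal illustration only: the temperature bins and frequency offsets are hypothetical values, whereas a real controller would calibrate per-chip settings against on-die sensors and critical-path monitors.

```python
# Minimal sketch of a table-lookup active timing margin controller:
# cooler silicon needs less guardband, so more frequency can be
# reclaimed from the static margin. All numbers are hypothetical.
MARGIN_TABLE = [
    (40, 200),  # up to 40 C: reclaim 200 MHz of margin
    (60, 120),  # up to 60 C: reclaim 120 MHz
    (80, 50),   # up to 80 C: reclaim 50 MHz
]

def margin_boost(temp_c):
    """Return the frequency boost (MHz) for the current die temperature."""
    for max_temp, boost_mhz in MARGIN_TABLE:
        if temp_c <= max_temp:
            return boost_mhz
    return 0  # hot silicon: keep the full static margin

print(margin_boost(35))  # cool die: largest boost
print(margin_boost(95))  # hot die: no boost
```

The table lookup replaces a closed-form voltage/frequency model, which is what makes the scheme cheap enough to run continuously in firmware.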
Physics-based equivalent circuit model extraction for system level PDN and a novel PDN impedance measurement method
“The power distribution network (PDN) plays an important role in the power supply system, especially as the operating frequency of integrated circuits (ICs) increases. A physics-based circuit modeling methodology is proposed in the first section. The circuit model is extracted by following the current path in the system PDN, and the related parameters are calculated based on the cavity model and plane-pair PEEC methods. By extracting the equivalent circuit model, the PDN system is transformed into an RLC-element-based circuit. The role of each part of the system can then be easily explained, and the system behavior can be changed by modifying the dominant part accordingly. This methodology contributes to system-level PDN troubleshooting and layout design optimization.
Compared with analytical methodologies, a measurement result is more solid and convincing. What makes the PDN special is that its impedance can be as low as a few milliohms and varies with frequency, so accurate impedance measurement is challenging. Based on these requirements, a novel PDN low-impedance measurement methodology is proposed, and a probe based on the I-V method is designed to support it, providing a new and practical approach to PDN impedance measurement with easy landing, a simple setup, lower-frequency coverage, and less dependence on instrument quality. This probe works over a wide frequency range with a relatively sufficient dynamic range”--Abstract, page iii
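The RLC-equivalent view of a PDN can be illustrated with a single decoupling-capacitor branch: a series R-L-C whose impedance collapses to its few-milliohm ESR at series resonance, which is exactly the regime that makes the measurement hard. The component values below are hypothetical, chosen only to show the milliohm-level dip.

```python
# Illustrative sketch: |Z(f)| of one series R-L-C branch (a decoupling
# capacitor) in a PDN equivalent-circuit model. Values are hypothetical.
import math

def decap_impedance(f, r=0.005, l=1e-9, c=100e-6):
    """Impedance magnitude (ohms) of a series R-L-C branch at frequency f (Hz)."""
    w = 2 * math.pi * f
    x = w * l - 1 / (w * c)       # net reactance: inductive minus capacitive
    return math.sqrt(r * r + x * x)

# At series resonance f0 = 1 / (2*pi*sqrt(L*C)) the reactances cancel
# and |Z| drops to the ESR (here 5 milliohms).
f0 = 1 / (2 * math.pi * math.sqrt(1e-9 * 100e-6))
print(f"resonance at {f0 / 1e3:.0f} kHz, |Z| = {decap_impedance(f0) * 1e3:.1f} mOhm")
```

Above resonance the branch turns inductive and |Z| rises again, which is why system-level models combine many branches whose dips cover different frequency bands.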
Achieving fault tolerance on capped color codes with few ancillas
Attaining fault tolerance while maintaining low overhead is one of the main
challenges in a practical implementation of quantum circuits. One major
technique that can overcome this problem is the flag technique, in which
high-weight errors arising from a few faults can be detected by a few ancillas
and distinguished using subsequent syndrome measurements. The technique can be
further improved using the fact that for some families of codes, errors of any
weight are logically equivalent if they have the same syndrome and weight
parity, as previously shown in [arXiv:2006.03068]. In this work, we develop a
notion of a distinguishable fault set, which captures both the concepts of
flags and weight parities, and extend the use of weight parities in error
correction from
[arXiv:2006.03068] to a family of capped color codes. We also develop
fault-tolerant protocols for error correction, measurement, and state
preparation, which are sufficient for implementing any Clifford operation
fault-tolerantly on a capped color code. Our protocols for capped color codes
of distance 3, 5, and 7 require only 1, 1, and 2 ancillas, respectively. The concept of
distinguishable fault set also leads to a generalization of the definitions of
fault-tolerant gadgets proposed by Aliferis, Gottesman, and Preskill.
Comment: 39 pages, 13 figures, comments welcome. V2 & V3: minor revisions
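The equivalence-up-to-a-logical-operator idea underlying weight parities has a simple classical analogue, sketched below for the 3-bit repetition code (this toy is an analogy only, not the paper's quantum construction, and the weight-parity refinement proper applies to the quantum code families studied there): two error patterns with the same syndrome differ exactly by the logical operator 111.

```python
# Toy classical analogy: in the 3-bit repetition code {000, 111}, two
# error patterns with the same syndrome differ by the logical operator
# 111, i.e., they act identically up to a logical flip.
PARITY_CHECKS = [(0, 1), (1, 2)]  # check x0+x1 and x1+x2 (mod 2)

def syndrome(error):
    return tuple(error[i] ^ error[j] for i, j in PARITY_CHECKS)

e1 = [1, 0, 0]  # single-bit error
e2 = [0, 1, 1]  # its complement: e1 XOR 111
assert syndrome(e1) == syndrome(e2)          # indistinguishable by syndrome
diff = [a ^ b for a, b in zip(e1, e2)]
assert diff == [1, 1, 1]                     # they differ by the logical 111
```

In the quantum setting the extra weight-parity information is what lets a flag-based decoder pick the right coset representative with very few ancillas.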
A cross-stack, network-centric architectural design for next-generation datacenters
This thesis proposes a full-stack, cross-layer datacenter architecture based on the in-network computing and near-memory processing paradigms. The proposed datacenter architecture is built atop two principles: (1) utilizing commodity, off-the-shelf hardware (i.e., processors, DRAM, and network devices) with minimal changes to their architecture, and (2) providing a standard interface for programmers to use the novel hardware. More specifically, the proposed datacenter architecture enables a smart network adapter to collectively compress/decompress data exchanged between distributed DNN training nodes and to assist the operating system in performing aggressive processor power management. It also deploys specialized memory modules in the servers, capable of general-purpose computation and network connectivity.
This thesis unlocks the potential of hardware and operating system co-design in architecting application-transparent, near-data processing hardware for improving a datacenter's performance, energy efficiency, and scalability. We evaluate the proposed datacenter architecture using a combination of full-system simulation, FPGA prototyping, and real-system experiments.
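The compress-before-exchange idea for DNN training traffic can be sketched in a few lines. The sketch below runs in host software purely for illustration, whereas the thesis offloads this work to the smart network adapter; the gradient payload and the choice of zlib are assumptions, not the thesis's design.

```python
# Sketch: compress gradient payloads before they cross the network between
# DNN training nodes (done here in software; the thesis offloads this to a
# smart NIC). Payload and codec choice (zlib) are illustrative assumptions.
import struct
import zlib

def pack_gradients(grads):
    """Serialize a list of float32 gradients and compress the byte stream."""
    raw = struct.pack(f"{len(grads)}f", *grads)
    return zlib.compress(raw)

def unpack_gradients(blob):
    """Decompress and deserialize a gradient payload."""
    raw = zlib.decompress(blob)
    return list(struct.unpack(f"{len(raw) // 4}f", raw))

grads = [0.0] * 1000                 # sparse/repetitive gradients compress well
wire = pack_gradients(grads)
assert unpack_gradients(wire) == grads
assert len(wire) < len(grads) * 4    # fewer bytes cross the network
```

Doing this transparently in the adapter is what keeps the standard programming interface intact: the training framework still sends uncompressed buffers.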
Fault-tolerant quantum computer architectures using hierarchies of quantum error-correcting codes
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references (p. 221-238). Quantum computers have been shown to efficiently solve a class of problems for which no efficient solution is otherwise known. Physical systems can implement quantum computation, but devising realistic schemes is an extremely challenging problem, largely due to the effect of noise. A quantum computer that is capable of correctly solving problems more rapidly than modern digital computers requires some use of so-called fault-tolerant components. Code-based fault tolerance using quantum error-correcting codes is one of the most promising and versatile of the known routes to fault-tolerant quantum computation. This dissertation presents three main new results about code-based fault-tolerant quantum computer architectures. The first result is a large new family of quantum codes that go beyond stabilizer codes, the most well-studied family of quantum codes. Our new family of codeword stabilized codes contains all known codes with optimal parameters. Furthermore, we show how to systematically find, construct, and understand such codes as a pair of codes: an additive quantum code and a classical (nonlinear) code. Second, we resolve an open question about the universality of so-called transversal gates acting on stabilizer codes. Such gates are universal for classical fault-tolerant computation, but they were conjectured to be insufficient for universal fault-tolerant quantum computation. We show that transversal gates have a restricted form and prove that some important families of them cannot be quantum universal. This is strong evidence that so-called quantum software is necessary to achieve universality, and, therefore, that fault-tolerant quantum computer architecture is fundamentally different from classical computer architecture.
Finally, we partition the fault-tolerant design problem into levels of a hierarchy of concatenated codes and present methods, compatible with rigorous threshold theorems, for numerically evaluating these codes. The methods are applied to measure inner error-correcting code performance, as a first step toward elucidation of an effective fault-tolerant quantum computer architecture that uses no more than a physical, inner, and outer level of coding. Of the inner codes, the Golay code gives the highest pseudothreshold of 2 × 10⁻³. A comparison of logical error rate and overhead shows that the Bacon-Shor codes are competitive with Knill's C₄/C₆ scheme at a base error rate of 10⁻⁴. by Andrew W. Cross. Ph.D.
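The level-by-level logic of code concatenation can be illustrated with the textbook idealization in which, below the pseudothreshold, each extra level squares the error suppression: p_{l+1} = p_l² / p_th. The numbers below reuse the abstract's figures (p_th ≈ 2 × 10⁻³ for the Golay inner code, base error rate 10⁻⁴); the quadratic form is the standard simplification, not the thesis's detailed numerics.

```python
# Idealized level reduction for concatenated distance-3 codes:
# p_{l+1} = p_l**2 / p_th, so below the pseudothreshold p_th each
# concatenation level squares the suppression factor.
def logical_error_rate(p_physical, p_threshold, levels):
    """Logical error rate after `levels` of concatenation (idealized model)."""
    p = p_physical
    for _ in range(levels):
        p = p * p / p_threshold
    return p

# Abstract's figures: p_th ~ 2e-3 (Golay pseudothreshold), base rate 1e-4.
print(logical_error_rate(1e-4, 2e-3, 1))  # one inner level
print(logical_error_rate(1e-4, 2e-3, 2))  # inner + outer level
```

At p_physical = p_th the recursion is a fixed point (no suppression), which is exactly what makes the pseudothreshold the figure of merit for comparing inner codes.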