7,444 research outputs found

    High-Performance low-vcc in-order core

    Power density grows in new technology nodes, thus requiring Vcc to scale, especially in mobile platforms where energy is critical. This paper presents a novel approach to decrease Vcc while keeping operating frequency high. Our mechanism is referred to as immediate read after write (IRAW) avoidance. We propose an implementation of the mechanism for an Intel® Silverthorne™ in-order core. Furthermore, we show that our mechanism can be adapted dynamically to provide the highest performance and lowest energy-delay product (EDP) at each Vcc level. Results show that IRAW avoidance increases operating frequency by 57% at 500mV and 99% at 400mV with negligible area and power overhead (below 1%), which translates into large speedups (48% at 500mV and 90% at 400mV) and EDP reductions (0.61 EDP at 500mV and 0.33 at 400mV). Peer Reviewed. Postprint (published version).
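
The relation between the speedup and EDP figures quoted in this abstract can be sketched with a little arithmetic. The sketch assumes that the EDP values (0.61 and 0.33) are normalized to the baseline design and that delay scales as 1/speedup; neither assumption is stated in the abstract, and the function names are hypothetical.

```python
def relative_edp(relative_energy: float, speedup: float) -> float:
    """EDP relative to baseline: energy ratio times delay ratio (1 / speedup)."""
    return relative_energy / speedup


def implied_relative_energy(relative_edp_value: float, speedup: float) -> float:
    """Invert the relation above to get the energy ratio implied by an EDP ratio."""
    return relative_edp_value * speedup


if __name__ == "__main__":
    # (Vcc in mV, speedup, EDP) as quoted in the abstract; EDP is ASSUMED relative.
    for vcc_mv, speedup, edp in [(500, 1.48, 0.61), (400, 1.90, 0.33)]:
        energy = implied_relative_energy(edp, speedup)
        print(f"{vcc_mv} mV: implied relative energy ~ {energy:.2f}")
```

Under these assumptions, most of the EDP improvement at 500mV would come from the shorter delay rather than from lower energy per task.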

    Feasibility, Architecture and Cost Considerations of Using TVWS for Rural Internet Access in 5G

    Cellular technology is mostly an urban technology that has been unable to serve rural areas well. This is because traditional cellular models are not economical for areas with low user density and lower revenues. In 5G cellular networks, the coverage dilemma is likely to remain the same, thus widening the rural-urban digital divide further. It is time to identify the root cause that has hindered rural technology growth and to analyse the possible options in 5G architecture to address this issue. We advocate that this can only be accomplished in two phases, by addressing economic viability first and performance progression second. We discuss how various works in the literature focus on the latter stage of this ‘two-phase’ problem and are not feasible to implement in the first place. We propose the concept of TV band white space (TVWS) dovetailed with 5G infrastructure for rural coverage and show that it can yield cost-effectiveness from a service provider’s perspective.

    A framework for FPGA functional units in high performance computing

    FPGAs make it practical to speed up a program by defining hardware functional units that perform calculations faster than can be achieved in software. Specialised digital circuits avoid the overhead of executing sequences of instructions, and they make available the massive parallelism of the components. The FPGA operates as a coprocessor controlled by a conventional computer. An application that combines software with hardware in this way needs an interface between a communications port to the processor and the signals connected to the functional units. We present a framework that supports the design of such systems. The framework consists of a generic controller circuit defined in VHDL that can be configured by the user according to the needs of the functional units and the I/O channel. The controller contains a register file and a pipelined programmable register transfer machine, and it supports the design of both stateless and stateful functional units. Two examples are described: the implementation of a set of basic stateless arithmetic functional units, and the implementation of a stateful algorithm that exploits circuit parallelism.

    Hardware support for Local Memory Transactions on GPU Architectures

    Graphics Processing Units (GPUs) are popular hardware accelerators for data-parallel applications, enabling the execution of thousands of threads in a Single Instruction - Multiple Thread (SIMT) fashion. However, the SIMT execution model is not efficient when code includes critical sections to protect access to data shared by the running threads. In addition, GPUs offer two shared spaces to the threads: local memory and global memory. Typical solutions to thread synchronization include the use of atomics to implement locks, the serialization of the execution of the critical section, or delegating the execution of the critical section to the host CPU, all of which lead to suboptimal performance. In the multi-core CPU world, transactional memory (TM) was proposed as an alternative to locks to coordinate concurrent threads. Some solutions for GPUs have started to appear in the literature. In contrast to these earlier proposals, our approach is to design hardware support for TM in two levels. The first level is a fast and lightweight solution for coordinating threads that share the local memory, while the second level coordinates threads through the global memory. In this paper we present GPU-LocalTM as a hardware TM (HTM) support for the first level. GPU-LocalTM offers simple conflict detection and version management mechanisms that minimize the hardware resources required for its implementation. For the workloads studied, GPU-LocalTM provides a 1.25-80X speedup over serialized critical sections, while the overhead introduced by transaction management is lower than 20%. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech.

    Deductive Verification of Parallel Programs Using Why3

    The Message Passing Interface specification (MPI) defines a portable message-passing API used to program parallel computers. MPI programs pose a number of challenges concerning correctness: sent and expected values in communications may not match, resulting in incorrect computations possibly leading to crashes; and programs may deadlock, resulting in wasted resources. Existing tools are not completely satisfactory: model-checking does not scale with the number of processes; testing techniques waste resources and are highly dependent on the quality of the test set. As an alternative, we present a prototype for a type-based approach to programming and verifying MPI-like programs against protocols. Protocols are written in a dependent type language designed so as to capture the most common primitives in MPI, incorporating, in addition, a form of primitive recursion and collective choice. Protocols are then translated into Why3, a deductive software verification tool. Source code, in turn, is written in WhyML, the language of the Why3 platform, and checked against the protocol. Programs that pass verification are guaranteed to be communication safe and free from deadlocks. We verified several parallel programs from textbooks using our approach, and report on the outcome. Comment: In Proceedings ICE 2015, arXiv:1508.0459

    Design of a 4.2-5.4 GHz Differential LC VCO using 0.35 µm SiGe BiCMOS Technology

    In this paper, a 4.2-5.4 GHz, Gm LC voltage controlled oscillator (VCO) for the IEEE 802.11a standard is presented. The circuit is designed with the AMS 0.35 µm SiGe BiCMOS process, which includes high-speed SiGe Heterojunction Bipolar Transistors (HBTs). Phase noise is -110.7 dBc/Hz at 1 MHz offset from the 5.4 GHz carrier frequency and -113.5 dBc/Hz from the 4.2 GHz carrier frequency. A linear, 1200 MHz tuning range is obtained utilizing accumulation-mode varactors. Phase noise is relatively low because the design takes advantage of the differential tuning concept. Output power of the fundamental frequency varies between 4.8 dBm and 5.5 dBm depending on the tuning voltage. The circuit draws 2 mA without buffers and 14.5 mA from a 2.5 V supply including buffer circuits, leading to a total power dissipation of 36.25 mW. The circuit occupies an area of 0.6 mm² on Si substrate including RF and DC pads.
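
The power and tuning-range figures in this abstract follow from simple arithmetic, sketched below. The fractional tuning range relative to the band centre is a derived quantity that the abstract does not quote, and the function names are hypothetical.

```python
def power_mw(supply_v: float, current_ma: float) -> float:
    """DC power in milliwatts: volts times milliamps."""
    return supply_v * current_ma


def tuning_range(f_low_ghz: float, f_high_ghz: float) -> tuple[float, float]:
    """Return (absolute span in MHz, span as a fraction of the centre frequency)."""
    span_ghz = f_high_ghz - f_low_ghz
    centre_ghz = (f_high_ghz + f_low_ghz) / 2.0
    return span_ghz * 1000.0, span_ghz / centre_ghz


if __name__ == "__main__":
    # 14.5 mA from a 2.5 V supply, buffers included (values from the abstract).
    print(f"total power: {power_mw(2.5, 14.5):.2f} mW")  # 36.25 mW
    span_mhz, fraction = tuning_range(4.2, 5.4)
    print(f"tuning range: {span_mhz:.0f} MHz ({fraction:.0%} of centre)")
```

The 36.25 mW total is dominated by the buffers: the oscillator core alone draws only 2 mA.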

    Young stellar populations in early-type dwarf galaxies; occurrence, radial extent and scaling relations

    To understand the stellar population content of dwarf early-type galaxies (dEs) and its environmental dependence, we compare the slopes and intrinsic scatter of color-magnitude relations (CMRs) for three nearby clusters, Fornax, Virgo and Coma. Additionally, we present and compare internal color profiles of these galaxies to identify central blue regions with younger stars. We use the imaging of the HST/ACS Fornax cluster in the magnitude range of -18.7 <= M_g' <= -16.0 to derive magnitudes, colors and color profiles, which we compare with literature measurements. Based on analysis of the color profiles, we report a large number of dEs with young stellar populations in their centers in all three clusters. While for Virgo and Coma the number of blue-cored dEs is found to be 85 +/- 2% and 53 +/- 3% respectively, for Fornax we find that all galaxies have a blue core. We show that bluer cores reside in fainter dEs, similar to the trend seen in nucleated dEs. We find no correlation between the luminosity of the galaxy and the size of its blue core. Moreover, a comparison of the CMRs of the three clusters shows that the scatter in Virgo's CMR is considerably larger than in the Fornax and Coma clusters. Using adaptive smoothing, we show that the galaxies on the blue side of the CMR often show evidence for dust extinction, which strengthens the interpretation that the bluer colors are due to young stellar populations. We also find that outliers on the red side of the CMR are more compact than expected for their luminosity. We find several of these red outliers in Virgo, often close to more massive galaxies. No red outlying compact early-types are found in Fornax and Coma in this magnitude range, while we find three in the Virgo cluster. We suggest that the large number of outliers and larger scatter found for the Virgo cluster CMR is a result of Virgo's different assembly history. Comment: 24 pages, accepted for publication in Astronomy and Astrophysics

    The Next Generation Virgo Cluster Survey XVI. The Angular Momentum of Dwarf Early-Type Galaxies from Globular Cluster Satellites

    We analyze the kinematics of six Virgo cluster dwarf early-type galaxies (dEs) from their globular cluster (GC) systems. We present new Keck/DEIMOS spectroscopy for three of them and reanalyze the data found in the literature for the remaining three. We use two independent methods to estimate the rotation amplitude (Vmax) and velocity dispersion (sigma_GC) of the GC systems and evaluate their statistical significance by simulating non-rotating GC systems with the same number of GC satellites and velocity uncertainties. Our measured kinematics agree with the published values for the three galaxies from the literature and, in all cases, some rotation is measured. However, our simulations show that the null hypothesis of being non-rotating GC systems cannot be ruled out. In the case of VCC1861, the measured Vmax and the simulations indicate that it is not rotating. In the case of VCC1528, the null hypothesis can be marginally ruled out; thus, it might be rotating, although further confirmation is needed. In our analysis, we find that, in general, the measured Vmax tends to be overestimated and the measured sigma_GC tends to be underestimated by amounts that depend on the intrinsic Vmax/sigma_GC, the number of observed GCs (N_GC), and the velocity uncertainties. The bias is negligible when N_GC>~20. In those cases where a large N_GC is not available, it is imperative to obtain data with small velocity uncertainties. For instance, errors of <2km/s lead to Vmax<10km/s for a system that is intrinsically not rotating. Comment: ApJ in press. 20 pages, 17 figures, 5 tables
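
The idea of simulating non-rotating GC systems to see how much spurious rotation a fit returns can be illustrated with a small Monte Carlo sketch. This is not the authors' procedure: the abstract does not specify their two methods, so the sketch assumes a plain least-squares sinusoid fit, Gaussian dispersion and measurement errors, and uniformly random position angles. All names and parameter values are hypothetical.

```python
import math
import random


def fit_rotation_amplitude(angles, velocities):
    """Least-squares fit of v = a*cos(theta) + b*sin(theta); returns sqrt(a^2 + b^2)."""
    scc = sum(math.cos(t) ** 2 for t in angles)
    sss = sum(math.sin(t) ** 2 for t in angles)
    scs = sum(math.cos(t) * math.sin(t) for t in angles)
    svc = sum(v * math.cos(t) for t, v in zip(angles, velocities))
    svs = sum(v * math.sin(t) for t, v in zip(angles, velocities))
    det = scc * sss - scs ** 2  # determinant of the 2x2 normal equations
    a = (svc * sss - svs * scs) / det
    b = (svs * scc - svc * scs) / det
    return math.hypot(a, b)


def mean_spurious_vmax(n_gc, sigma, v_err, trials=500, seed=0):
    """Average fitted amplitude for systems with NO intrinsic rotation."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        angles = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_gc)]
        # Velocity is pure dispersion plus measurement noise; there is no rotation term.
        vels = [rng.gauss(0.0, sigma) + rng.gauss(0.0, v_err) for _ in range(n_gc)]
        total += fit_rotation_amplitude(angles, vels)
    return total / trials


if __name__ == "__main__":
    for n_gc in (5, 10, 20, 40):
        bias = mean_spurious_vmax(n_gc, sigma=30.0, v_err=5.0)
        print(f"N_GC={n_gc:2d}: mean fitted Vmax for a non-rotating system = {bias:.1f} km/s")
```

Because the fitted amplitude is non-negative by construction, noise alone yields a positive Vmax, and the spurious amplitude shrinks as the number of tracers grows.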

    The Birth of German Biotechnology Industry – Did Venture Capital run the show?

    We ask how many firms in the modern German biotechnology industry are funded by venture capital companies (VCC) and how many are equity funded by corporate investors. Theory suggests a high relevance of VCC as venturing partners for high-tech projects. In addition, we argue that corporate investors are venturing partners of firms with high-risk projects to a lesser extent. Incumbents, however, face some opportunities in the low-risk area of the biotechnology industry to secure an optimal supply for the current product pipeline. Our empirical results emphasize the crucial importance of venture capital as a financial resource for high-risk projects: whereas 42 percent of all healthcare developers in the early stage are venture-backed firms, only a small share of low-risk projects received venture capital. The results for corporate investors are reversed: compared to VCC, corporate investors equity finance fewer high-tech projects and more low-risk projects. The econometric analysis suggests that the observed pattern is mainly driven by the level of project risk and hence supports all our hypotheses. Keywords: Biotechnology, Start-ups, Venture Capital, Discrete Choice