1,949 research outputs found

    Energy Aware Runtime Systems for Elastic Stream Processing Platforms

    Following steady growth in the computational performance required of processors, the multicore revolution started around 20 years ago. This revolution was mainly a response to power dissipation constraints that restricted further increases of clock frequency in single-core processors. The multicore revolution brought not only the challenge of parallel programming, i.e. developing software that exploits the full capabilities of manycore architectures, but also the challenge of programming heterogeneous platforms. The question "on which processing element should a specific computational unit be mapped?" is well known in the embedded community. With the introduction of general-purpose graphics processing units (GPGPUs) and digital signal processors (DSPs) alongside many-core processors on different system-on-chip platforms, heterogeneous parallel platforms are now widespread across several domains, from consumer devices to media processing platforms for telecom operators. Finding a mapping together with a suitable hardware architecture is a process called design-space exploration. This process is very challenging for heterogeneous many-core architectures, which promise benefits in energy efficiency. The main problem is the exponential growth of the design space to be explored. With the recent trend toward increasing heterogeneity on the chip, selecting the parameters to take into account when mapping software to hardware is still an open research topic in the embedded area. For example, the current Linux scheduler performs poorly when mapping tasks to the computing elements available in hardware. The only metric it considers is CPU workload, which, as recent work has shown, does not match the true performance demands of applications. Relying on this metric may produce an incorrect allocation of resources and thus waste energy.
This research originates from the observation that these approaches do not fully support the dynamic behavior of stream processing applications, especially when that behavior is established only at runtime. The research contributes to the general goal of developing energy-efficient solutions for designing streaming applications on heterogeneous and parallel hardware platforms. Streaming applications are now widespread in the software domain. Their distinctive characteristic is that they retrieve multiple streams of data and need to process them in real time. The proposed work will develop new approaches to the challenging problem of efficient runtime coordination of dynamic applications, focusing on energy and performance management.

    Single-photon detectors integrated in quantum photonic circuits

    Toward photonic circuits for quantum computers

    Improving the Readout of Semiconducting Qubits

    Semiconducting qubits are a promising platform for quantum computers. In particular, silicon spin qubits have recently made a number of advances, including long coherence times, high-fidelity single-qubit gates, two-qubit gates, and high-fidelity readout. However, all operations likely require improvement in fidelity and, if possible, speed to realize a quantum computer. Readout fidelity and speed are, in general, limited by circuit challenges centered on extracting a small signal from a device in a dilution refrigerator that is connected to room-temperature amplifiers by long coaxial cables with relatively high capacitance. Readout fidelity specifically is limited by the time it takes to reliably distinguish qubit states relative to the characteristic decay time of the excited state, T1. This dissertation explores the use of heterojunction bipolar transistor (HBT) circuits to amplify the readout signal of silicon spin qubits at cryogenic temperatures. The cryogenic amplification approach has numerous advantages, including low implementation overhead, low power relative to the available cooling power, and high signal gain at the mixing chamber stage, leading to around a factor of ten speedup in readout time for a similar signal-to-noise ratio. The faster readout generally increases fidelity, since the readout time is then much shorter than T1. Two HBT amplification circuits have been designed and characterized. One design is a low-power, base-current-biased configuration with non-linear gain (CB-HBT), and the second is a linear-gain, AC-coupled configuration (AC-HBT). They can operate at powers of 1 and 10 μW, respectively, without significantly heating electrons. The input-referred noise spectral density of both circuits is around 15 to 30 fA/√Hz, which is low compared to previous cases such as the dual-stage, AC-coupled HEMT circuit at ~70 fA/√Hz. Both circuits achieve charge sensitivity between 300 and 400 μe/√Hz, which approaches the best alternatives (e.g., RF-SET at ~140 μe/√Hz) but with much less implementation overhead. For the single-shot latched charge readout performed, both circuits achieve high-fidelity readout in times < 10 μs with bit error rates < 10⁻³, a great improvement over previous work at > 70 μs. The readout speed-up in principle also reduces the errors due to excited-state relaxation by a factor of ~10. All of these results are possible with relatively simple, low-power transistor circuits that can be mounted close to the qubit device at the mixing chamber stage of the dilution refrigerator.
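The link between readout time and relaxation error can be illustrated with a minimal sketch. It assumes a simple exponential decay model, in which the probability that the excited state relaxes during a readout window of length t is 1 - exp(-t/T1). The abstract does not state a T1 value, so `T1_ASSUMED_S` below is a hypothetical placeholder, and the two readout times are the bounds quoted in the abstract (< 10 μs and > 70 μs), not measured points.

```python
import math

def relaxation_error(t_read_s: float, t1_s: float) -> float:
    """Probability that the excited state decays during the readout window,
    under a plain exponential-decay model (an assumption of this sketch)."""
    # expm1 keeps precision when t_read_s is much smaller than t1_s
    return -math.expm1(-t_read_s / t1_s)

# Hypothetical T1, chosen only for illustration; not taken from the abstract.
T1_ASSUMED_S = 10e-3

slow_error = relaxation_error(70e-6, T1_ASSUMED_S)  # earlier work: > 70 us
fast_error = relaxation_error(10e-6, T1_ASSUMED_S)  # HBT circuits: < 10 us
error_ratio = slow_error / fast_error

print(f"relaxation error at 70 us: {slow_error:.2e}")
print(f"relaxation error at 10 us: {fast_error:.2e}")
print(f"reduction factor: {error_ratio:.2f}")
```

When the readout time is much shorter than T1, the error is nearly proportional to the readout time, so the reduction factor tracks the ratio of the two times regardless of the T1 chosen. With the two bounds used here it comes out near 7, which is a lower estimate consistent with the factor of ~10 quoted in the abstract.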

    Investigations of low-dimensional emitter system by dynamic strain platform

    Two-dimensional transition metal dichalcogenides (2D TMDs) and zero-dimensional quantum dots (QDs) are among the most representative low-dimensional emitter systems, with one or three dimensions on the nanoscale. Both show potential for (quantum) optical applications. Analogous to electric and magnetic fields, strain is a powerful probe of the physics of emitter systems. The reduced dimensionality makes strain tuning well suited both to deepening the understanding of these systems and to tuning their properties. Previous research demonstrates that strain can change the distance between particles and/or the symmetry. Building on this, we conduct several investigations. First, we measure the responses of monolayer WSe2 to biaxial in-plane strain. Generally, the helicities of excitons and trions are all related to the scattering process. In our observations, the decreases of exciton circular helicity in WSe2 and MoSe2 are associated with their e-h exchange interactions. The helicity of the trion in MoSe2 is almost intact, and a phenomenological rate equation model is developed to describe the decrease of trions in WSe2, which agrees well with our observations. Our findings provide a new strategy to tune the read-in/read-out in TMD-based memory devices. Second, we focus on the responses of WSe2 to uniaxial strain. We identify fine structures of the neutral exciton in polarization-dependent photoluminescence spectroscopy. The nonlinear evolution of amplitude and phase under an active uniaxial strain is interpreted in terms of the interaction of the wavefunction with strain. Although these two bulk strain-tuning platforms hold potential for sophisticated emitter systems, a more versatile strain-tuning platform is needed. In the last section of this work, a 2-leg MEMS strain-tuning platform is fabricated and then integrated with a QD-embedded membrane. We resolve the position-dependent anisotropic strain on the strain-tuning platform and compare the opposite responses of positive and negative trions to the same strain. Our observations agree well with previous pseudo-potential/configuration interaction calculations. Notably, the 2-leg strain platform is also applicable to 2D TMDs. These findings are useful steps toward a deeper understanding of low-dimensional emitter systems. In ongoing work, we have built a prototype of a more versatile strain-tuning platform. We envision that this platform can add a degree of freedom to integrated photonic circuits.

    Polarization-preserving quantum frequency conversion for trapped-atom based quantum networks

    The scope of this thesis is the development of efficient and low-background polarization-preserving quantum frequency converters (PPQFC) and their integration into trapped-atom based quantum network nodes to demonstrate building blocks of a quantum network (QN). We constructed four PPQFC devices to transduce the emission wavelengths of single trapped ⁴⁰Ca⁺ ions at 854 nm and neutral ⁸⁷Rb atoms at 780 nm to the low-loss telecom bands between 1260 nm and 1625 nm. During the conversion process, the quantum information encoded in the photon polarization has to be preserved. To this end, we rely on difference frequency generation in ridge waveguides, which are inserted into polarization interferometers arranged in Sagnac- or Mach-Zehnder-type configurations. For the conversion of single and entangled photons we achieved external device efficiencies between 26.5 % and 57.4 %, low background levels that allow for signal-to-background ratios above 20, and process fidelities > 99.5 %. Employing the PPQFC devices, we were able to demonstrate several key elements of long-distance QNs: photon-photon entanglement over 40 km of fiber via 2-step QFC with a fidelity of 98.9 %, ion-telecom-photon entanglement with high fidelities up to 97.8 %, an atom-to-telecom-photon state transfer, and the distribution of atom-photon entanglement over 20 km of fiber with a fidelity of 78.9 %. These results hold great promise to extend small QNs with ≥ 2 nodes to a metropolitan scale.
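The motivation for converting to the telecom bands can be made concrete with a short worked calculation of fiber transmission, T = 10^(-αL/10), over the 20 km distance mentioned in the abstract. The attenuation coefficients below are assumptions of this sketch (rough, typical values for silica fiber), not figures from the thesis, and the names are hypothetical.

```python
def fiber_transmission(attenuation_db_per_km: float, length_km: float) -> float:
    """Fraction of optical power transmitted through a fiber of given length."""
    loss_db = attenuation_db_per_km * length_km
    return 10.0 ** (-loss_db / 10.0)

# Assumed, typical attenuation values; not taken from the abstract.
ALPHA_780NM_DB_PER_KM = 3.5      # near-infrared emission of Rb, assumed
ALPHA_TELECOM_DB_PER_KM = 0.2    # telecom C-band, assumed

LENGTH_KM = 20.0  # fiber length quoted in the abstract

t_native = fiber_transmission(ALPHA_780NM_DB_PER_KM, LENGTH_KM)
t_telecom = fiber_transmission(ALPHA_TELECOM_DB_PER_KM, LENGTH_KM)

print(f"transmission at 780 nm over {LENGTH_KM:.0f} km: {t_native:.1e}")
print(f"transmission in telecom band over {LENGTH_KM:.0f} km: {t_telecom:.3f}")
```

Under these assumed coefficients, the native wavelength is attenuated by about 70 dB over 20 km while the telecom band loses about 4 dB, so even a converter with an external efficiency in the quoted range of 26.5 % to 57.4 % yields a far higher end-to-end photon rate.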

    Power-Performance Modeling and Adaptive Management of Heterogeneous Mobile Platforms

    Nearly 60% of the world population uses a mobile phone, which is typically powered by a system-on-chip (SoC). While mobile platform capabilities range widely, responsiveness, long battery life and reliability are common design concerns that are crucial to remaining competitive. Consequently, state-of-the-art mobile platforms have become highly heterogeneous, combining a powerful SoC with numerous other resources, including display, memory, power management IC, battery and wireless modems. Furthermore, the SoC itself is a heterogeneous resource that integrates many processing elements (PEs), such as CPU cores, GPU, video, image, and audio processors. As a result, CPU cores do not dominate the platform power consumption under many application scenarios. Competitive performance requires a higher operating frequency, which leads to larger power consumption. In turn, power consumption increases the junction and skin temperatures, which have adverse effects on device reliability and user experience. Allocating the power budget among the major platform resources and controlling temperature have therefore become fundamental considerations for mobile platforms. Dynamic thermal and power management algorithms address this problem by putting a subset of the processing elements or shared resources into sleep states, or by throttling their frequencies. However, an ad hoc approach could easily cripple performance if it slows down the performance-critical processing element. Furthermore, mobile platforms run a wide range of applications with time-varying workload characteristics, unlike early generations, which supported only limited functionality. As a result, there is a need for adaptive power and performance management approaches that consider the platform as a whole rather than focusing on a subset. Toward this need, our specific contributions include (a) a framework to dynamically select the Pareto-optimal frequency and active cores for heterogeneous CPUs, such as the ARM big.LITTLE architecture, (b) a dynamic power budgeting approach for allocating optimal power consumption to the CPU and GPU using performance sensitivity models for each PE, (c) an adaptive GPU frame time sensitivity prediction model to aid power management algorithms, and (d) an online learning algorithm that constructs adaptive run-time models for non-stationary workloads.
    Dissertation/Thesis. Doctoral Dissertation, Electrical Engineering, 201
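Contribution (a) refers to choosing Pareto-optimal combinations of frequency and active cores. The sketch below illustrates the general idea of a Pareto front over power and performance only; it is not the dissertation's framework. The `Config` type, the candidate table and all numbers in it are hypothetical and exist purely for illustration.

```python
from typing import NamedTuple, Optional

class Config(NamedTuple):
    freq_mhz: int     # operating frequency
    cores: int        # number of active cores
    power_w: float    # power consumption (hypothetical value)
    perf: float       # relative performance (hypothetical value)

def pareto_front(configs: list[Config]) -> list[Config]:
    """Keep configurations that no other configuration dominates, i.e. none
    has power <= and performance >= with at least one strict inequality."""
    front = []
    for c in configs:
        dominated = any(
            o.power_w <= c.power_w and o.perf >= c.perf
            and (o.power_w < c.power_w or o.perf > c.perf)
            for o in configs
        )
        if not dominated:
            front.append(c)
    return sorted(front, key=lambda c: c.power_w)

def best_under_budget(front: list[Config], budget_w: float) -> Optional[Config]:
    """Highest-performance Pareto point that fits within the power budget."""
    feasible = [c for c in front if c.power_w <= budget_w]
    return max(feasible, key=lambda c: c.perf) if feasible else None

# Hypothetical candidate table; the values are invented for this example.
CANDIDATES = [
    Config(600, 1, 0.3, 1.0),
    Config(600, 4, 0.9, 3.2),
    Config(1200, 2, 1.0, 3.0),   # dominated by (600 MHz, 4 cores)
    Config(1200, 4, 1.8, 5.5),
    Config(1800, 2, 2.0, 4.2),   # dominated by (1200 MHz, 4 cores)
    Config(1800, 4, 3.5, 7.0),
]

FRONT = pareto_front(CANDIDATES)
CHOICE = best_under_budget(FRONT, budget_w=2.0)
print([(c.freq_mhz, c.cores) for c in FRONT])
print(CHOICE)
```

Restricting a runtime governor to the Pareto front shrinks the search to configurations that are never strictly worse than an alternative, so a changing power budget only moves the selection along the front.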

    Tunable Silicon integrated photonics based on functional materials

    This thesis is concerned with the design, fabrication, testing and development of tunable silicon photonic integrated circuits based on functional materials. This tunability is achieved by integrating liquid crystals, 2D materials and chalcogenide phase-change materials with silicon and silicon nitride integrated circuits. Switching the functional materials between their various states results in dramatic changes in the optical properties, with consequent changes in the optical response of the individual devices. Furthermore, such changes are volatile or non-volatile depending on the materials.
    Engineering and Physical Sciences Research Council (EPSRC)

    Laser-assisted selective tuning of silicon nanophotonic structures

    Silicon photonic integrated circuits have advanced to the point where thousands of components can now be combined into functioning optical circuits. A variety of quantum technologies are based upon the integrated silicon photonics platform, including pure photonics approaches as well as those based upon emerging silicon spin-photon interfaces. Integrated photonic components, such as grating couplers, photonic crystal cavities, and waveguides, are subject to slight manufacturing variations. For quantum technology applications, such variations often need to be minimized and ideally eliminated through careful post-processing. The laser-assisted "spot oxidation" post-processing technique is able to locally and permanently shift the resonance wavelength of nanophotonic devices using a 532 nm continuous-wave laser. While global tuning techniques affect entire chips, spot oxidation is of interest because it can locally correct for specific manufacturing variations among many components within a single chip in an automated way. Yet prior to this work it was unclear whether spot oxidation could be made compatible with photonic structures with SiO2 top-cladding, which are more robust and attractive for commercial deployment. Here, we apply laser-assisted tuning to silicon-on-insulator (SOI) devices with SiO2 top-cladding in the telecommunication O-band. We successfully tune both photonic crystal nanobeam cavities and sub-wavelength grating couplers by up to 1.04(5) nm and 9(1) nm, respectively. This will enable higher-yield photonic circuits and allow us to permanently and locally tune optical structures into resonance with optically active colour centres in silicon.