419 research outputs found

    University of Windsor Graduate Calendar 2023 Spring

    Approximate Computing Survey, Part II: Application-Specific & Architectural Approximation Techniques and Applications

    The challenging deployment of compute-intensive applications from domains such as Artificial Intelligence (AI) and Digital Signal Processing (DSP) forces the computing systems community to explore new design approaches. Approximate Computing appears as an emerging solution, allowing designers to tune the quality of results in the design of a system in order to improve energy efficiency and/or performance. This radical paradigm shift has attracted interest from both academia and industry, resulting in significant research on approximation techniques and methodologies at different design layers (from system down to integrated circuits). Motivated by the wide appeal of Approximate Computing over the last 10 years, we conduct a two-part survey to cover key aspects (e.g., terminology and applications) and review the state-of-the-art approximation techniques from all layers of the traditional computing stack. In Part II of our survey, we classify and present the technical details of application-specific and architectural approximation techniques, both of which target the design of resource-efficient processors/accelerators and systems. Moreover, we present a detailed analysis of the application spectrum of Approximate Computing and discuss open challenges and future directions. Comment: Under review at ACM Computing Surveys.

    University of Windsor Graduate Calendar 2023 Winter

    Examining the Relationships Between Distance Education Students’ Self-Efficacy and Their Achievement

    This study aimed to examine the relationships between students’ self-efficacy (SSE) and students’ achievement (SA) in distance education. To gather data, the instruments were administered to 100 undergraduate students at a distance university who are migrant workers in Taiwan, while their SA scores were obtained from the university. Semi-structured interviews with 8 participants consisted of questions that showed the specific conditions of SSE and SA. The findings were as follows. There was a significantly positive correlation between targeted SSE (overall scales and general self-efficacy) and SA. Targeted students’ self-efficacy effectively predicted their achievement, and general self-efficacy had the most significant influence. In the qualitative findings, four themes were extracted for students with lower self-efficacy but higher achievement: physical and emotional condition, teaching and learning strategy, positive social interaction, and intrinsic motivation. Moreover, three themes were extracted for students with moderate or higher self-efficacy but lower achievement: more time for leisure (not hard-working), less social interaction, and external excuses. Providing effective learning environments, social interactions, and teaching and learning strategies is suggested for distance education.

    On designing hardware accelerator-based systems: interfaces, taxes and benefits

    Complementary Metal Oxide Semiconductor (CMOS) technology scaling has slowed down. One promising approach to sustain the historic performance improvement of computing systems is to utilize hardware accelerators. Today, many commercial computing systems integrate one or more accelerators, with each accelerator optimized to efficiently execute specific tasks. Over the years, there has been a substantial amount of research on designing hardware accelerators for machine learning (ML) training and inference tasks. Hardware accelerators are also widely employed to accelerate data privacy and security algorithms. In particular, there is currently growing interest in the use of hardware accelerators for homomorphic encryption (HE) based privacy-preserving computing. While the use of hardware accelerators is promising, a realistic end-to-end evaluation of an accelerator integrated into the full system often reveals that its benefits are not always as expected. Assessing only the performance of the accelerated portion of an application, such as the inference kernel in ML applications, can be misleading. When designing an accelerator-based system, it is critical to evaluate the system as a whole and account for all the accelerator taxes. In the first part of our research, we highlight the need for a holistic, end-to-end analysis of workloads using ML and HE applications. Our evaluation of an ML application for a database management system (DBMS) shows that the benefits of offloading ML inference to accelerators depend on several factors, including backend hardware, model complexity, data size, and the level of integration between the ML inference pipeline and the DBMS. We also found that the end-to-end performance improvement is bottlenecked by data retrieval and pre-processing, as well as inference.
Additionally, our evaluation of an HE video encryption application shows that while HE client-side operations, i.e., message-to-ciphertext and ciphertext-to-message conversion operations, are bottlenecked by number theoretic transform (NTT) operations, accelerating NTT in hardware alone is not sufficient to improve application throughput (frames per second) enough. We need to address all bottlenecks, such as error sampling, encryption, and decryption, in the message-to-ciphertext and ciphertext-to-message conversion pipelines. In the second part of our research, we address the lack of a scalable evaluation infrastructure for building and evaluating accelerator-based systems. To solve this problem, we propose a robust and scalable software-hardware framework for accelerator evaluation, which uses an open-source RISC-V based System-on-Chip (SoC) design called BlackParrot. This framework can be utilized by accelerator designers and system architects to perform an end-to-end performance analysis of coherent and non-coherent accelerators while carefully accounting for the interaction between the accelerator and the rest of the system. In the third part of our research, we present RISE, a full RISC-V SoC designed to efficiently perform message-to-ciphertext and ciphertext-to-message conversion operations. RISE comprises a BlackParrot core and an efficient custom-designed accelerator tailored to accelerate end-to-end message-to-ciphertext and ciphertext-to-message conversion operations. Our RTL-based evaluation demonstrates that RISE improves the throughput of the video encryption application by 10x-27x for different frame resolutions.
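
For readers unfamiliar with the number theoretic transform named in the abstract above, the following is a minimal, unoptimized sketch of what an NTT computes. It is not taken from the thesis: the modulus `Q = 17`, the length `N = 8`, the root `OMEGA = 2`, and all function names are toy assumptions chosen for illustration, and the product is cyclic (modulo x^N - 1), whereas HE schemes typically use a negacyclic variant with much larger parameters.

```python
# Toy parameters (assumptions for illustration, not from the source).
Q = 17      # small prime modulus
N = 8       # transform length
OMEGA = 2   # primitive N-th root of unity mod Q: 2**8 % 17 == 1 and 2**4 % 17 == 16


def ntt(a, omega=OMEGA, q=Q):
    """Naive O(n^2) forward NTT: evaluate the polynomial a at powers of omega."""
    n = len(a)
    return [sum(a[j] * pow(omega, i * j, q) for j in range(n)) % q
            for i in range(n)]


def intt(a_hat, omega=OMEGA, q=Q):
    """Naive inverse NTT: same sum with omega^-1, scaled by n^-1 (Python 3.8+)."""
    n = len(a_hat)
    n_inv = pow(n, -1, q)
    omega_inv = pow(omega, -1, q)
    return [(n_inv * sum(a_hat[j] * pow(omega_inv, i * j, q) for j in range(n))) % q
            for i in range(n)]


def cyclic_multiply(a, b, q=Q):
    """Multiply two polynomials modulo (x^N - 1, q) via pointwise NTT products."""
    pointwise = [(x * y) % q for x, y in zip(ntt(a), ntt(b))]
    return intt(pointwise)
```

The transform turns polynomial multiplication into a pointwise product, which is why it dominates the conversion operations; real implementations replace the quadratic loops here with butterfly networks.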

    Homodyne spin noise spectroscopy and noise spectroscopy of a single quantum dot

    The steady-state fluctuations of a spin system are closely interlinked with its dynamics in linear response to external perturbations. Spin noise spectroscopy exploits this link to extract parameters characterizing the dynamics without needing an intricate spin polarization scheme. In samples with an accessible optical resonance, the spin fluctuations are imprinted onto a transmitted linearly polarized quasi-resonant probe laser beam according to the optical selection rules, making an all-optical observation of spin dynamics possible. The beam’s detuning and intensity determine whether the system is probed at thermal equilibrium or under optical driving. The technique is uniquely applicable for studying single quantum dots, where a charge carrier’s spin and occupancy dynamics can be observed simultaneously. This thesis presents a step-by-step derivation of the shape and statistical properties of experimental spectra and highlights the experimental limitations faced by the technique at very low probe intensities through uncorrelated broadband technical noise contributions. Optical homodyne amplification is evaluated in a proof-of-principle experiment to determine whether this limitation can be overcome at low frequencies < 5 MHz. Unlike previous attempts, the presented proof-of-principle experiment demonstrates that shot-noise limited spin noise measurements are possible in low-frequency ranges down to ≳ 100 kHz. For even lower frequencies, the suppression of laser intensity noise by the limited common-mode rejection of conventional balanced detectors is found to be the limiting contribution. In the second part of the thesis, optical spin noise spectroscopy is used to conduct a long-term study of spin and occupancy dynamics of an individual hole spin confined in an (In,Ga)As quantum dot with high radial symmetry in the high magnetic fields regime.
For magnetic fields ≳ 250 mT, the splitting of the Zeeman branches with an effective g-factor of 2.159(2) exceeds the quantum dot’s trion resonance’s homogeneous line width of 6.3(2) μeV, revealing a rich spectral structure of spin and occupancy dynamics. This structure reveals a so far neglected contribution of an internal photoeffect to the charge dynamics between the quantum dot and its environment. Previously developed theoretical modeling is extended to incorporate the photoeffect and successfully achieves excellent qualitative correspondence with experimental spectra for almost all detuning ranges. The photoeffect shuffles the charge from and into the quantum dot with two distinct rates. Within the model, the previously required Auger process is unnecessary to describe the experimental data. The rates of discharging and recharging the quantum dot are determined to be on the order of 12(7) kHz·μm²·nW⁻¹ and 6(2) kHz·μm²·nW⁻¹, respectively. For magnetic fields < 500 mT, very long T1 hole spin relaxation times ≫ 1 ms are observed, while above 500 mT, T1 falls to 5(2) μs at 2.5 T, qualitatively confirming the theoretical prediction of a single-phonon mediated relaxation process. Furthermore, the electron spin relaxation time T1 in the trion state shows no pronounced dependence on magnetic fields above 500 mT and stays at a constant value of 101(2) ns. The saturation intensity of the transition also does not depend on the magnetic field and stays at a constant value of 4.8(7) nW·μm⁻².

    Memory-Based FFT Architecture with Optimized Number of Multiplexers and Memory Usage

    This brief presents a new P-parallel radix-2 memory-based fast Fourier transform (FFT) architecture. The aim of this work is to reduce the number of multiplexers and achieve efficient memory usage. One advantage of the proposed architecture is that it only needs permutation circuits after the memories, which reduces the multiplexer usage to only one multiplexer per parallel branch. Another advantage is that the architecture calculates the same permutation, based on the perfect shuffle, at each iteration. Thus, the shuffling circuits do not need to be configured for different iterations. In fact, all the memories require the same read and write addresses, which simplifies the control even further and allows the memories to be merged. Along with the hardware efficiency, conflict-free memory access is ensured by a circular counter. The FFT has been implemented on a field programmable gate array. Compared to previous approaches, the proposed architecture has the fewest multiplexers and achieves very low area usage. (Published version; peer reviewed.)
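
The idea of applying the same perfect-shuffle permutation at every iteration can be illustrated in software with a constant-geometry (Pease-style) radix-2 FFT. This is only a sketch under that assumption, not the hardware architecture of the brief: it does not model the P parallel branches, the memories, or the circular counter, and the function names are hypothetical.

```python
import cmath


def constant_geometry_fft(x):
    """Radix-2 decimation-in-frequency FFT with identical data flow per stage.

    Every stage reads the pair (i, i + N/2) and writes its two results to
    (2i, 2i + 1), so the outputs always land in perfect-shuffle order.
    Only the twiddle factors change between stages.
    The result is returned in bit-reversed order.
    """
    n = len(x)
    stages = n.bit_length() - 1
    if n == 0 or 1 << stages != n:
        raise ValueError("length must be a power of two")
    half = n // 2
    data = [complex(v) for v in x]
    for s in range(stages):
        out = [0j] * n
        for i in range(half):
            a, b = data[i], data[i + half]
            k = (i >> s) << s                      # twiddle exponent for this stage
            w = cmath.exp(-2j * cmath.pi * k / n)
            out[2 * i] = a + b                     # sum goes to the even slot
            out[2 * i + 1] = (a - b) * w           # twiddled difference to the odd slot
        data = out
    return data


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def fft_natural_order(x):
    """Undo the bit-reversed output ordering of constant_geometry_fft."""
    n = len(x)
    bits = n.bit_length() - 1
    y = constant_geometry_fft(x)
    return [y[bit_reverse(k, bits)] for k in range(n)]
```

Because the read and write index patterns never depend on the stage number, a hardware realization needs no per-iteration reconfiguration of the shuffle, which is the property the abstract highlights.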

    Approximate Computing for Energy Efficiency
