15 research outputs found

    The rate of convergence of polygonal-line approximations of curves

    Tokyo Metropolitan University (首都大学東京), 2016-03-25, Master of Science, 首都大学東

    A software shared virtual memory system with three way coherence protocols on the Intel single-chip cloud computer

    With advances in the design and fabrication of high-performance integrated circuits, processors with more than 1,000 cores per die are likely to appear in the near future. However, these many-core architectures introduce many challenges at the memory system level, such as complicated cache coherence and limited memory access speed. This thesis focuses on one prominent many-core prototype: Intel's Single-chip Cloud Computer (SCC). The SCC architecture does not provide hardware cache coherency. Instead, it relies on on-chip programmable memory. The baseline coherence protocol for the SCC is the Software Managed Coherence (SMC) layer. To achieve memory consistency, SMC accesses shared memory while bypassing part of the typical cache hierarchy, so that invalidation and flushing are efficient. We found that the performance of this coherence layer is sub-optimal, because every access to shared memory turns into a data update message within the network mesh. As cache locality cannot be exploited to its full potential, the execution pipelines stall much more often for memory fetches from outside the chip. This research addresses the performance problem of shared virtual memory consistency on this non-cache-coherent architecture. Aiming to keep data on-chip as much as possible and so reduce memory accesses external to the chip, we propose two techniques that make full use of the cache hierarchy and keep data resident in the on-chip scratchpad memory. First, targeting the architectural specifics of the hardware, we redesigned traditional software distributed shared memory (SDSM) so that shared data can be treated transparently like private memory, allowing the cache hierarchy to be fully utilised without sacrificing memory consistency.
Second, we propose a distance-aware page allocation scheme that samples access frequencies and selects the most frequently and recently used pages to be stored in the on-chip scratchpad memory. Our experimental results show that our first technique, the ordinary SDSM, outperforms the current SMC approach by a factor of 5. Moreover, in some cases, the second technique, which is based on scratchpad memory, improves performance by an additional factor of 1.57. Our experiments also demonstrate that the SMC approach is not scalable, because coherence traffic congests the network mesh, while the two new approaches continue to scale well. The main contribution of this research is the implementation of a software cache coherence library for an architecture that has non-coherent cache hardware and relies solely on software-defined caching. This new cache hierarchy opens the door to smarter and faster data sharing between processor cores without the need for complicated cache coherence hardware.
    published_or_final_version, Computer Science, Master, Master of Philosoph
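The page selection step described in the abstract (sample accesses, then keep the most frequently and recently used pages in scratchpad memory) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the thesis's implementation: the function name `select_scratchpad_pages`, the exponential `decay` used to weigh recent samples more heavily, and the `capacity` parameter are all hypothetical, and the "distance-aware" part of the scheme is not modeled.

```python
from collections import defaultdict


def select_scratchpad_pages(samples, capacity, decay=0.5):
    """Pick up to `capacity` page ids to keep in scratchpad memory.

    `samples` is a list of sampling epochs (oldest first); each epoch is a
    list of page ids observed during that epoch. Hypothetical scoring rule:
    an access counts 1.0, and all scores are multiplied by `decay` at the
    start of every later epoch, so recent accesses weigh more than old ones.
    """
    score = defaultdict(float)
    for epoch in samples:
        # Age the scores accumulated in earlier epochs.
        for page in score:
            score[page] *= decay
        # Count the accesses sampled in this epoch.
        for page in epoch:
            score[page] += 1.0
    # Highest score first; ties broken by page id for determinism.
    ranked = sorted(score, key=lambda p: (-score[p], p))
    return ranked[:capacity]


if __name__ == "__main__":
    # Page 1 was hot early on, pages 2 and 3 are hot now.
    epochs = [[1, 1, 1, 2], [2, 3], [3, 3, 2]]
    print(select_scratchpad_pages(epochs, capacity=2))
```

In this toy run page 1 has the most accesses in the first epoch but loses out to pages accessed more recently, which is the intended effect of combining frequency with recency.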

    Design and simulation of a disc-type magnetorheological brake based on rectangular bumps

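    Based on the rheological properties of magnetorheological fluid, a disc-type magnetorheological brake with rectangular bumps machined on the disc surface was designed. A formula for its braking torque was derived, a simulation model was built in Matlab/Simulink, and the dimensional parameters of the rectangular bumps were iteratively optimised. The simulation results show that with 12 rectangular bumps, each 2 mm wide and 1.5 mm deep, the braking torque reaches 626 N·m, 31.5% higher than the torque without bumps, indicating that the designed magnetorheological brake can effectively increase braking torque.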
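As a quick check on the reported figures, the torque of the brake without bumps can be back-computed from the two numbers in the abstract. This is only arithmetic on the stated values (626 N·m and a 31.5% increase); the baseline itself is not given in the abstract, and the variable names are illustrative.

```python
# Values stated in the abstract.
torque_with_bumps = 626.0  # N·m, with 12 rectangular bumps
relative_gain = 0.315      # 31.5% increase over the brake without bumps

# Implied baseline: T_with = T_without * (1 + gain).
torque_without_bumps = torque_with_bumps / (1.0 + relative_gain)

if __name__ == "__main__":
    print(f"Implied torque without bumps: {torque_without_bumps:.1f} N·m")
```

The implied baseline is roughly 476 N·m.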