
    Cross-talk-free dual-color fluorescence cross-correlation spectroscopy for high-throughput screening

    High throughput processing of chemical/biochemical information is critical in many areas, such as genome sequencing, drug discovery and clinical diagnostics. Integral to collecting information at high rates with the necessary throughput is the development of systems that can not only monitor the results with high precision and accuracy, but also prepare and process samples prior to the analytical measurement. To achieve the required throughput, we have conducted work directed toward developing a system that provides detection sensitivity at the single-molecule detection (SMD) level. The research was first focused on the development of a sensitive strategy for the detection of proteins (thrombin) at the SMD level. Nucleic acid-based fluorescence sensors were used as recognition elements for the detection of single protein molecules with single-pair fluorescence resonance energy transfer. The technique provided higher analytical sensitivity compared to bulk analog measurements due to the digital readout format (i.e., molecular counting) and also reduced assay turn-around time. Research was then directed toward the design and construction of a two-color FCCS system, which employed two spectrally separate fluorophores, Cy3 (λabs = 532 nm, λem = 560 nm) and IRD800 (λabs = 780 nm, λem = 810 nm). The system provided negligible color cross-talk (cross-excitation and/or cross-emission) and negligible fluorescence resonance energy transfer (FRET). To provide evidence of cross-talk-free operation, the system was evaluated using microspheres and quantum dots. Experimental results indicated no color leakage from the microspheres or quantum dots into inappropriate color channels. The enzymatic activity of APE1 was monitored by FCCS using a double-stranded DNA substrate that was dual labeled with Cy3 and IRD800. Activity of APE1 was also monitored in the presence of an inhibitor (7-nitroindole-2-carboxylic acid). To improve sample processing throughput, a multi-phase (water-in-oil) segmented flow microfluidic chip was studied using the FCCS system to monitor APE1 enzyme activity. Aqueous droplets were generated in a perfluorocarbon (FC-3283) carrier fluid with a nonionic surfactant (perfluorooctanol, 10% v/v) in a polymer microchip. The optical system successfully monitored the controlled generation of highly regular droplets loaded with fluorescent beads at delivery rates ranging from 40 to 60 droplets per second.
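For context, dual-color FCCS reports the co-diffusion of the two labels through the cross-correlation of the fluorescence fluctuations recorded in the two spectral channels; a generic textbook form (not taken from the work above) is:

```latex
% Cross-correlation of fluctuations \delta F = F - \langle F \rangle
% in channels i (e.g., Cy3) and j (e.g., IRD800)
G_{ij}(\tau) = \frac{\langle \delta F_i(t)\,\delta F_j(t+\tau)\rangle}
                    {\langle F_i(t)\rangle\,\langle F_j(t)\rangle}
```

In an assay such as the APE1 one described above, the cross-correlation amplitude is high only while both labels reside on the same intact DNA substrate, so enzymatic cleavage of the dual-labeled substrate appears as a decay of the cross-correlation amplitude over the course of the reaction.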

    Spectrum Sharing Methods in Coexisting Wireless Networks

    Radio spectrum, the fundamental basis for wireless communication, is a finite resource. The expanding range of radio-based devices and services in recent years makes the spectrum scarce and hence more costly under the paradigm of extensive regulation for licensing. However, with mature technologies and their continuous improvement, it becomes apparent that tight licensing might no longer be required for all wireless services. This is where the concept of utilizing unlicensed bands for wireless communication originates. As a promising step to reduce the substantial cost of radio spectrum, networks based on different wireless technologies are being deployed to operate in the same spectrum bands, particularly in the unlicensed bands, resulting in coexistence. However, uncoordinated coexistence often leads to cases where collocated wireless systems experience heavy mutual interference. Hence, the development of spectrum sharing rules to mitigate the interference among wireless systems is a significant challenge, considering the uncoordinated, heterogeneous systems. The need for spectrum sharing rules is increasing tremendously, on the one hand to fulfill the current and future demand for wireless communication by the users, and on the other hand to utilize the spectrum efficiently. In this thesis, contributions are provided towards dynamic and cognitive spectrum sharing with a focus on the medium access control (MAC) layer, for uncoordinated scenarios of homogeneous and heterogeneous wireless networks at a micro-scale level, highlighting QoS support for the applications. This thesis proposes a generic and novel spectrum sharing method based on a hypothesis: the regular channel occupation by one system can support other systems in predicting spectrum opportunities reliably. These opportunities can then be utilized efficiently, resulting in fair spectrum sharing as well as improved aggregate performance compared to the case without special treatment. The developed method, denoted as Regular Channel Access (RCA), is modeled for systems specified by the wireless local and metropolitan area network standards IEEE 802.11 and IEEE 802.16, respectively. In the modeling, both systems are explored according to their respective centrally controlled channel access mechanisms, and the adapted models are evaluated through simulation and analysis of the results. A conceptual model of spectrum sharing based on the distributed channel access mechanism of the IEEE 802.11 system is provided as well. To make the RCA method adaptive, the following enabling techniques are developed and integrated in the design: an RSS-based (Received Signal Strength based) detection method for measuring channel occupation, a pattern-recognition-based algorithm for system identification, statistical-knowledge-based estimation of traffic demand, and an inference engine for reconfiguration of resource allocation as a response to traffic dynamics. The advantage of the RCA method is demonstrated by configuring each competing collocated system with a resource allocation based on the estimated traffic demand of the systems. The simulation and the analysis of the results show a significant improvement in aggregated throughput, mean delay, and packet loss ratio, compared to the case where legacy wireless systems coexist. The results from adaptive RCA show its resilience in the case of dynamic traffic. The maximum achievable throughput between collocated IEEE 802.11 systems applying RCA is derived by means of mathematical calculation. The results of this thesis provide a basis for the development of resource allocation methods for future wireless networks, particularly those intended to operate in current unlicensed bands and in future models of the Open Spectrum Alliance.
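As a rough illustration of the enabling techniques listed above (a minimal sketch with hypothetical names and thresholds, not code from the thesis), the RCA idea can be reduced to three steps: detect channel occupation from RSS samples, estimate the period of the regular occupation, and predict the next spectrum opportunity:

```python
import numpy as np

def detect_busy(rss_dbm, threshold_dbm=-82.0):
    """Binary channel-occupation trace from RSS samples (simple energy detection)."""
    return np.asarray(rss_dbm) > threshold_dbm

def estimate_period_ms(busy, sample_interval_ms=1.0):
    """Estimate the repetition period of a regular channel occupation from the
    autocorrelation of the busy/idle trace returned by detect_busy()."""
    x = busy.astype(float) - busy.mean()
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]
    lag = np.argmax(acf[1:]) + 1          # dominant non-zero lag ~ occupation period
    return lag * sample_interval_ms

def next_opportunity(last_busy_start_ms, busy_duration_ms, period_ms):
    """Predict the start and length (in ms) of the next idle window."""
    return last_busy_start_ms + busy_duration_ms, period_ms - busy_duration_ms
```

A collocated system would schedule its own transmissions inside the predicted idle windows; the adaptive parts of RCA (system identification, traffic-demand estimation, reconfiguration) would sit on top of primitives of this kind.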

    Quality of Service optimisation framework for Next Generation Networks

    Within recent years, the concept of Next Generation Networks (NGN) has become widely accepted within the telecommunication area, in parallel with the migration of telecommunication networks from traditional circuit-switched technologies such as ISDN (Integrated Services Digital Network) towards packet-switched NGN. In this context, SIP (Session Initiation Protocol), originally developed for Internet use only, has emerged as the major signalling protocol for multimedia sessions in IP (Internet Protocol) based NGN. One of the traditional limitations of IP when faced with the challenges of real-time communications is the lack of quality support at the network layer. In line with NGN specification work, international standardisation bodies have defined a sophisticated QoS (Quality of Service) architecture for NGN, controlling IP transport resources and conventional IP QoS mechanisms through centralised higher-layer network elements via cross-layer signalling. Being able to centrally control QoS conditions for any media session in NGN without the imperative of a cross-layer approach would result in a feasible and less complex NGN architecture. In particular, the demand for additional network elements would decrease, resulting in the reduction of system and operational costs in both service and transport infrastructure. This thesis proposes a novel framework for QoS optimisation for media sessions in SIP-based NGN without the need for cross-layer signalling. One key contribution of the framework is the approach of identifying and logically grouping media sessions that encounter similar QoS conditions, which is performed by applying pattern recognition and clustering techniques. Based on this novel methodology, the framework provides functions and mechanisms for comprehensive resource-saving QoS estimation, adaptation of QoS conditions, and support of Call Admission Control. The framework can be integrated with any arbitrary SIP-IP-based real-time communication infrastructure, since it does not require access to any particular QoS control or monitoring functionality provided within the IP transport network. The proposed framework concept has been deployed and validated in a prototypical simulation environment. Simulation results show MOS (Mean Opinion Score) improvement rates between 53 and 66 percent without any active control of transport network resources. Overall, the proposed framework is an effective concept for centrally controlled QoS optimisation in NGN without the need for cross-layer signalling. As such, whether run stand-alone or combined with conventional QoS control mechanisms, the framework provides a comprehensive basis for both the reduction of complexity and the mitigation of issues associated with QoS provision in NGN.
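As an illustration of the session-grouping step (a minimal sketch with hypothetical per-session metrics and a plain k-means, not the specific pattern recognition and clustering techniques of the thesis), sessions can be grouped by the similarity of their observed QoS conditions:

```python
import numpy as np

def kmeans(features, k=2, iters=50, seed=0):
    """Plain k-means used to group sessions with similar QoS measurements."""
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), k, replace=False)]
    for _ in range(iters):
        # assign every session to the nearest cluster centre
        labels = np.argmin(np.linalg.norm(features[:, None] - centers, axis=2), axis=1)
        # move each centre to the mean of its assigned sessions
        centers = np.array([features[labels == c].mean(axis=0) if np.any(labels == c)
                            else centers[c] for c in range(k)])
    return labels

# hypothetical per-session measurements: [delay_ms, jitter_ms, loss_percent]
sessions = np.array([[ 40.0,  5.0, 0.1], [ 45.0,  6.0, 0.2],   # similar, lightly loaded path
                     [120.0, 30.0, 2.0], [130.0, 28.0, 2.5],   # similar, congested path
                     [ 80.0, 15.0, 1.0]])
print(kmeans(sessions))   # sessions sharing a label are treated as one QoS group
```

A QoS estimate obtained for one member of a group can then be reused for the other members, which is the resource-saving idea behind the framework.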

    Data acquisition for Germanium-detector arrays

    The conversion of analog to digital signals and the subsequent online/offline processing is the technological prerequisite for numerous experiments. So-called analog-to-digital converters (ADCs) and FPGAs (field-programmable gate arrays) are frequently used for these tasks. This thesis describes the evaluation of the FPGA and ADC components for the planned FlashCAM 2.0 DAQ (FC2.0 DAQ). Development of the first FlashCAM (1.0) DAQ (FC1.0 DAQ) began in 2012 under the lead of the Max-Planck-Institut für Kernphysik and was originally an exclusive development for the Cherenkov Telescope Array (CTA) experiment. In the meantime, FlashCAM is used in numerous experiments (HESS, HAWK, LEGEND-200, etc.) involving both photomultipliers (PMTs) and high-purity germanium (HPGe) detectors. The two detector types differ massively in their requirements, and both can be covered by the new DAQ. The subject of this thesis covers the entire functional scope of a modern DAQ. Modern DAQ systems require the highest possible read-out performance between the DAQ board and the server controlling it. The implementation of a high-performance firmware design and of a hardware/software interface tailored to it is presented using the Zynq family as an example. The Xilinx Zynq family is of particular interest because the hardware manufacturer Trenz Elektronik offers a flexible, simply pluggable module concept with various SoCs of the Zynq series. Besides the read-out performance of a DAQ, its resolution limit is of decisive importance for the success of the final experiment. The FADC card used must therefore exhibit excellent SNR and linearity properties. Evaluating such FADC cards requires a test setup whose signal purity and stability exceed the demanding requirements of the devices under test. In practice, these conditions can only be achieved at great (cost) effort. Within this work, alternative test concepts were therefore also developed that can enable measurements in the experimental environment with acceptable compromises in accuracy. Since the two topics differ significantly in content, this thesis is divided into two thematic parts. The first part deals with the use of the Zynq family in the planned FlashCAM successor DAQ. The second part is dedicated to the determination of ADC nonlinearity. The most important results of this work can be summarized as follows:
▪ The high-performance (HP) interfaces of the Zynq UltraScale+ achieve a dropout-free bandwidth of 2.4 GB/s into the external memory of the Trenz modules. If the standard 1 Gb PS Ethernet connection is operated in addition, the CPU still retains a bandwidth of at least 0.5 GB/s into main memory. For the Zynq-7000 series, an efficient implementation of the HP interfaces is difficult because the CPU only achieves comparatively low memory access rates. The HP interfaces are an important design alternative, since a continuous data transfer into external memory would enable a design that is less limited by the available FPGA-internal memory. This would be particularly desirable for applications in HPGe spectroscopy, since the practical usefulness of the design used depends strongly on the available buffer size.
▪ The Accelerator Coherency Port (ACP) enables direct data transfer from the FPGA into the cache of the Zynq CPU. The designed ACP-CMA has a bandwidth of up to 2.4 GB/s and still leaves sufficient reserve for cache-CPU accesses. It is crucial that the Zynq CPU can process the cached data without stalling the ACP-CMA; if this were not the case, the CPU could not handle the preparatory work for the Ethernet transfer ("event building") while Ethernet and the ACP-CMA run in parallel. In the evaluation, a maximum event-building bandwidth of 0.7 GB/s was determined; the real maximum bandwidth is probably considerably higher. It must be stressed, however, that in practical applications additional restrictions come into effect that de facto make continuous operation of the ACP-CMA impossible. These restrictions, which are not of a fundamental nature, were not taken into account in the measurements performed. Since, furthermore, all Zynq FPGAs have a cache, the ACP-CMA is a design solution that can be implemented sensibly on all available Zynq FPGAs. This distinguishes it from the developed HP-DMA, which is often only of interest for implementations in a Zynq UltraScale FPGA.
▪ The newly developed FC2.0 prototype has already been used in experimental setups. The measurement and analysis of a γ-ray spectrum of an HPGe detector serves as an application example.
▪ The success of an ADC nonlinearity determination depends strongly on the signal purity of the input signal used. Simulations showed that the newly developed methods are only relatively weakly distorted by pulser nonlinearities. A practical comparison between the new methods and a classical method found no significant difference. The investigated methods can therefore be recommended for a future implementation in FC2.0.
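As background to the second part (a textbook sine-wave histogram test, not one of the newly developed methods), ADC differential and integral nonlinearity can be estimated from the code histogram of a full-scale sine-wave input:

```python
import numpy as np

def dnl_inl_from_sine(codes, n_bits):
    """Estimate DNL and INL (in LSB) from the output codes of an ADC driven by a
    full-scale sine wave (classical histogram test); codes are ints in [0, 2**n_bits)."""
    n_codes = 1 << n_bits
    hist = np.bincount(codes, minlength=n_codes).astype(float)
    # expected hits per code for a full-scale sine follow the arcsine distribution
    edges = np.arange(n_codes + 1) / n_codes                # normalised code boundaries
    cdf = 0.5 + np.arcsin(2.0 * edges - 1.0) / np.pi        # arcsine CDF on [0, 1]
    ideal = np.diff(cdf) * len(codes)
    dnl = hist[1:-1] / ideal[1:-1] - 1.0                    # end codes excluded (clipping)
    inl = np.cumsum(dnl)
    return dnl, inl
```

A purity-limited sine or pulser source distorts such an estimate, which is exactly why the thesis investigates test concepts that tolerate imperfect input signals.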

    Mobile Ad-Hoc Networks

    Being infrastructure-less and without central administration control, wireless ad-hoc networking is playing a more and more important role in extending the coverage of traditional wireless infrastructure (cellular networks, wireless LAN, etc.). This book includes state-of-the-art techniques and solutions for wireless ad-hoc networks. It focuses on the following topics in ad-hoc networks: quality of service and video communication, routing protocols, and cross-layer design. A few interesting problems regarding security and delay-tolerant networks are also discussed. This book aims to provide network engineers and researchers with design guidelines for large-scale wireless ad-hoc networks.

    Crowd simulation and visualization

    Large-scale simulation and visualization are essential topics in areas as different as sociology, physics, urbanism, training, and entertainment, among others. This kind of system requires vast computational power and memory resources, commonly available in High Performance Computing (HPC) platforms. Currently, the most powerful clusters have heterogeneous architectures with hundreds of thousands and even millions of cores, and industry trends suggest that exascale clusters will have thousands of millions. The technical challenges for the simulation and visualization process in the exascale era are intertwined with difficulties in other areas of research, including storage, communication, programming models, and hardware. For this reason, it is necessary to prototype, test, and deploy a variety of approaches to address the identified technical challenges and to evaluate the advantages and disadvantages of each proposed solution. The focus of this research is interactive large-scale crowd simulation and visualization that exploits the capacity of the current HPC infrastructure to the maximum and is prepared to take advantage of the next generation. The project develops a new approach to scale crowd simulation and visualization on heterogeneous computing clusters using a task-based technique. Its main characteristic is that it is hardware agnostic: it abstracts the difficulties implied by the use of heterogeneous architectures, such as memory management, scheduling, communication, and synchronization, facilitating development, maintenance, and scalability. With the goal of flexibility and making the best possible use of computing resources, the project explores different configurations to connect the simulation with the visualization engine. This kind of system has an essential use in emergencies; therefore, urban scenes were implemented as realistically as possible so that users will be ready to face real events. Path planning for large-scale crowds is a challenging problem due to the inherent dynamism of the scenes and the vast search space. A new path-finding algorithm was developed. It has a hierarchical approach which offers several advantages: it divides the search space, reducing the problem complexity; it can obtain a partial path instead of waiting for the complete one, which allows a character to start moving and compute the rest asynchronously; and it can reprocess only a part if necessary, with different levels of abstraction. A case study is presented for a crowd simulation in urban scenarios. Geolocated data produced by mobile devices are used to predict individual and crowd behavior and to detect abnormal situations in the presence of specific events. The challenge of combining all these individuals' locations with a 3D rendering of the urban environment is also addressed. The data processing and simulation approach is computationally expensive and time-critical; it therefore relies on a hybrid Cloud-HPC architecture to produce an efficient solution. Within the project, new models of behavior based on data analytics were developed, and the infrastructure was built to consult various data sources such as social networks, government agencies, or transport companies such as Uber. Ever more geolocation data and better computational resources are available, allowing analyses of greater depth; this lays the foundations to improve the simulation models of current crowds.
The use of simulations and their visualization makes it possible to observe and organize crowds in real time. The analysis before, during, and after daily mass events can reduce the risks and the associated logistics costs.
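As an illustration of the partial-path idea (a minimal sketch on a toy occupancy grid with hypothetical names; the thesis's actual hierarchical algorithm is not reproduced here), the search can be split into a coarse pass over clusters of cells and a fine pass that refines only the first leg, so a character can start moving while the rest is computed asynchronously:

```python
from collections import deque

def bfs(start, is_goal, neighbors):
    """Shortest path by breadth-first search on an unweighted graph."""
    frontier, parent = deque([start]), {start: None}
    while frontier:
        node = frontier.popleft()
        if is_goal(node):
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in neighbors(node):
            if nxt not in parent:
                parent[nxt] = node
                frontier.append(nxt)
    return None

def first_leg(grid, start, goal, cluster=8):
    """Two-level planner on an occupancy grid (0 = free, 1 = blocked): plan over
    coarse clusters, then refine only the first leg of the corridor.
    A cluster counts as traversable if it contains a free cell (an approximation)."""
    rows, cols = len(grid), len(grid[0])
    free = lambda r, c: 0 <= r < rows and 0 <= c < cols and grid[r][c] == 0
    cell_nbrs = lambda n: [(r, c) for r, c in ((n[0] + 1, n[1]), (n[0] - 1, n[1]),
                                               (n[0], n[1] + 1), (n[0], n[1] - 1)) if free(r, c)]
    clus = lambda n: (n[0] // cluster, n[1] // cluster)
    traversable = {clus((r, c)) for r in range(rows) for c in range(cols) if free(r, c)}
    clus_nbrs = lambda k: [(k[0] + dr, k[1] + dc) for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                           if (k[0] + dr, k[1] + dc) in traversable]
    corridor = bfs(clus(start), lambda k: k == clus(goal), clus_nbrs)   # coarse plan
    if corridor is None or len(corridor) < 2:
        return bfs(start, lambda n: n == goal, cell_nbrs)               # same cluster: plan directly
    allowed = set(corridor[:2])                                         # current + next cluster only
    leg_nbrs = lambda n: [m for m in cell_nbrs(n) if clus(m) in allowed]
    return bfs(start, lambda n: clus(n) == corridor[1], leg_nbrs)       # partial path into next cluster
```

Only the clusters on the first leg are expanded at the fine level, which keeps the search space small; the remaining corridor can be refined later or re-planned at a different level of abstraction if the scene changes.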

    Mitigating disturbance errors and improving RMW performance in phase-change memory systems

    Doctoral dissertation -- Graduate School of Seoul National University, College of Engineering, Department of Electrical and Computer Engineering, August 2021. Advisor: 이혁재. Phase-change memory (PCM) announces the beginning of a new era of memory systems, owing to its attractive characteristics. Many memory product manufacturers (e.g., Intel, SK Hynix, and Samsung) are developing related products. PCM can be applied to various circumstances; it is not simply limited to large-scale databases. For example, PCM has a low standby power due to its non-volatility; hence, computation-intensive applications or mobile applications (i.e., with long memory idle times) are suitable to run on PCM-based computing systems. Despite these fascinating features, PCM is still far from the general commercial market due to low reliability and long latency. In particular, low reliability has been a painful problem for PCM in past decades. As semiconductor process technology has rapidly scaled down over the years, DRAM has reached 10 nm-class process technology. In addition, it is reported that write disturbance errors (WDEs) would become a serious issue for PCM if it scales down below the 54 nm-class process technology. Therefore, addressing the problem of WDEs becomes essential to make PCM competitive with DRAM. To overcome this problem, this dissertation proposes a novel approach that can restore meta-stable cells on demand by leveraging two-level SRAM-based tables, thereby significantly reducing the number of WDEs. Furthermore, a novel randomized approach is proposed to implement a replacement policy that would originally require hundreds of read ports on SRAM. The second problem of PCM is its long latency compared to that of DRAM. In particular, PCM tries to enhance its throughput by adopting a larger transaction unit; however, the unit size, which differs from the general-purpose processor cache line, further degrades system performance due to the introduction of a read-modify-write (RMW) module. Since there has never been any research related to RMW in a PCM-based memory system, this dissertation proposes a novel architecture to enhance the overall system performance and reliability of a PCM-based memory system having an RMW module. The proposed architecture enhances data re-usability without introducing extra storage resources. Furthermore, a novel operation that merges commands regardless of command type is proposed to enhance performance notably. Another problem is the absence of a full simulation platform for PCM. While the announced features of PCM-related products (i.e., Intel Optane) are scarce due to confidentiality issues, all the available information can be integrated to develop an architecture simulator that resembles the available product. To this end, this dissertation gathers all available features of the modules in a PCM controller and implements a dedicated simulator for future research purposes.
The dissertation is organized as follows: 1 Introduction (Limitation of Traditional Main Memory Systems; Phase-Change Memory as Main Memory; Dissertation Overview); 2 Background and Previous Work (Phase-Change Memory; Mitigation Schemes for Write Disturbance Errors; Performance Enhancement for Read-Modify-Write; Architecture Simulators for PCM); 3 In-Module Disturbance Barrier (Motivation; IMDB: In-Module Disturbance Barrier; Replacement Policy; Putting All Together: Case Studies; Evaluation; Discussion; Summary); 4 Integration of an RMW Module in a PCM-Based System (Motivation; Utilization of DRAM Cache for RMW; Typeless Command Merging; An Alternative Implementation: SRC-RMW; Case Study; Evaluation; Discussion; Summary); 5 An All-Inclusive Simulator for a PCM Controller (Motivation; PCMCsim: PCM Controller Simulator; Evaluation; Summary); 6 Conclusion.
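As a rough illustration (a minimal, hypothetical sketch; not the dissertation's architecture), the typeless command merging idea can be pictured as an RMW queue in which pending reads and writes that target the same PCM transaction unit collapse into a single entry, so the medium is accessed at most once per merged line:

```python
LINE_BYTES = 64   # hypothetical PCM transaction unit, larger than a CPU cache line

class RMWQueue:
    """Sketch of typeless command merging: reads and writes to the same PCM line
    are merged into one read-modify-write entry."""
    def __init__(self):
        self.entries = {}   # line address -> {'mask', 'data', 'readers'}

    def enqueue(self, addr, is_write, data=b"", tag=None):
        line, off = addr - addr % LINE_BYTES, addr % LINE_BYTES
        e = self.entries.setdefault(line, {'mask': bytearray(LINE_BYTES),
                                           'data': bytearray(LINE_BYTES), 'readers': []})
        if is_write:                       # merge the new bytes into the pending line image
            e['data'][off:off + len(data)] = data
            e['mask'][off:off + len(data)] = b"\x01" * len(data)
        else:                              # reads merge too: served by the same line access
            e['readers'].append((tag, off))

    def issue(self, read_line, write_line):
        """Drain the queue with at most one media read and one media write per merged line."""
        completed = []
        for line, e in self.entries.items():
            old = read_line(line)          # read the full transaction unit once
            new = bytes(e['data'][i] if e['mask'][i] else old[i] for i in range(LINE_BYTES))
            if any(e['mask']):
                write_line(line, new)      # write back only if some bytes were modified
            completed += [(tag, new[off]) for tag, off in e['readers']]
        self.entries.clear()
        return completed
```

The sketch only shows why merging commands regardless of type saves media accesses; in the dissertation the merged requests additionally interact with the DRAM cache and scheduling policies it describes.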

    Cross-layer optimisation of quality of experience for video traffic

    Real-time video traffic is currently the dominant network traffic and is set to increase in volume for the foreseeable future. As this traffic is bursty, providing perceptually good video quality is a challenging task. Bursty traffic refers to the inconsistency of the video traffic level: it is sometimes at a high level and at other times at a low level. Many video traffic measurement algorithms have been proposed for measurement-based admission control. Despite all of this effort, there is no entirely satisfactory admission algorithm for variable rate flows. Furthermore, video frames are subject to loss and delay, which cause quality degradation when frames are sent without reacting to network congestion. The trade-off between perceived Quality of Experience (QoE) and the number of sessions can be optimised by exploiting the bursty nature of video traffic. This study introduces a cross-layer QoE-aware optimisation architecture for video traffic. QoE is a measure of the user's perception of the quality of a network service. The architecture addresses the problem of QoE degradation in a bottleneck network. It proposes that video sources at the application layer adapt their rate to the network environment by dynamically controlling their transmitted bit rate, while the edge of the network protects the quality of active video sessions by controlling the acceptance of new sessions through QoE-aware admission control. In particular, it seeks the most efficient way of accepting new video sessions and adapts sending rates to free up resources for more sessions whilst maintaining the QoE of the current sessions. As a pathway to this objective, the performance of video flows that react to the network load by adapting their sending rate was investigated. Although dynamic rate adaptation enhances the video quality, accepting more sessions than a link can accommodate will degrade the QoE. The video's instantaneous aggregate rate was compared to the average aggregate rate, which is a rate calculated over a measurement time window. It was found that there is no substantial difference between the two rates except for a small number of video flows, a long measurement window, or fast-moving content (such as sport), in which cases the average is smaller than the instantaneous rate. These scenarios do not always represent reality. This finding was the main motivation for proposing a novel video traffic measurement algorithm that is QoE-aware. The algorithm finds the upper limit of the total video rate that can exceed a specific link capacity without QoE degradation of ongoing video sessions. When implemented in a QoE-aware admission control, the algorithm managed to maintain the QoE for a higher number of video sessions compared to calculated-rate-based admission controls such as the Internet Engineering Task Force (IETF) standard Pre-Congestion Notification (PCN)-based admission control. Subjective tests were conducted to involve human subjects in rating the quality of videos delivered with the proposed measurement algorithm. Mechanisms proposed for optimising the QoE of video traffic are surveyed in detail in this dissertation, and the challenges of achieving this objective are discussed. Finally, the current rate adaptation capability of video applications was combined with the proposed QoE-aware admission control in a QoE-aware cross-layer architecture. The performance of the proposed architecture was evaluated against an architecture in which video applications perform rate adaptation without being managed by the admission control component. The results showed that our architecture optimises the mean Mean Opinion Score (MOS) and the number of successfully decoded video sessions without compromising the delay. The algorithms proposed in this study were implemented and evaluated using Network Simulator version 2 (NS-2), MATLAB, Evalvid, and Evalvid-RA. These software tools were selected based on their use in similar studies and their availability at the university. Data obtained from the simulations were analysed with analysis of variance (ANOVA), and the Cumulative Distribution Functions (CDF) of the performance metrics were calculated. The proposed architecture will contribute to the preparation for the massive growth of video traffic. The mathematical models of the proposed algorithms contribute to the research community.
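As an illustration of measurement-based admission control (a minimal sketch with hypothetical thresholds, not the thesis's QoE-aware algorithm), a new session is admitted only if the rate measured over a time window plus the session's declared rate stays below a limit that may exceed the nominal link capacity by some margin:

```python
import time
from collections import deque

class MeasurementBasedAdmission:
    """Sketch: admit a new video session only if the measured aggregate rate plus the
    session's declared rate stays below an upper limit on the bottleneck link."""
    def __init__(self, link_capacity_bps, overbooking=1.1, window_s=1.0):
        # overbooking > 1 stands in for an upper limit that may exceed the nominal
        # capacity for bursty flows (the value here is illustrative only)
        self.limit_bps = link_capacity_bps * overbooking
        self.window_s = window_s
        self.samples = deque()                  # (timestamp, bytes) of observed traffic

    def record(self, nbytes, now=None):
        """Account for observed traffic and drop samples older than the window."""
        now = time.monotonic() if now is None else now
        self.samples.append((now, nbytes))
        while self.samples and now - self.samples[0][0] > self.window_s:
            self.samples.popleft()

    def measured_rate_bps(self):
        return sum(b for _, b in self.samples) * 8 / self.window_s

    def admit(self, declared_rate_bps):
        return self.measured_rate_bps() + declared_rate_bps <= self.limit_bps
```

The difference in the thesis lies in how the limit is chosen: it is derived from the measured relationship between the aggregate video rate and QoE rather than from a fixed over-booking factor.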

    Advanced Wireless LAN

    The past two decades have witnessed startling advances in wireless LAN technologies, stimulated by their increasing popularity in the home, due to ease of installation, and in commercial complexes offering wireless access to their customers. This book presents some of the latest developments in wireless LAN, covering topics on the physical layer, MAC layer, QoS, and systems. It provides an opportunity for both practitioners and researchers to explore the problems that arise in the rapidly developing technologies of wireless LAN.