263 research outputs found

    GPU Accelerated protocol analysis for large and long-term traffic traces

    Get PDF
    This thesis describes the design and implementation of GPF+, a complete general packet classification system developed using Nvidia CUDA for Compute Capability 3.5+ GPUs. This system was developed with the aim of accelerating the analysis of arbitrary network protocols within network traffic traces using inexpensive, massively parallel commodity hardware. GPF+ and its supporting components are specifically intended to support the processing of large, long-term network packet traces such as those produced by network telescopes, which are currently difficult and time consuming to analyse. The GPF+ classifier is based on prior research in the field, which produced a prototype classifier called GPF, targeted at Compute Capability 1.3 GPUs. GPF+ greatly extends the GPF model, improving runtime flexibility and scalability, whilst maintaining high execution efficiency. GPF+ incorporates a compact, lightweight registerbased state machine that supports massively-parallel, multi-match filter predicate evaluation, as well as efficient arbitrary field extraction. GPF+ tracks packet composition during execution, and adjusts processing at runtime to avoid redundant memory transactions and unnecessary computation through warp-voting. GPF+ additionally incorporates a 128-bit in-thread cache, accelerated through register shuffling, to accelerate access to packet data in slow GPU global memory. GPF+ uses a high-level DSL to simplify protocol and filter creation, whilst better facilitating protocol reuse. The system is supported by a pipeline of multi-threaded high-performance host components, which communicate asynchronously through 0MQ messaging middleware to buffer, index, and dispatch packet data on the host system. The system was evaluated using high-end Kepler (Nvidia GTX Titan) and entry level Maxwell (Nvidia GTX 750) GPUs. The results of this evaluation showed high system performance, limited only by device side IO (600MBps) in all tests. GPF+ maintained high occupancy and device utilisation in all tests, without significant serialisation, and showed improved scaling to more complex filter sets. Results were used to visualise captures of up to 160 GB in seconds, and to extract and pre-filter captures small enough to be easily analysed in applications such as Wireshark

    Simulation models of shared-memory multiprocessor systems

    Get PDF

    Dagstuhl News January - December 2008

    Get PDF
    "Dagstuhl News" is a publication edited especially for the members of the Foundation "Informatikzentrum Schloss Dagstuhl" to thank them for their support. The News give a summary of the scientific work being done in Dagstuhl. Each Dagstuhl Seminar is presented by a small abstract describing the contents and scientific highlights of the seminar as well as the perspectives or challenges of the research topic

    Developing a distributed electronic health-record store for India

    Get PDF
    The DIGHT project is addressing the problem of building a scalable and highly available information store for the Electronic Health Records (EHRs) of the over one billion citizens of India

    Pre-validation of SoC via hardware and software co-simulation

    Get PDF
    Abstract. System-on-chips (SoCs) are complex entities consisting of multiple hardware and software components. This complexity presents challenges in their design, verification, and validation. Traditional verification processes often test hardware models in isolation until late in the development cycle. As a result, cooperation between hardware and software development is also limited, slowing down bug detection and fixing. This thesis aims to develop, implement, and evaluate a co-simulation-based pre-validation methodology to address these challenges. The approach allows for the early integration of hardware and software, serving as a natural intermediate step between traditional hardware model verification and full system validation. The co-simulation employs a QEMU CPU emulator linked to a register-transfer level (RTL) hardware model. This setup enables the execution of software components, such as device drivers, on the target instruction set architecture (ISA) alongside cycle-accurate RTL hardware models. The thesis focuses on two primary applications of co-simulation. Firstly, it allows software unit tests to be run in conjunction with hardware models, facilitating early communication between device drivers, low-level software, and hardware components. Secondly, it offers an environment for using software in functional hardware verification. A significant advantage of this approach is the early detection of integration errors. Software unit tests can be executed at the IP block level with actual hardware models, a task previously only possible with costly system-level prototypes. This enables earlier collaboration between software and hardware development teams and smoothens the transition to traditional system-level validation techniques.JÀrjestelmÀpiirin esivalidointi laitteiston ja ohjelmiston yhteissimulaatiolla. TiivistelmÀ. JÀrjestelmÀpiirit (SoC) ovat monimutkaisia kokonaisuuksia, jotka koostuvat useista laitteisto- ja ohjelmistokomponenteista. TÀmÀ monimutkaisuus asettaa haasteita niiden suunnittelulle, varmennukselle ja validoinnille. Perinteiset varmennusprosessit testaavat usein laitteistomalleja eristyksissÀ kehityssyklin loppuvaiheeseen saakka. TÀmÀn myötÀ myös yhteistyö laitteisto- ja ohjelmistokehityksen vÀlillÀ on vÀhÀistÀ, mikÀ hidastaa virheiden tunnistamista ja korjausta. TÀmÀn diplomityön tavoitteena on kehittÀÀ, toteuttaa ja arvioida laitteisto-ohjelmisto-yhteissimulointiin perustuva esivalidointimenetelmÀ nÀiden haasteiden ratkaisemiseksi. MenetelmÀ mahdollistaa laitteiston ja ohjelmiston varhaisen integroinnin, toimien luonnollisena vÀlietappina perinteisen laitteistomallin varmennuksen ja koko jÀrjestelmÀn validoinnin vÀlillÀ. Yhteissimulointi kÀyttÀÀ QEMU suoritinemulaattoria, joka on yhdistetty rekisterinsiirtotason (RTL) laitteistomalliin. TÀmÀ mahdollistaa ohjelmistokomponenttien, kuten laiteajureiden, suorittamisen kohdejÀrjestelmÀn kÀskysarja-arkkitehtuurilla (ISA) yhdessÀ kellosyklitarkkojen RTL laitteistomallien kanssa. Työ keskittyy kahteen yhteissimulaation pÀÀsovellukseen. EnsinnÀkin se mahdollistaa ohjelmiston yksikkötestien suorittamisen laitteistomallien kanssa, varmistaen kommunikaation laiteajurien, matalan tason ohjelmiston ja laitteistokomponenttien vÀlillÀ. Toiseksi se tarjoaa ympÀristön ohjelmiston kÀyttÀmiseen toiminnallisessa laitteiston varmennuksessa. MerkittÀvÀ etu tÀstÀ lÀhestymistavasta on integraatiovirheiden varhainen havaitseminen. Ohjelmiston yksikkötestejÀ voidaan suorittaa jo IP-lohkon tasolla oikeilla laitteistomalleilla, mikÀ on aiemmin ollut mahdollista vain kalliilla jÀrjestelmÀtason prototyypeillÀ. TÀmÀ mahdollistaa aikaisemman ohjelmisto- ja laitteistokehitystiimien vÀlisen yhteistyön ja helpottaa siirtymistÀ perinteisiin jÀrjestelmÀtason validointimenetelmiin

    Embedded System Design

    Get PDF
    A unique feature of this open access textbook is to provide a comprehensive introduction to the fundamental knowledge in embedded systems, with applications in cyber-physical systems and the Internet of things. It starts with an introduction to the field and a survey of specification models and languages for embedded and cyber-physical systems. It provides a brief overview of hardware devices used for such systems and presents the essentials of system software for embedded systems, including real-time operating systems. The author also discusses evaluation and validation techniques for embedded systems and provides an overview of techniques for mapping applications to execution platforms, including multi-core platforms. Embedded systems have to operate under tight constraints and, hence, the book also contains a selected set of optimization techniques, including software optimization techniques. The book closes with a brief survey on testing. This fourth edition has been updated and revised to reflect new trends and technologies, such as the importance of cyber-physical systems (CPS) and the Internet of things (IoT), the evolution of single-core processors to multi-core processors, and the increased importance of energy efficiency and thermal issues

    Telecommunications Networks

    Get PDF
    This book guides readers through the basics of rapidly emerging networks to more advanced concepts and future expectations of Telecommunications Networks. It identifies and examines the most pressing research issues in Telecommunications and it contains chapters written by leading researchers, academics and industry professionals. Telecommunications Networks - Current Status and Future Trends covers surveys of recent publications that investigate key areas of interest such as: IMS, eTOM, 3G/4G, optimization problems, modeling, simulation, quality of service, etc. This book, that is suitable for both PhD and master students, is organized into six sections: New Generation Networks, Quality of Services, Sensor Networks, Telecommunications, Traffic Engineering and Routing

    Embedded System Design

    Get PDF
    A unique feature of this open access textbook is to provide a comprehensive introduction to the fundamental knowledge in embedded systems, with applications in cyber-physical systems and the Internet of things. It starts with an introduction to the field and a survey of specification models and languages for embedded and cyber-physical systems. It provides a brief overview of hardware devices used for such systems and presents the essentials of system software for embedded systems, including real-time operating systems. The author also discusses evaluation and validation techniques for embedded systems and provides an overview of techniques for mapping applications to execution platforms, including multi-core platforms. Embedded systems have to operate under tight constraints and, hence, the book also contains a selected set of optimization techniques, including software optimization techniques. The book closes with a brief survey on testing. This fourth edition has been updated and revised to reflect new trends and technologies, such as the importance of cyber-physical systems (CPS) and the Internet of things (IoT), the evolution of single-core processors to multi-core processors, and the increased importance of energy efficiency and thermal issues

    Simulation Native des SystÚmes Multiprocesseurs sur Puce à l'aide de la Virtualisation Assistée par le Matériel

    Get PDF
    L'intĂ©gration de plusieurs processeurs hĂ©tĂ©rogĂšnes en un seul systĂšme sur puce (SoC) est une tendance claire dans les systĂšmes embarquĂ©s. La conception et la vĂ©rification de ces systĂšmes nĂ©cessitent des plateformes rapides de simulation, et faciles Ă  construire. Parmi les approches de simulation de logiciels, la simulation native est un bon candidat grĂące Ă  l'exĂ©cution native de logiciel embarquĂ© sur la machine hĂŽte, ce qui permet des simulations Ă  haute vitesse, sans nĂ©cessiter le dĂ©veloppement de simulateurs d'instructions. Toutefois, les techniques de simulation natives existantes exĂ©cutent le logiciel de simulation dans l'espace de mĂ©moire partagĂ©e entre le matĂ©riel modĂ©lisĂ© et le systĂšme d'exploitation hĂŽte. Il en rĂ©sulte de nombreux problĂšmes, par exemple les conflits l'espace d'adressage et les chevauchements de mĂ©moire ainsi que l'utilisation des adresses de la machine hĂŽte plutĂŽt des celles des plates-formes matĂ©rielles cibles. Cela rend pratiquement impossible la simulation native du code existant fonctionnant sur la plate-forme cible. Pour surmonter ces problĂšmes, nous proposons l'ajout d'une couche transparente de traduction de l'espace adressage pour sĂ©parer l'espace d'adresse cible de celui du simulateur de hĂŽte. Nous exploitons la technologie de virtualisation assistĂ©e par matĂ©riel (HAV pour Hardware-Assisted Virtualization) Ă  cet effet. Cette technologie est maintenant disponibles sur plupart de processeurs grande public Ă  usage gĂ©nĂ©ral. Les expĂ©riences montrent que cette solution ne dĂ©grade pas la vitesse de simulation native, tout en gardant la possibilitĂ© de rĂ©aliser l'Ă©valuation des performances du logiciel simulĂ©. La solution proposĂ©e est Ă©volutive et flexible et nous fournit les preuves nĂ©cessaires pour appuyer nos revendications avec des solutions de simulation multiprocesseurs et hybrides. Nous abordons Ă©galement la simulation d'exĂ©cutables cross- compilĂ©s pour les processeurs VLIW (Very Long Instruction Word) en utilisant une technique de traduction binaire statique (SBT) pour gĂ©nĂ©rĂ© le code natif. Ainsi il n'est pas nĂ©cessaire de faire de traduction Ă  la volĂ©e ou d'interprĂ©tation des instructions. Cette approche est intĂ©ressante dans les situations oĂč le code source n'est pas disponible ou que la plate-forme cible n'est pas supportĂ© par les compilateurs reciblable, ce qui est gĂ©nĂ©ralement le cas pour les processeurs VLIW. Les simulateurs gĂ©nĂ©rĂ©s s'exĂ©cutent au-dessus de notre plate-forme basĂ©e sur le HAV et modĂ©lisent les processeurs de la sĂ©rie C6x de Texas Instruments (TI). Les rĂ©sultats de simulation des binaires pour VLIW montrent une accĂ©lĂ©ration de deux ordres de grandeur par rapport aux simulateurs prĂ©cis au cycle prĂšs.Integration of multiple heterogeneous processors into a single System-on-Chip (SoC) is a clear trend in embedded systems. Designing and verifying these systems require high-speed and easy-to-build simulation platforms. Among the software simulation approaches, native simulation is a good candidate since the embedded software is executed natively on the host machine, resulting in high speed simulations and without requiring instruction set simulator development effort. However, existing native simulation techniques execute the simulated software in memory space shared between the modeled hardware and the host operating system. This results in many problems, including address space conflicts and overlaps as well as the use of host machine addresses instead of the target hardware platform ones. This makes it practically impossible to natively simulate legacy code running on the target platform. To overcome these issues, we propose the addition of a transparent address space translation layer to separate the target address space from that of the host simulator. We exploit the Hardware-Assisted Virtualization (HAV) technology for this purpose, which is now readily available on almost all general purpose processors. Experiments show that this solution does not degrade the native simulation speed, while keeping the ability to accomplish software performance evaluation. The proposed solution is scalable as well as flexible and we provide necessary evidence to support our claims with multiprocessor and hybrid simulation solutions. We also address the simulation of cross-compiled Very Long Instruction Word (VLIW) executables, using a Static Binary Translation (SBT) technique to generated native code that does not require run-time translation or interpretation support. This approach is interesting in situations where either the source code is not available or the target platform is not supported by any retargetable compilation framework, which is usually the case for VLIW processors. The generated simulators execute on top of our HAV based platform and model the Texas Instruments (TI) C6x series processors. Simulation results for VLIW binaries show a speed-up of around two orders of magnitude compared to the cycle accurate simulators.SAVOIE-SCD - Bib.Ă©lectronique (730659901) / SudocGRENOBLE1/INP-Bib.Ă©lectronique (384210012) / SudocGRENOBLE2/3-Bib.Ă©lectronique (384219901) / SudocSudocFranceF

    Proxy compilation for Java via a code migration technique

    Get PDF
    There is an increasing trend that intermediate representations (IRs) are used to deliver programs in more and more languages, such as Java. Although Java can provide many advantages, including a wider portability and better optimisation opportunities on execution, it introduces extra overhead by requiring an IR translation for the program execution. For maximum execution performance, an optimising compiler is placed in the runtime to selectively optimise code regions regarded as “hotspots”. This common approach has been effectively deployed in many implementation of programming languages. However, the computational resources demanded by this approach made it less efficient, or even difficult to deploy directly in a resourceconstrained environment. One implementation approach is to use a remote compilation technique to support compilation during the execution. The work presented in this dissertation supports the thesis that execution performance can be improved by the use of efficient optimising compilation by using a proxy dynamic optimising compiler. After surveying various approaches to the design and implementation of remote compilation, a proxy compilation system called Apus is defined. To demonstrate the effectiveness of using a dynamic optimising compiler as a proxy compiler, a complete proxy compilation system is written based on a research-oriented Java VirtualMachine (JVM). The proxy compilation system is discussed in detail, showing how to deliver remote binaries and manage a cache of binaries by using a code migration approach. The proxy compilation client shows how the proxy compilation service is integrated with the selective optimisation system to maximise execution performance. The results of empirical measurements of the system are given, showing the efficiency of code optimisation from either the proxy compilation service and a local binary cache. The conclusion of this work is that Java execution performance can be improved by efficient optimising compilation with a proxy compilation service by using a code migration technique
    • 

    corecore