96 research outputs found

    Architectures for Real-Time Automatic Sign Language Recognition on Resource-Constrained Device

    Get PDF
    Powerful, handheld computing devices have proliferated among consumers in recent years. Combined with new cameras and sensors capable of detecting objects in three-dimensional space, new gesture-based paradigms of human computer interaction are becoming available. One possible application of these developments is an automated sign language recognition system. This thesis reviews the existing body of work regarding computer recognition of sign language gestures as well as the design of systems for speech recognition, a similar problem. Little work has been done to apply the well-known architectural patterns of speech recognition systems to the domain of sign language recognition. This work creates a functional prototype of such a system, applying three architectures seen in speech recognition systems, using a Hidden Markov classifier with 75-90% accuracy. A thorough search of the literature indicates that no cloud-based system has yet been created for sign language recognition and this is the first implementation of its kind. Accordingly, there have been no empirical performance analyses regarding a cloud-based Automatic Sign Language Recognition (ASLR) system, which this research provides. The performance impact of each architecture, as well as the data interchange format, is then measured based on response time, CPU, memory, and network usage across an increasing vocabulary of sign language gestures. The results discussed herein suggest that a partially-offloaded client-server architecture, where feature extraction occurs on the client device and classification occurs in the cloud, is the ideal selection for all but the smallest vocabularies. Additionally, the results indicate that for the potentially large data sets transmitted for 3D gesture classification, a fast binary interchange protocol such as Protobuf has vastly superior performance to a text-based protocol such as JSON

    Code offloading in opportunistic computing

    Get PDF
    With the advent of cloud computing, applications are no longer tied to a single device, but they can be migrated to a high-performance machine located in a distant data center. The key advantage is the enhancement of performance and consequently, the users experience. This activity is commonly referred computational offloading and it has been strenuously investigated in the past years. The natural candidate for computational offloading is the cloud, but recent results point out the hidden costs of cloud reliance in terms of latency and energy; Cuervo et. al. illustrates the limitations on cloud-based computational offloading based on WANs latency times. The dissertation confirms the results of Cuervo et. al. and illustrates more use cases where the cloud may not be the right choice. This dissertation addresses the following question: is it possible to build a novel approach for offloading the computation that overcomes the limitations of the state-of-the-art? In other words, is it possible to create a computational offloading solution that is able to use local resources when the Cloud is not usable, and remove the strong bond with the local infrastructure? To this extent, I propose a novel paradigm for computation offloading named anyrun computing, whose goal is to use any piece of higher-end hardware (locally or remotely accessible) to offloading a portion of the application. With anyrun computing I removed the boundaries that tie the solution to an infrastructure by adding locally available devices to augment the chances to succeed in offloading. To achieve the goals of the dissertation it is fundamental to have a clear view of all the steps that take part in the offloading process. To this extent, I firstly provided a categorization of such activities combined with their interactions and assessed the impact on the system. The outcome of the analysis is the mapping to the problem to a combinatorial optimization problem that is notoriously known to be NP-Hard. There are a set of well-known approaches to solving such kind of problems, but in this scenario, they cannot be used because they require a global view that can be only maintained by a centralized infrastructure. Thus, local solutions are needed. Moving further, to empirically tackle the anyrun computing paradigm, I propose the anyrun computing framework (ARC), a novel software framework whose objective is to decide whether to offload or not to any resource-rich device willing to lend assistance is advantageous compared to local execution with respect to a rich array of performance dimensions. The core of ARC is the nference nodel which receives a rich set of information about the available remote devices from the SCAMPI opportunistic computing framework developed within the European project SCAMPI, and employs the information to profile a given device, in other words, it decides whether offloading is advantageous compared to local execution, i.e. whether it can reduce the local footprint compared to local execution in the dimensions of interest (CPU and RAM usage, execution time, and energy consumption). To empirically evaluate ARC I presented a set of experimental results on the cloud, cloudlet, and opportunistic domain. In the cloud domain, I used the state of the art in cloud solutions over a set of significant benchmark problems and with three WANs access technologies (i.e. 3G, 4G, and high-speed WAN). The main outcome is that the cloud is an appealing solution for a wide variety of problems, but there is a set of circumstances where the cloud performs poorly. Moreover, I have empirically shown the limitations of cloud-based approaches, specifically, In some circumstances, problems with high transmission costs tend to perform poorly, unless they have high computational needs. The second part of the evaluation is done in opportunistic/cloudlet scenarios where I used my custom-made testbed to compare ARC and MAUI, the state of the art in computation offloading. To this extent, I have performed two distinct experiments: the first with a cloudlet environment and the second with an opportunistic environment. The key outcome is that ARC virtually matches the performances of MAUI (in terms of energy savings) in cloudlet environment, but it improves them by a 50% to 60% in the opportunistic domain

    Jitsu: Just-in-time summoning of unikernel

    Get PDF
    Network latency is a problem for all cloud services. It can be mitigated by moving computation out of remote datacenters by rapidly instantiating local services near the user. This requires an embedded cloud platform on which to deploy multiple applications securely and quickly. We present Jitsu, a new Xen toolstack that satisfies the demands of secure multi-tenant isolation on resource-constrained embedded ARM devices. It does this by using unikernels: lightweight, compact, single address space, memory-safe virtual machines (VMs) written in a high-level language. Using fast shared memory channels, Jitsu provides a directory service that launches unikernels in response to network traffic and masks boot latency. Our evaluation shows Jitsu to be a power-efficient and responsive platform for hosting cloud services in the edge network while preserving the strong isolation guarantees of a type-1 hypervisor.The research leading to these results received funding from the European Union’s Seventh Framework Programme FP7/2007–2013 under the Trilogy 2 project (grant agreement no. 317756), and the User Centric Networking project, (grant agreement no. 611001), and the Defense Advanced Research Projects Agency (DARPA) and the Air Force Research Laboratory (AFRL), under contract FA8750-11-C-0249.This is the author accepted manuscript. The final version is available from USENIX via https://www.usenix.org/conference/nsdi15/technical-sessions/presentation/madhavapedd

    Code offloading on real-time multimedia systems: a framework for handling code mobility and code offloading in a QoS aware environment

    Get PDF
    Actualmente, os smartphones e outros dispositivos móveis têm vindo a ser dotados com cada vez maior poder computacional, sendo capazes de executar um vasto conjunto de aplicações desde simples programas de para tirar notas até sofisticados programas de navegação. Porém, mesmo com a evolução do seu hardware, os actuais dispositivos móveis ainda não possuem as mesmas capacidades que os computadores de mesa ou portáteis. Uma possível solução para este problema é distribuir a aplicação, executando partes dela no dispositivo local e o resto em outros dispositivos ligados à rede. Adicionalmente, alguns tipos de aplicações como aplicações multimédia, jogos electrónicos ou aplicações de ambiente imersivos possuem requisitos em termos de Qualidade de Serviço, particularmente de tempo real. Ao longo desta tese é proposto um sistema de execução de código remota para sistemas distribuídos com restrições de tempo-real. A arquitectura proposta adapta-se a sistemas que necessitem de executar periodicamente e em paralelo mesmo conjunto de funções com garantias de tempo real, mesmo desconhecendo os tempos de execução das referidas funções. A plataforma proposta foi desenvolvida para sistemas móveis capazes de executar o Sistema Operativo Android.Smartphones and other mobile devices are becoming more powerful and are capable of executing several applications in a concurrent manner. Although the hardware capabilities of mobile devices are increasing in an unprecedented way, they still do not possess the same features and resources of a common desktop or laptop PC. A potential solution for this limitation might be to distribute an application by running some of its parts locally while running the remaining parts on other devices. Additionally, there are several types of applications in domains such as multimedia, gaming or immersive environments that require soft real-time constraints which have to be guaranteed. In this work we are targeting highly dynamic distributed systems with Quality of Service (QoS) constraints, where the traditional models of computation are not sufficient to handle the users’ or applications’ requests. Therefore, new models of computation are needed to overcome the above limitations in order to satisfy the applications’ or users’ requirements. Code offloading techniques allied with resource management seem very promising as each node may use neighbour nodes to request for help in order to perform demanding computations that cannot be done locally. In this demanding context, a full-fledged framework was developed with the objective of integrating code offloading techniques on top of a middleware framework that provides QoS and real-time guarantees to the applications. This paper describes the implementation of the above-mentioned framework in the Android platform as well as a proof-of-concept application to demonstrate the most important concepts of code offloading, QoS and real-time scheduling

    Data-intensive Systems on Modern Hardware : Leveraging Near-Data Processing to Counter the Growth of Data

    Get PDF
    Over the last decades, a tremendous change toward using information technology in almost every daily routine of our lives can be perceived in our society, entailing an incredible growth of data collected day-by-day on Web, IoT, and AI applications. At the same time, magneto-mechanical HDDs are being replaced by semiconductor storage such as SSDs, equipped with modern Non-Volatile Memories, like Flash, which yield significantly faster access latencies and higher levels of parallelism. Likewise, the execution speed of processing units increased considerably as nowadays server architectures comprise up to multiple hundreds of independently working CPU cores along with a variety of specialized computing co-processors such as GPUs or FPGAs. However, the burden of moving the continuously growing data to the best fitting processing unit is inherently linked to today’s computer architecture that is based on the data-to-code paradigm. In the light of Amdahl's Law, this leads to the conclusion that even with today's powerful processing units, the speedup of systems is limited since the fraction of parallel work is largely I/O-bound. Therefore, throughout this cumulative dissertation, we investigate the paradigm shift toward code-to-data, formally known as Near-Data Processing (NDP), which relieves the contention on the I/O bus by offloading processing to intelligent computational storage devices, where the data is originally located. Firstly, we identified Native Storage Management as the essential foundation for NDP due to its direct control of physical storage management within the database. Upon this, the interface is extended to propagate address mapping information and to invoke NDP functionality on the storage device. As the former can become very large, we introduce Physical Page Pointers as one novel NDP abstraction for self-contained immutable database objects. Secondly, the on-device navigation and interpretation of data are elaborated. Therefore, we introduce cross-layer Parsers and Accessors as another NDP abstraction that can be executed on the heterogeneous processing capabilities of modern computational storage devices. Thereby, the compute placement and resource configuration per NDP request is identified as a major performance criteria. Our experimental evaluation shows an improvement in the execution durations of 1.4x to 2.7x compared to traditional systems. Moreover, we propose a framework for the automatic generation of Parsers and Accessors on FPGAs to ease their application in NDP. Thirdly, we investigate the interplay of NDP and modern workload characteristics like HTAP. Therefore, we present different offloading models and focus on an intervention-free execution. By propagating the Shared State with the latest modifications of the database to the computational storage device, it is able to process data with transactional guarantees. Thus, we achieve to extend the design space of HTAP with NDP by providing a solution that optimizes for performance isolation, data freshness, and the reduction of data transfers. In contrast to traditional systems, we experience no significant drop in performance when an OLAP query is invoked but a steady and 30% faster throughput. Lastly, in-situ result-set management and consumption as well as NDP pipelines are proposed to achieve flexibility in processing data on heterogeneous hardware. As those produce final and intermediary results, we continue investigating their management and identified that an on-device materialization comes at a low cost but enables novel consumption modes and reuse semantics. Thereby, we achieve significant performance improvements of up to 400x by reusing once materialized results multiple times

    Secure migration of WebAssembly-based mobile agents between secure enclaves

    Get PDF
    Cryptography and security protocols are today commonly used to protect data at-rest and in-transit. In contrast, protecting data in-use has seen only limited adoption. Secure data transfer methods employed today rarely provide guarantees regarding the trustworthiness of the software and hardware at the communication endpoints. The field of study that addresses these issues is called Trusted or Confidential Computing and relies on the use of hardware-based techniques. These techniques aim to isolate critical data and its processing from the rest of the system. More specifically, it investigates the use of hardware isolated Secure Execution Environments (SEEs) where applications cannot be tampered with during operation. Over the past few decades, several implementations of SEEs have been introduced, each based on a different hardware architecture. However, lately, the trend is to move towards architecture-independent SEEs. As part of this, Huawei research project is developing a secure enclave framework that enables secure execution and migration of applications (mobile agents), regardless of the underlying architecture. This thesis contributes to the development of the framework by participating in the design and implementation of a secure migration scheme for the mobile agents. The goal is a scheme wherein it is possible to transfer the mobile agent without compromising the security guarantees provided by SEEs. Further, the thesis also provides performance measurements of the migration scheme implemented in a proof of concept of the framework

    Increasing the Performance and Predictability of the Code Execution on an Embedded Java Platform

    Get PDF
    This thesis explores the execution of object-oriented code on an embedded Java platform. It presents established and derives new approaches for the implementation of high-level object-oriented functionality and commonly expected system services. The goal of the developed techniques is the provision of the architectural base for an efficient and predictable code execution. The research vehicle of this thesis is the Java-programmed SHAP platform. It consists of its platform tool chain and the highly-customizable SHAP bytecode processor. SHAP offers a fully operational embedded CLDC environment, in which the proposed techniques have been implemented, verified, and evaluated. Two strands are followed to achieve the goal of this thesis. First of all, the sequential execution of bytecode is optimized through a joint effort of an optimizing offline linker and an on-chip application loader. Additionally, SHAP pioneers a reference coloring mechanism, which enables a constant-time interface method dispatch that need not be backed a large sparse dispatch table. Secondly, this thesis explores the implementation of essential system services within designated concurrent hardware modules. This effort is necessary to decouple the computational progress of the user application from the interference induced by time-sharing software implementations of these services. The concrete contributions comprise a spill-free, on-chip stack; a predictable method cache; and a concurrent garbage collection. Each approached means is described and evaluated after the relevant state of the art has been reviewed. This review is not limited to preceding small embedded approaches but also includes techniques that have proven successful on larger-scale platforms. The other way around, the chances that these platforms may benefit from the techniques developed for SHAP are discussed
    • …
    corecore