    FPGA Implementation of Convolutional Neural Networks with Fixed-Point Calculations

    Neural network-based methods for image processing are becoming widely used in practical applications. Modern neural networks are computationally expensive and require specialized hardware, such as graphics processing units. Since such hardware is not always available in real-life applications, there is a compelling need for the design of neural networks for mobile devices. Mobile neural networks typically have a reduced number of parameters and require a relatively small number of arithmetic operations. However, they are usually still executed at the software level and use floating-point calculations. The use of mobile networks without further optimization may not provide sufficient performance when high processing speed is required, for example, in real-time video processing (30 frames per second). In this study, we suggest optimizations to speed up computations in order to efficiently use already trained neural networks on a mobile device. Specifically, we propose an approach for speeding up neural networks by moving computation from software to hardware and by using fixed-point calculations instead of floating-point. We propose a number of methods for neural network architecture design to improve the performance with fixed-point calculations. We also show an example of how existing datasets can be modified and adapted for the recognition task at hand. Finally, we present the design and implementation of a field-programmable gate array (FPGA)-based device to solve the practical problem of real-time handwritten digit classification from a mobile camera video feed.
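
The fixed-point idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the 8-bit fractional format and the helper names `to_fixed`, `fixed_dot`, and `to_float` are assumptions chosen for illustration. It shows a dot product (the core operation of a neural network layer) computed with integer multiply-accumulate only, which is the kind of arithmetic that maps cheaply onto hardware.

```python
FRAC_BITS = 8  # assumed format: 8 fractional bits (scale factor 256)


def to_fixed(x: float, frac_bits: int = FRAC_BITS) -> int:
    """Quantize a float to a signed fixed-point integer."""
    return int(round(x * (1 << frac_bits)))


def to_float(q: int, frac_bits: int = FRAC_BITS) -> float:
    """Convert a fixed-point integer back to a float."""
    return q / (1 << frac_bits)


def fixed_dot(weights, inputs, frac_bits: int = FRAC_BITS) -> int:
    """Dot product using only integer operations.

    Each product carries 2 * frac_bits fractional bits, so the
    accumulator is shifted right once at the end to return to the
    frac_bits format. Python's >> on negative integers rounds
    toward negative infinity, like an arithmetic shift.
    """
    acc = 0
    for w, x in zip(weights, inputs):
        acc += w * x
    return acc >> frac_bits


# Hypothetical weights and inputs, quantized before the computation.
qw = [to_fixed(w) for w in [0.5, -0.25, 0.125]]
qx = [to_fixed(x) for x in [1.0, 2.0, 4.0]]
result = to_float(fixed_dot(qw, qx))
```

With these values every quantity is exactly representable, so `result` matches the floating-point answer; in general, fewer fractional bits trade accuracy for cheaper hardware.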

    Compact gml: merging mobile computing and mobile cartography

    The use of portable devices is moving from "Wireless Applications", typically implemented as browsing-on-the-road, to "Mobile Computing", which aims to exploit the increasing processing power of consumer devices. As users get connected with smartphones and PDAs, they look for geographic information and location-aware services. While browser-based approaches have been explored (using static images or graphics formats such as Mobile SVG), a data model tailored for local computation on mobile devices is still missing. This paper presents the Compact Geographic Markup Language (cGML), which enables the design and development of specific-purpose GIS applications for portable consumer devices, where a cGML document can also be used as a spatial query result.

    EgoFace: Egocentric Face Performance Capture and Videorealistic Reenactment

    Face performance capture and reenactment techniques use multiple cameras and sensors, positioned at a distance from the face or mounted on heavy wearable devices. This limits their applications in mobile and outdoor environments. We present EgoFace, a radically new lightweight setup for face performance capture and front-view videorealistic reenactment using a single egocentric RGB camera. Our lightweight setup allows operations in uncontrolled environments, and lends itself to telepresence applications such as video-conferencing from dynamic environments. The input image is projected into a low-dimensional latent space of the facial expression parameters. Through careful adversarial training of the parameter-space synthetic rendering, a videorealistic animation is produced. Our problem is challenging as the human visual system is sensitive to the smallest face irregularities that could occur in the final results. This sensitivity is even stronger for video results. Our solution is trained in a pre-processing stage, in a supervised manner without manual annotations. EgoFace captures a wide variety of facial expressions, including mouth movements and asymmetrical expressions. It works under varying illumination, backgrounds, and movements, handles people of different ethnicities, and can operate in real time.

    Semi-automated creation of converged iTV services: From macromedia director simulations to services ready for broadcast

    While sound and video may capture viewers’ attention, interaction can captivate them. This has not been available prior to the advent of Digital Television. In fact, what lies at the heart of the Digital Television revolution is this new type of interactive content, offered in the form of interactive Television (iTV) services. On top of that, the new world of converged networks has created a demand for a new type of converged services on a range of mobile terminals (Tablet PCs, PDAs and mobile phones). This paper aims at presenting a new approach to service creation that allows for the semi-automatic translation of simulations and rapid prototypes created in the accessible desktop multimedia authoring package Macromedia Director into services ready for broadcast. This is achieved by a series of tools that de-skill and speed up the process of creating digital TV user interfaces (UI) and applications for mobile terminals. The benefits of rapid prototyping are essential for the production of these new types of services, and are therefore discussed in the first section of this paper. In the following sections, an overview of the operation of content, service, creation and management sub-systems is presented, which illustrates why these tools compose an important and integral part of a system responsible for creating, delivering and managing converged broadcast and telecommunications services. The next section examines a number of candidate metadata languages for describing the iTV service user interface and the schema language adopted in this project. A detailed description of the operation of the two tools is provided to offer insight into how they can be used to de-skill and speed up the process of creating digital TV user interfaces and applications for mobile terminals. Finally, representative broadcast-oriented and telecommunication-oriented converged service components are also introduced, demonstrating how these tools have been used to generate different types of services.

    Gesture-Based Input for Drawing Schematics on a Mobile Device

    We present a system for drawing metro-map-style schematics using a gesture-based interface. This work brings together techniques in gesture recognition on touch-sensitive devices with research in schematic layout of networks. The software allows users to create and edit schematic networks, and provides an automated layout method for improving the appearance of the schematic. A case study using the metro map metaphor to visualize social networks and web site structure is described.

    Using Sound to Enhance Users’ Experiences of Mobile Applications

    The latest smartphones with GPS, electronic compass, directional audio, touch screens etc. hold potential for location-based services that are easier to use than traditional tools. Rather than interpreting maps, users may focus on their activities and the environment around them. Interfaces may be designed that let users search for information by simply pointing in a direction. Database queries can be created from GPS location and compass direction data. Users can get guidance to locations through pointing gestures, spatial sound and simple graphics. This article describes two studies testing prototype applications with multimodal user interfaces built on spatial audio, graphics and text. Tests show that users appreciated the applications for their ease of use, for being fun and effective to use and for allowing users to interact directly with the environment rather than with abstractions of the same. The multimodal user interfaces contributed significantly to the overall user experience.

    Reducing power consumption of mobile thin client devices
