3,014 research outputs found

    Extreme clicking for efficient object annotation

    Manually annotating object bounding boxes is central to building computer vision datasets, and it is very time-consuming (annotating ILSVRC [53] took 35s for one high-quality box [62]). It involves clicking on imaginary corners of a tight box around the object. This is difficult because these corners are often outside the actual object, and several adjustments are required to obtain a tight box. We propose extreme clicking instead: we ask the annotator to click on four physical points on the object: the top, bottom, left-most and right-most points. This task is more natural and these points are easy to find. We crowd-source extreme point annotations for PASCAL VOC 2007 and 2012 and show that (1) annotation time is only 7s per box, 5x faster than the traditional way of drawing boxes [62]; (2) the quality of the boxes is as good as the original ground-truth drawn the traditional way; (3) detectors trained on our annotations are as accurate as those trained on the original ground-truth. Moreover, our extreme clicking strategy yields not only box coordinates but also four accurate boundary points. We show (4) how to incorporate them into GrabCut to obtain more accurate segmentations than those delivered when initializing it from bounding boxes; (5) semantic segmentation models trained on these segmentations outperform those trained on segmentations derived from bounding boxes. Comment: ICCV 201
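    The geometric step behind extreme clicking is small enough to sketch: the four clicked points determine the box directly, and box quality is commonly measured by intersection-over-union (IoU) against a reference box. The following is a minimal illustration under stated assumptions, not the authors' code; the names `Box`, `box_from_extreme_points` and `iou` are hypothetical, and coordinates are assumed to be pixel positions with the origin at the top-left.

```python
from typing import NamedTuple, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) in pixels, origin at top-left (assumption)


class Box(NamedTuple):
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def box_from_extreme_points(points: Sequence[Point]) -> Box:
    """Return the tight axis-aligned box spanned by four extreme clicks.

    The click order does not matter: the left-most click fixes x_min,
    the top-most click fixes y_min, and so on.
    """
    if len(points) != 4:
        raise ValueError("expected exactly four extreme points")
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return Box(min(xs), min(ys), max(xs), max(ys))


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a.x_max, b.x_max) - max(a.x_min, b.x_min))
    inter_h = max(0.0, min(a.y_max, b.y_max) - max(a.y_min, b.y_min))
    inter = inter_w * inter_h
    area_a = (a.x_max - a.x_min) * (a.y_max - a.y_min)
    area_b = (b.x_max - b.x_min) * (b.y_max - b.y_min)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Example clicks: top, bottom, left-most and right-most points of an object.
clicks = [(50.0, 10.0), (60.0, 90.0), (20.0, 40.0), (95.0, 55.0)]
box = box_from_extreme_points(clicks)
```

    The clicked points themselves are kept alongside the box, since the abstract uses them as boundary evidence for GrabCut; that step is not sketched here.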

    Human motion segmentation using active shape models

    Human motion analysis from images is closely related to the development of computational techniques capable of automatically identifying, tracking and analyzing relevant structures of the body. This work explores the identification of such structures in images, which is the first step of any computational system designed to analyze human motion. A widely used database (CASIA Gait Database) was used to build a Point Distribution Model (PDM) of the structure of the human body. The training dataset was composed of 14 subjects walking in four directions, and each shape was represented by a set of 113 labelled landmark points. These points were composed of 100 contour points automatically extracted from the silhouette, combined with an additional 13 manually annotated anatomical points from elbows, knees and feet. The PDM was later used in the construction of an Active Shape Model, which combines the shape model with gray level profiles, in order to segment the modelled human body in new images. The experiments with this segmentation technique revealed very encouraging results, as it was able to gather the necessary data of subjects walking in different directions using just one segmentation model.

    Real-Time Markerless Tracking the Human Hands for 3D Interaction

    This thesis presents methods for enabling suitable human-computer interaction using only movements of the bare human hands in free space. This kind of interaction is natural and intuitive, particularly because actions familiar from everyday life can be reflected. Furthermore, the input is contact-free, which is a great advantage, e.g. in medical applications, for reasons of hygiene. To translate hand movements into control signals, an automatic method for tracking the pose and/or posture of the hand is needed. In this context, the simultaneous recognition of both hands is desirable to allow for more natural input. The first contribution of this thesis is a novel video-based method for real-time detection of the positions and orientations of both bare human hands, each in four different predefined postures. Based on such a system, novel interaction interfaces can be developed. However, the design of such interfaces is a non-trivial task. Additionally, the development of novel interaction techniques is often mandatory in order to enable the design of efficient and easily operable interfaces. To this end, several novel interaction techniques are presented and investigated in this thesis, which solve existing problems and substantially improve the applicability of such a new device. These techniques are not restricted to this input instrument and can also be employed to improve the handling of other interaction devices. Finally, several new interaction interfaces are described and analyzed to demonstrate possible applications in specific interaction scenarios.

    The "Federica" hand: a simple, very efficient prothesis

    Hand prostheses partially restore hand appearance and functionality. Not everyone can afford expensive prostheses, and many low-cost prostheses have been proposed. In particular, 3D printers have provided great opportunities by simplifying the manufacturing process and reducing costs. Generally, active prostheses use multiple motors for finger movement and are controlled by electromyographic (EMG) signals. The "Federica" hand is a single-motor prosthesis, equipped with an adaptive grasp and controlled by a force-myographic signal. The "Federica" hand is 3D printed and has an anthropomorphic morphology with five fingers, each consisting of three phalanges. The movement generated by a single servomotor is transmitted to the fingers by inextensible tendons that form a closed chain; in practice, no springs are used for passive hand opening. A differential mechanical system simultaneously distributes the motor force in predefined portions to each finger, regardless of their actual positions. Proportional control of hand closure is achieved by measuring the contraction of residual limb muscles by means of a force sensor, replacing the EMG. The electrical current of the servomotor is monitored to provide the user with sensory feedback of the grip force through a small vibration motor. A simple Arduino board was adopted as the processing unit. The differential mechanism guarantees an efficient transfer of mechanical energy from the motor to the fingers and a secure grasp of any object, regardless of its shape and deformability. The force sensor, being extremely thin, can be easily embedded into the prosthesis socket and positioned on both muscles and tendons; it offers some advantages over the EMG as it does not require any electrical contact or signal processing to extract information about the muscle contraction intensity. The grip speed is high enough to allow the user to grab objects on the fly: from the muscle trigger to complete hand closure, "Federica" takes about half a second. The cost of the device is about 100 US$. Preliminary tests carried out on a patient with transcarpal amputation showed high performance in controlling the prosthesis after a very rapid training session. The "Federica" hand turned out to be a lightweight, low-cost and extremely efficient prosthesis. The project is intended to be open-source: all the information needed to produce the prosthesis (e.g. CAD files, circuit schematics, software) can be downloaded from a public repository, allowing everyone to use the "Federica" hand and customize or improve it.
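    The control scheme described above (hand closure proportional to a force-sensor reading, grip-force feedback derived from servo current) can be illustrated with a short sketch. This is not the project's firmware, which runs on an Arduino board; it is written in Python for illustration only, and every name and numeric range below (`closure_angle`, `vibration_level`, the raw sensor and current limits) is a hypothetical assumption rather than a value from the source.

```python
def _normalize(value: float, low: float, high: float) -> float:
    """Scale value linearly from [low, high] to [0, 1], clamping outside it."""
    if high <= low:
        raise ValueError("high must be greater than low")
    level = (value - low) / (high - low)
    return min(1.0, max(0.0, level))


def closure_angle(force: float, rest_force: float, max_force: float,
                  max_angle: float = 180.0) -> float:
    """Proportional control: servo angle grows linearly with muscle force.

    rest_force is the sensor reading with the muscle relaxed and max_force
    the reading at full contraction; both would come from a per-user
    calibration (assumed here, not described in the abstract).
    """
    return _normalize(force, rest_force, max_force) * max_angle


def vibration_level(current_ma: float, idle_ma: float, stall_ma: float) -> float:
    """Map servo current (a proxy for grip force) to a 0..1 vibration drive."""
    return _normalize(current_ma, idle_ma, stall_ma)


# Hypothetical readings: a half-contracted muscle and a moderately loaded servo.
angle = closure_angle(300.0, rest_force=100.0, max_force=500.0)
buzz = vibration_level(400.0, idle_ma=200.0, stall_ma=600.0)
```

    Clamping both mappings keeps the commands in range when a reading drifts outside the calibrated interval, which is one plausible design choice among several.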

    IRIS Hand: Smart Robotic Prosthesis

    This project involved the design and development of an operational first prototype for the IRIS platform – an anthropomorphic robotic hand capable of autonomously determining the shape of an object and selecting the most appropriate method for grabbing said object. Autonomy of the device is achieved through the use of a unique control system which takes input from sensors embedded in the hand to determine the shape of an object, the position of each finger, grip strength, and the quality of grip. The intended use for this technology is in the medical field as a prosthesis. The advantage of our system as a prosthesis is that its autonomous functions allow the user to access a wide variety of functionality more quickly and easily than similar, commercially available products

    Video segmentation by level set.

    Master's. MASTER OF ENGINEERIN

    Describing Common Human Visual Actions in Images

    Which common human actions and interactions are recognizable in monocular still images? Which involve objects and/or other people? How many actions is a person performing at a time? We address these questions by exploring the actions and interactions that are detectable in the images of the MS COCO dataset. We make two main contributions. First, a list of 140 common 'visual actions', obtained by analyzing the largest on-line verb lexicon currently available for English (VerbNet) and human sentences used to describe images in MS COCO. Second, a complete set of annotations for those 'visual actions', composed of subject-object and associated verb, which we call COCO-a (a for 'actions'). COCO-a is larger than existing action datasets in terms of number of actions and instances of these actions, and is unique because it is data-driven, rather than experimenter-biased. Other unique features are that it is exhaustive, and that all subjects and objects are localized. A statistical analysis of the accuracy of our annotations and of each action, interaction and subject-object combination is provided.

    Feedback-Based Gameplay Metrics and Gameplay Performance Segmentation: An audio-visual approach for assessing player experience.

    Gameplay metrics is an approach that is growing in popularity amongst the game studies research community for its capacity to assess players' engagement with game systems. Yet little has been done, to date, to quantify players' responses to the feedback that games employ to convey information to players, i.e., their audio-visual streams. The present thesis introduces a novel approach to player experience assessment, termed feedback-based gameplay metrics, which seeks to gather gameplay metrics from the audio-visual feedback streams presented to the player during play. So far, gameplay metrics (quantitative data about a game state and the player's interaction with the game system) have been logged directly via the game's source code. The need to utilise source code restricts the range of games that researchers can analyse. By using computer science algorithms for audio-visual processing, which have yet to be employed for processing gameplay footage, the present thesis seeks to extract similar metrics through the audio-visual streams, thus circumventing the need for access to source code, whilst also proposing a method that focuses on describing the way gameplay information is broadcast to the player during play. In order to operationalise feedback-based gameplay metrics, the present thesis introduces the concept of gameplay performance segmentation, which describes how coherent segments of play can be identified and extracted from lengthy gameplay sessions. Moreover, in order to both contextualise the method for processing metrics and provide a conceptual framework for analysing the results of a feedback-based gameplay metric segmentation, a multi-layered architecture based on five gameplay concepts (system, game world instance, spatial-temporal, degree of freedom and interaction) is also introduced. Finally, based on data gathered from gameplay sessions with participants, the present thesis discusses the validity of feedback-based gameplay metrics, gameplay performance segmentation and the multi-layered architecture. A software system has also been specifically developed to produce gameplay summaries based on feedback-based gameplay metrics, and examples of summaries (based on several games) are presented and analysed. The present thesis also demonstrates that feedback-based gameplay metrics can be analysed jointly with other forms of data (such as biometry) in order to build a more complete picture of gameplay experience. Feedback-based gameplay metrics constitute a post-processing approach that allows the researcher or analyst to explore the data however they wish and as many times as they wish. The method is also able to process any audio-visual file, and can therefore process material from a range of audio-visual sources. This novel methodology brings together game studies and computer science, not only by extending the range of games that can now be researched but also by providing a viable solution that accounts for the exact way players experience games.