4 research outputs found

    Gesture based interface for image annotation

    Dissertation submitted for the degree of Master in Informatics Engineering (Engenharia Informática) at the Universidade Nova de Lisboa, Faculdade de Ciências e Tecnologia. Given the complexity of visual information, multimedia content search presents more problems than textual search. This complexity stems from the difficulty of automatically tagging images and video with a set of keywords that describe the content. Generally, this annotation is performed manually (e.g., Google Image) and the search is based on pre-defined keywords. However, this task is time-consuming and can be dull. The objective of this dissertation project is to define and implement a game for annotating personal digital photos with a semi-automatic system. The game engine tags images automatically, and the player's role is to contribute correct annotations. The application is composed of the following main modules: a module for automatic image annotation, a module that manages the game's graphical interface (showing images and tags), a module for the game engine, and a module for human interaction. Interaction takes place through a pre-defined set of gestures captured with a web camera. These gestures are detected using computer vision techniques and interpreted as user actions. The dissertation also presents a detailed analysis of this application, its computational modules and design, as well as a series of usability tests.

    A game-based approach towards human augmented image annotation

    PhD. Image annotation is difficult to achieve in an automated way. This thesis discusses a human-augmented approach to the problem and derives suitable strategies for solving it. The proposed technique is inspired by human-based computation, in what is called “human-augmented” processing, to overcome the limitations of fully automated technology in closing the semantic gap. The approach aims to exploit what millions of individual gamers are keen to do, i.e. enjoy computer games, while annotating media. The image annotation problem is tackled with a game-based framework that combines image processing and a game-theoretic model to gather media annotations. Although the proposed model behaves similarly to a single-player game model, the underlying approach is designed around a two-player model, which exploits both the player’s contribution to the game and previously recorded players to improve annotation accuracy. In addition, the proposed framework is designed to predict the player’s intention through Markovian and Sequential Sampling inferences in order to detect cheating and improve annotation performance. Finally, the proposed techniques are comprehensively evaluated on three different image datasets, and selected representative results are reported.

    Implicit image annotation by using gaze analysis

    PhD. Thanks to advances in technology, people are storing a massive amount of visual information in online databases. Today it is normal for a person to take a photo of an event with their smartphone and effortlessly upload it to a host domain. For quick access later, this enormous amount of data needs to be indexed by providing metadata for its content. The challenge is to provide suitable captions for the semantics of the visual content. This thesis investigates the possibility of extracting and using the valuable information contained in human eye movements during interaction with digital visual content, in order to provide information for image annotation implicitly. A non-intrusive framework is developed that infers gaze movements to classify the images visited by a user into two classes while the user is searching for a Target Concept (TC) in the images. The first class, called TC+, consists of the images that contain the TC; the second class, called TC-, consists of the images that do not contain the TC. By analysing eye movements only, the developed framework was able to identify over 65% of the images that the subject users were searching for, with an accuracy of over 75%. This thesis shows that the information present in gaze patterns can be employed to improve the machine’s judgement of image content by assessing human attention to the objects inside virtual environments. European Commission funded Network of Excellence PetaMedi