5 research outputs found

    Research on Image Retrieval Optimization Based on Eye Movement Experiment Data

    Satisfying a user's actual underlying needs during image retrieval is a difficult challenge for image retrieval technology. The aim of this study is to improve the performance of a retrieval system and provide users with optimized search results using eye-movement feedback. We analyzed the eye-movement signals of the user's image retrieval process from cognitive and mathematical perspectives. Data collected from 25 designers in eye-tracking experiments were used to train and evaluate the model. In the statistical analysis, eight eye-movement features differed significantly between selected and unselected groups of images (p < 0.05). An optimal selection of input features yielded an overall accuracy of 87.16% for the support vector machine prediction model. Judging the user's requirements in the image retrieval process from eye-movement behavior was shown to be effective.
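
    As a loose illustration of the kind of pipeline this abstract describes, the sketch below selects the most discriminative eye-movement features and trains a support vector machine; the synthetic data, feature count, and scikit-learn setup are assumptions for illustration, not the paper's actual implementation.

        # Hypothetical sketch: predict selected vs. unselected images
        # from eye-movement features with an SVM. The synthetic data
        # and the choice of k=8 features are illustrative assumptions.
        import numpy as np
        from sklearn.feature_selection import SelectKBest, f_classif
        from sklearn.model_selection import cross_val_score
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import SVC

        rng = np.random.default_rng(0)
        X = rng.normal(size=(250, 12))    # 12 candidate eye-movement features
        y = rng.integers(0, 2, size=250)  # 1 = selected image, 0 = unselected

        # Keep the features that best separate the two groups, then classify.
        model = make_pipeline(StandardScaler(),
                              SelectKBest(f_classif, k=8),
                              SVC(kernel="rbf"))
        print(cross_val_score(model, X, y, cv=5).mean())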

    Relevance Prediction from Eye-movements Using Semi-interpretable Convolutional Neural Networks

    We propose an image-classification method to predict the perceived relevance of text documents from eye movements. An eye-tracking study was conducted in which participants read short news articles and rated them as relevant or irrelevant for answering a trigger question. We encode participants' eye-movement scanpaths as images and then train a convolutional neural network classifier on these scanpath images. The trained classifier is used to predict participants' perceived relevance of news articles from the corresponding scanpath images. This method is content-independent, as the classifier requires no knowledge of the screen content or the user's information task. Even with little data, the image classifier can predict perceived relevance with up to 80% accuracy. Compared with similar eye-tracking studies from the literature, this scanpath image classification method outperforms previously reported metrics by appreciable margins. We also attempt to interpret how the image classifier differentiates between scanpaths on relevant and irrelevant documents.
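
    A minimal sketch of the general approach, assuming PyTorch: rasterize a fixation scanpath into a small grayscale image and feed it to a toy CNN with two output logits (relevant vs. irrelevant). The encoding and architecture here are assumptions; the paper's actual design may differ.

        # Hedged sketch: scanpath-to-image encoding plus a small CNN.
        import torch
        import torch.nn as nn

        def scanpath_to_image(fixations, size=64):
            """fixations: (x, y) pairs with coordinates in [0, 1)."""
            img = torch.zeros(1, size, size)
            for x, y in fixations:
                img[0, int(y * size), int(x * size)] += 1.0
            return img / img.max().clamp(min=1.0)  # normalize to [0, 1]

        cnn = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(16 * 16 * 16, 2),  # logits: relevant vs. irrelevant
        )

        x = scanpath_to_image([(0.1, 0.2), (0.5, 0.5), (0.8, 0.3)])
        print(cnn(x.unsqueeze(0)).shape)  # torch.Size([1, 2])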

    Implicit image annotation by using gaze analysis

    Thanks to advances in technology, people are storing massive amounts of visual information in online databases. Today it is normal for a person to take a photo of an event with their smartphone and effortlessly upload it to a host domain. For later quick access, this enormous amount of data needs to be indexed by providing metadata for its content. The challenge is to provide suitable captions for the semantics of the visual content. This thesis investigates the possibility of extracting and using the valuable information contained in human eye movements during interaction with digital visual content in order to provide information for image annotation implicitly. A non-intrusive framework is developed that is capable of inferring, from gaze movements, which of two classes each image visited by a user belongs to while the user is searching the images for a Target Concept (TC): the TC+ class, formed of the images that contain the TC, and the TC- class, formed of the images that do not. By analysing eye movements only, the developed framework was able to identify over 65% of the images the subject users were searching for, with an accuracy of over 75%. This thesis shows that the information present in gaze patterns can be employed to improve the machine's judgement of image content by assessing human attention to objects inside virtual environments. This work was supported by the European Commission funded Network of Excellence PetaMedia.
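
    As a much simplified illustration only (the thesis's framework is more involved), implicit annotation could be approximated with a dwell-time heuristic: accumulate fixation time per image and label images that attract enough attention as TC+. The threshold and input format below are assumptions.

        # Illustrative dwell-time heuristic for TC+/TC- labelling.
        from collections import defaultdict

        def annotate(fixations, threshold_ms=800):
            """fixations: iterable of (image_id, duration_ms) pairs."""
            dwell = defaultdict(float)
            for image_id, duration_ms in fixations:
                dwell[image_id] += duration_ms
            return {img: ("TC+" if t >= threshold_ms else "TC-")
                    for img, t in dwell.items()}

        print(annotate([("img1", 500), ("img1", 400), ("img2", 200)]))
        # {'img1': 'TC+', 'img2': 'TC-'}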

    Interactive video retrieval using implicit user feedback.

    In recent years, the rapid development of digital technologies and the low cost of recording media have led to a great increase in the availability of multimedia content worldwide. This availability creates demand for advanced search engines. Traditionally, manual annotation of video was one of the usual practices to support retrieval; however, the vast amount of multimedia content makes such practices very expensive in terms of human effort. At the same time, the availability of low-cost wearable sensors delivers a plethora of user-machine interaction data. There is therefore an important challenge in exploiting implicit user feedback (such as user navigation patterns and eye movements) during interactive multimedia retrieval sessions with a view to improving video search engines. In this thesis, we focus on automatically annotating video content by exploiting aggregated implicit feedback of past users, expressed as click-through data and gaze movements. Towards this goal, we conducted interactive video retrieval experiments in order to collect click-through and eye-movement data in not strictly controlled environments. First, we generate semantic relations between multimedia items by proposing a graph representation of aggregated past interaction data, and exploit them to generate recommendations as well as to improve content-based search. Then, we investigate the role of user gaze movements in interactive video retrieval and propose a methodology for inferring user interest by employing support vector machines and gaze-movement-based features. Finally, we propose an automatic video annotation framework which combines query clustering into topics, by constructing gaze-movement-driven random forests and temporally enhanced dominant sets, with video shot classification for predicting the relevance of viewed items with respect to a topic. The results show that exploiting heterogeneous implicit feedback from past users adds value for future users of interactive video retrieval systems.
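
    One of the ideas above, aggregating past users' click-through data into a graph and exploiting it for recommendations, might look roughly like the following sketch; the co-click weighting and data format are assumptions, not the thesis's actual representation.

        # Rough sketch: build a co-click graph from past sessions and
        # recommend the most strongly co-clicked neighbours of an item.
        from collections import defaultdict
        from itertools import combinations

        def build_graph(sessions):
            """sessions: lists of video ids clicked within one session each."""
            graph = defaultdict(lambda: defaultdict(int))
            for session in sessions:
                for a, b in combinations(set(session), 2):
                    graph[a][b] += 1  # edge weight = number of co-clicks
                    graph[b][a] += 1
            return graph

        def recommend(graph, video, k=3):
            neighbours = graph.get(video, {})
            return sorted(neighbours, key=neighbours.get, reverse=True)[:k]

        g = build_graph([["v1", "v2", "v3"], ["v2", "v3"], ["v1", "v3"]])
        print(recommend(g, "v3"))  # videos most often co-clicked with v3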

    Multimodale Interaktion in Multi-Display-Umgebungen (Multimodal Interaction in Multi-Display Environments)

    Interactive environments are moving more and more away from single-user workstations and towards multi-display, multi-user environments. These place new demands on input devices and interaction techniques. This thesis develops and investigates new approaches to interaction based on hand gestures and gaze as novel input modalities.
