
    Online annotations tools for micro-level human behavior labeling on videos

    Abstract. Successful machine learning and computer vision approaches generally require significant amounts of annotated data for learning. These methods include identification, retrieval, and classification of events, as well as analysis of human behavior from video. Micro-level human behavior analysis usually requires laborious effort to obtain precise labels. As the quantity of online video grows, crowdsourcing provides a way for workers without a professional background to complete the annotation task. These workers require training to understand implicit knowledge of human behavior. The motivation of this study was to enhance the interaction between annotation workers for training purposes. Observation of experienced local researchers in Oulu showed that the key problem with annotation is the precision of the results. The goal of this study was to provide training tools that help people improve label quality, which illustrates the importance of training. In this study, a new annotation tool was developed to test workers’ performance in reviewing other annotations. The tool filters very noisy input through its comment and vote features. The results indicated that users were more likely to annotate micro-behaviors and their timing when they could refer to other annotators' opinions, and that this was a more effective and reliable way to train. In addition, the study reports the development process with React and Firebase and emphasizes the use of more Web resources and tools for developing annotation tools.

    Augmenting the performance of image similarity search through crowdsourcing

    Crowdsourcing is defined as “outsourcing a task that is traditionally performed by an employee to a large group of people in the form of an open call” (Howe 2006). Many platforms have been designed to perform several types of crowdsourcing, and studies have shown that results produced by crowds on these platforms are generally accurate and reliable. Crowdsourcing can provide a fast and efficient way to use the power of human computation to solve problems that are difficult for machines. Of the several microtasking crowdsourcing platforms available, we chose Amazon Mechanical Turk for our study. We first studied the effect of user interface design and its corresponding cognitive load on the performance of crowd-produced results. Our results highlighted the importance of a well-designed user interface for crowdsourcing performance. With platforms such as Amazon Mechanical Turk, humans can be employed to solve problems that are difficult for computers, such as image similarity search. For such tasks, however, a hybrid human–machine system is more efficient. We therefore studied the effect of involving the crowd on the performance of an image similarity search system and proposed a hybrid human–machine image similarity search system. The proposed system uses machine power to perform heavy computations and to search for similar images within the image dataset, and uses crowdsourcing to refine the results. We designed our content-based image retrieval (CBIR) system using the SIFT, SURF, SURF128 and ORB feature detectors/descriptors and compared the performance of the system with each of them. Our experiment confirmed that crowdsourcing can dramatically improve CBIR system performance.
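
    The hybrid pipeline described in this abstract (machine retrieval followed by crowd refinement) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify the distance measure, the vote threshold, or any function names, and plain numeric vectors stand in for the SIFT/SURF/ORB descriptors.

```python
from collections import Counter


def machine_rank(query_vec, dataset, top_k=5):
    """Machine step: rank images by squared Euclidean distance to the query.

    `dataset` maps an image name to a descriptor vector (hypothetical
    stand-in for real feature descriptors).
    """
    def dist(vec):
        return sum((a - b) ** 2 for a, b in zip(query_vec, vec))

    ranked = sorted(dataset, key=lambda name: dist(dataset[name]))
    return ranked[:top_k]


def crowd_refine(candidates, votes, min_votes=2):
    """Crowd step: keep candidates that enough workers marked as similar.

    `votes` is a flat list of image names, one entry per worker vote.
    Results are ordered by vote count; ties keep the machine order
    because Python's sort is stable.
    """
    counts = Counter(votes)
    kept = [c for c in candidates if counts[c] >= min_votes]
    return sorted(kept, key=lambda c: -counts[c])


# Toy data: four images with 2-D descriptors (purely illustrative).
dataset = {"a": [0, 0], "b": [1, 1], "c": [5, 5], "d": [0.5, 0.5]}
candidates = machine_rank([0, 0], dataset, top_k=3)
refined = crowd_refine(candidates, ["b", "b", "b", "a", "a", "d"])
print(candidates, refined)
```

    In this toy setup the machine narrows the dataset to a short candidate list, and the simulated worker votes then reorder and prune it; a real system would collect those votes through a crowdsourcing task.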

    Enhancing the use of online 3d multimedia content through the analysis of user interactions

    Recent years have seen the development of interactive 3D graphics on the Web. The ability to visualize and manipulate 3D content in real time, naturally and intuitively, appears to be the next evolution of the Web for a wide range of application areas such as e-commerce, education and training, architecture design, virtual museums and virtual communities. Online 3D graphics in these domains are not meant to replace traditional web content such as text, images and videos, but to complement and enrich it. The Web is now a platform where hypertext, hypermedia, and 3D graphics are simultaneously available to users. This use of online 3D graphics, however, raises two main issues. First, 3D interactions are cumbersome because they involve numerous degrees of freedom, so 3D browsing may be slow and inefficient. We tackle this problem by proposing a new paradigm based on crowdsourcing to ease online 3D interactions: we analyze 3D user interactions to identify Regions of Interest (ROIs) and generate recommendations for subsequent users. The recommendations both reduce 3D browsing time and simplify 3D interactions.
Second, 3D graphics carry rich, purely visual information, whereas traditional websites mainly contain descriptive information (text) with hyperlinks as the means of navigation. Viewing and interacting with websites that combine two such different media (hypertext and 3D graphics) may be complicated for users. To address this issue, we propose to use crowdsourcing to build semantic associations between texts and 3D visualizations. The resulting links are suggested to subsequent users so that they can readily locate the viewpoint of a 3D object associated with a given piece of text. We evaluate the proposed methods with experimental user studies. The evaluations show that the recommendations reduce 3D interaction time. Moreover, users appreciate the proposed semantic association: a majority of users reported that the recommendations were helpful, and preferred browsing 3D objects with both mouse interactions and the proposed links over mouse interactions alone.
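
    The idea of turning logged user interactions into ROI recommendations can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how ROIs are identified, so the grid binning, the `recommend_rois` name, and the `cell_size` and `top_n` parameters are all hypothetical.

```python
from collections import Counter


def recommend_rois(viewpoints, cell_size=1.0, top_n=2):
    """Bin logged (x, y, z) look-at points into cubic cells and return
    the centers of the most frequently visited cells.

    Assumption: a frequently viewed cell approximates a Region of Interest.
    """
    counts = Counter(
        tuple(int(c // cell_size) for c in point) for point in viewpoints
    )
    return [
        tuple((i + 0.5) * cell_size for i in cell)
        for cell, _ in counts.most_common(top_n)
    ]


# Toy interaction log from earlier users (purely illustrative).
log = [
    (0.1, 0.2, 0.3), (0.4, 0.4, 0.4), (0.9, 0.1, 0.5),
    (2.5, 2.5, 2.5), (2.1, 2.9, 2.2),
    (5.0, 5.0, 5.0),
]
print(recommend_rois(log))
```

    The returned cell centers could then be offered to later users as suggested viewpoints, which is the role the recommendations play in the abstract.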

    Use Case Oriented Medical Visual Information Retrieval & System Evaluation

    Large amounts of medical visual data are produced daily in hospitals, while new imaging techniques continue to emerge. In addition, many images are continuously made available through publications in the scientific literature and can also be valuable for clinical routine, research and education. Information retrieval systems are useful tools for providing access to the biomedical literature and fulfilling the information needs of medical professionals. The tools developed in this thesis can potentially help clinicians make decisions about difficult diagnoses via a case-based retrieval system built around a use case associated with a specific evaluation task. This system retrieves articles from the biomedical literature when queried with a case description and attached images. This thesis proposes a multimodal approach to medical case-based retrieval, with a focus on integrating visual information connected to text. Furthermore, the ImageCLEFmed evaluation campaign was organised during this thesis to promote the evaluation of medical retrieval systems.