3 research outputs found

    High-throughput visual knowledge analysis and retrieval in big data ecosystems

    Visual knowledge plays an important role in many highly skilled applications, such as medical diagnosis, geospatial image analysis, and pathology. Medical practitioners interpret and reason about diagnostic images based not only on primitive-level image features such as color, texture, and spatial distribution, but also on their experience and tacit knowledge, which are seldom articulated explicitly. This reasoning process is dynamic and closely related to real-time human cognition. Due to a lack of visual knowledge management and sharing tools, it is difficult to capture and transfer such tacit, hard-won expertise to novices. Moreover, many mission-critical applications require the ability to process such tacit visual knowledge in real time. Precisely how to index this visual knowledge computationally and systematically still poses a challenge to the computing community. My dissertation research produces novel computational approaches for high-throughput visual knowledge analysis and retrieval from large-scale databases using the latest technologies in big data ecosystems. To provide a better understanding of visual reasoning, human gaze patterns are measured spatially and temporally to model observers' cognitive processes. These gaze patterns are then indexed in a NoSQL distributed database serving as a visual knowledge repository, which is accessed through several retrieval methods developed in this dissertation. To provide meaningful retrievals in real time, deep-learning methods for automatic annotation of visual activities and streaming similarity comparisons are developed under a gaze-streaming framework using Apache Spark. This research has several potential applications with broad impact in the scientific community and in practice. First, the proposed framework can be adapted to other domains, such as fine arts and the life sciences, with minimal effort to capture human reasoning processes. Second, with its real-time visual knowledge search function, the framework can be used to train novices in the interpretation of domain images by helping them learn experts' reasoning processes. Third, by helping researchers understand human visual reasoning, it may shed light on human semantics modeling. Finally, by integrating the reasoning process with multimedia data, future media retrieval could embed human perceptual reasoning in database search, going beyond traditional content-based retrieval.
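The retrieval step described above — matching a new gaze sequence against stored expert gaze patterns — can be sketched in simplified form. The following is a minimal illustration, not the dissertation's actual implementation: the data layout (fixation tuples), the similarity measure, and all names are assumptions, and plain Python stands in for the Spark/NoSQL stack.

```python
import math

# A gaze pattern is modeled here as a sequence of fixations (x, y, duration_ms).
# This schema and the similarity measure are illustrative assumptions.

def fixation_distance(f1, f2):
    """Euclidean distance between two fixation points (ignoring duration)."""
    return math.hypot(f1[0] - f2[0], f1[1] - f2[1])

def pattern_similarity(query, stored):
    """Average distance from each query fixation to its nearest stored
    fixation; lower means more similar."""
    total = sum(min(fixation_distance(q, s) for s in stored) for q in query)
    return total / len(query)

def retrieve_best_match(query, repository):
    """Return the key of the stored gaze pattern closest to the query."""
    return min(repository, key=lambda k: pattern_similarity(query, repository[k]))

# Hypothetical repository of experts' gaze patterns over a diagnostic image.
repository = {
    "expert_A": [(10, 10, 200), (50, 55, 340), (90, 20, 180)],
    "expert_B": [(300, 310, 150), (320, 330, 400)],
}
query = [(12, 11, 210), (52, 50, 300)]
print(retrieve_best_match(query, repository))  # expert_A
```

In the streaming setting the dissertation describes, each incoming gaze window would be compared against the repository continuously; here a single query stands in for one such window.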

    Sistema de recomendação de imagens baseado em atenção visual

    Nowadays, the number of users shopping on e-commerce sites is increasing greatly, mainly due to the ease and speed of this way of consuming. Unlike physical stores, many e-commerce sites can offer their users a wide range of products and services, and users can find it very difficult to locate products of their preference. Typically, a user's preference for a product is influenced by the visual appearance of the product image. In this context, Image Recommendation Systems (IRS) have become indispensable for helping users find products that may be pleasing or useful to them. Generally, IRS use the past behavior of users (clicks, purchases, reviews, ratings, etc.) and/or attributes of the products to define user preferences. One of the main challenges faced by IRS is the need for the user to provide some information about his or her product preferences in order to obtain further recommendations from the system. Unfortunately, users are not always willing to provide such information explicitly. To cope with this challenge, methods for obtaining users' implicit feedback are desirable. In this work, we investigate to what extent information about user visual attention can help improve rating prediction and hence produce more accurate IRS. Two new methods are developed: one based on Collaborative Filtering (CF), which combines ratings and visual attention data to represent the past behavior of users, and another based on the content of the items, which combines textual attributes, visual features, and visual attention data to compose item profiles. The proposed methods were evaluated on a painting dataset and a clothing dataset. The experimental results show significant improvements in rating prediction and recommendation precision compared to the state of the art.
    It is worth mentioning that the proposed techniques are flexible and can be applied in other scenarios that exploit the visual attention of the recommended items. (Doctoral thesis, funded by the Conselho Nacional de Desenvolvimento Científico e Tecnológico.)
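The CF-based method described above combines ratings with visual attention data to model past user behavior. A toy sketch of that idea follows; the blending scheme (a weighted mix of rating-based and attention-based user similarity), the data layout, and all names are illustrative assumptions, not the thesis's actual method.

```python
import math

# User-based CF where user similarity blends rating overlap with
# visual-attention overlap. All weights and data are hypothetical.

def cosine(u, v):
    """Cosine similarity over the keys the two dicts share."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[k] * v[k] for k in shared)
    nu = math.sqrt(sum(u[k] ** 2 for k in shared))
    nv = math.sqrt(sum(v[k] ** 2 for k in shared))
    return dot / (nu * nv)

def predict_rating(target, item, ratings, attention, alpha=0.5):
    """Predict `target`'s rating of `item` as a similarity-weighted average
    over other users; similarity mixes rating-based and attention-based
    cosine with weight `alpha`."""
    num = den = 0.0
    for user in ratings:
        if user == target or item not in ratings[user]:
            continue
        sim = (alpha * cosine(ratings[target], ratings[user])
               + (1 - alpha) * cosine(attention[target], attention[user]))
        num += sim * ratings[user][item]
        den += abs(sim)
    return num / den if den else 0.0

ratings = {
    "u1": {"painting_1": 5, "painting_2": 4},
    "u2": {"painting_1": 5, "painting_2": 4, "painting_3": 5},
    "u3": {"painting_1": 1, "painting_3": 2},
}
# Per-user attention: fraction of viewing time spent on each item's image.
attention = {
    "u1": {"painting_1": 0.7, "painting_2": 0.3},
    "u2": {"painting_1": 0.6, "painting_2": 0.4},
    "u3": {"painting_1": 0.1, "painting_3": 0.9},
}
print(round(predict_rating("u1", "painting_3", ratings, attention), 2))
```

The attention term lets two users count as similar because they looked at items in similar ways, even before they rate many items in common — the implicit-feedback signal the abstract motivates.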

    Interactive video retrieval using implicit user feedback.

    PhD thesis. In recent years, the rapid development of digital technologies and the low cost of recording media have led to a great increase in the availability of multimedia content worldwide. This availability creates demand for advanced search engines. Traditionally, manual annotation of video was one of the usual practices supporting retrieval; however, the vast amount of multimedia content makes such practices very expensive in terms of human effort. At the same time, the availability of low-cost wearable sensors delivers a wealth of user-machine interaction data. There is therefore an important opportunity to exploit implicit user feedback (such as navigation patterns and eye movements) gathered during interactive multimedia retrieval sessions to improve video search engines. In this thesis, we focus on automatically annotating video content by exploiting aggregated implicit feedback of past users, expressed as click-through data and gaze movements. Towards this goal, we conducted interactive video retrieval experiments to collect click-through and eye movement data in environments that were not strictly controlled. First, we generate semantic relations between multimedia items by proposing a graph representation of aggregated past interaction data, and exploit these relations to generate recommendations as well as to improve content-based search. Then, we investigate the role of user gaze movements in interactive video retrieval and propose a methodology for inferring user interest by employing support vector machines and gaze-movement-based features. Finally, we propose an automatic video annotation framework, which combines query clustering into topics, by constructing gaze-movement-driven random forests and temporally enhanced dominant sets, with video shot classification for predicting the relevance of viewed items with respect to a topic. The results show that exploiting heterogeneous implicit feedback from past users adds value for future users of interactive video retrieval systems.
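The first contribution above — a graph over aggregated past interaction data used to generate recommendations — can be illustrated with a toy co-click graph: videos clicked in the same search session are linked, and edge weights count how many sessions share them. The session data, function names, and the co-click weighting are assumptions for illustration, not the thesis's actual graph construction.

```python
from collections import defaultdict
from itertools import combinations

def build_coclick_graph(sessions):
    """Link every pair of videos clicked in the same session;
    edge weight = number of sessions sharing the pair."""
    graph = defaultdict(lambda: defaultdict(int))
    for clicked in sessions:
        for a, b in combinations(sorted(set(clicked)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph

def recommend(graph, video, k=2):
    """Top-k videos most often co-clicked with `video`."""
    neighbours = graph.get(video, {})
    return sorted(neighbours, key=neighbours.get, reverse=True)[:k]

# Hypothetical click-through logs: each list is one user's session.
sessions = [
    ["v1", "v2", "v3"],
    ["v1", "v2"],
    ["v2", "v4"],
]
print(recommend(build_coclick_graph(sessions), "v1"))  # ['v2', 'v3']
```

Aggregating over many past sessions is what makes this implicit signal useful: no single user states a preference, yet the accumulated edge weights encode semantic relations between items.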