309 research outputs found

    Bridging semantic gap: learning and integrating semantics for content-based retrieval

    Full text link
    Digital cameras have entered ordinary homes and produced an incredibly large number of photos. As a typical example of a broad image domain, unconstrained consumer photos vary significantly. Unlike professional or domain-specific images, the objects in such photos are ill-posed, occluded, and cluttered, with poor lighting, focus, and exposure. Content-based image retrieval research has yet to bridge the semantic gap between computable low-level information and high-level user interpretation. In this thesis, we address the semantic gap with a structured learning framework that allows modular extraction of visual semantics. Semantic image regions (e.g. face, building, sky) are learned statistically, detected directly from images without segmentation, reconciled across multiple scales, and aggregated spatially to form a compact semantic index. To circumvent ambiguity and subjectivity in a query, a new query method that allows spatial arrangement of visual semantics is proposed. A query is represented as a disjunctive normal form of visual query terms and processed using fuzzy set operators. A drawback of supervised learning is the manual labeling of regions as training samples. In this thesis, a new learning framework has been developed to discover local semantic patterns and to generate their training samples with minimal human intervention. The discovered patterns can be visualized and used in semantic indexing. In addition, three new class-based indexing schemes are explored. The winner-take-all scheme supports class-based image retrieval. The class relative scheme and the local classification scheme compute inter-class memberships and local class patterns, respectively, as indexes for similarity matching. A Bayesian formulation is proposed to unify local and global indexes in image comparison and ranking, resulting in image retrieval performance superior to that of the single indexes. Query-by-example experiments on 2400 consumer photos with 16 semantic queries show that the proposed approaches achieve significantly better (18% to 55%) average precision than a high-dimensional feature fusion approach. The thesis has paved two promising research directions, namely the semantics design approach and the semantics discovery approach, which form elegant dual frameworks that exploit pattern classifiers in learning and integrating local and global image semantics.
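    To make the query mechanism described above more concrete, the following is a minimal sketch (in Python, not taken from the thesis) of how a disjunctive-normal-form visual query could be scored with standard fuzzy set operators (AND as min, OR as max), assuming each image is indexed by membership scores of its detected semantic regions; the region names and scores are illustrative.

    # Minimal sketch: scoring a DNF visual query with fuzzy set operators.
    # Assumes each image is indexed by semantic-region membership scores in [0, 1].
    from typing import Dict, List

    def fuzzy_dnf_score(memberships: Dict[str, float], query: List[List[str]]) -> float:
        """query is a list of conjunctive clauses; each clause lists visual query terms."""
        clause_scores = []
        for clause in query:
            # Fuzzy AND over the terms of one conjunctive clause.
            clause_scores.append(min(memberships.get(term, 0.0) for term in clause))
        # Fuzzy OR over all clauses.
        return max(clause_scores) if clause_scores else 0.0

    # Example query: "(face AND building) OR sky"
    query = [["face", "building"], ["sky"]]
    image_index = {"face": 0.8, "building": 0.6, "sky": 0.3}
    print(fuzzy_dnf_score(image_index, query))  # min(0.8, 0.6) = 0.6 vs 0.3 -> 0.6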

    CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap

    Get PDF
    After addressing the state of the art during the first year of Chorus and establishing the existing landscape in multimedia search engines, we identified and analyzed gaps within the European research effort during our second year. In this period we focused on three directions, notably technological issues, user-centred issues and use cases, and socio-economic and legal aspects. These were assessed by two central studies: firstly, a concerted vision of the functional breakdown of a generic multimedia search engine, and secondly, representative use-case descriptions with a related discussion of the requirements for technological challenges. Both studies were carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations at international conferences, and surveys addressed to EU project coordinators as well as coordinators of national initiatives. Based on the feedback obtained, we identified two types of gaps, namely core technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research challenges but have an impact on innovation progress. New socio-economic trends are presented, as well as emerging legal challenges.

    The doctoral research abstracts. Vol:7 2015 / Institute of Graduate Studies, UiTM

    Get PDF
    Foreword: The Seventh Issue of The Doctoral Research Abstracts captures the novelty of 65 doctorates receiving their scrolls at UiTM’s 82nd Convocation, in the fields of Science and Technology, Business and Administration, and Social Science and Humanities. To the recipients I would like to say that you have most certainly done UiTM proud by journeying through the scholastic path with its endless challenges and impediments, and persevering right till the very end. This convocation should not be regarded as the end of your highest scholarly achievement and contribution to the body of knowledge, but rather as the beginning of embarking on high-impact innovative research for the community and country, drawing on the knowledge gained during this academic journey. As alumni of UiTM, we will always hold you dear to our hearts. A new ‘handshake’ is about to take place between you and UiTM as joint collaborators in future research undertakings. I envision a strong research pact between you, as our alumni, and UiTM in breaking the frontiers of knowledge through research. I wish you all the best in your endeavours and offer my congratulations to all the graduands. ‘UiTM sentiasa dihati ku’ (UiTM is always in my heart) / Tan Sri Dato’ Sri Prof Ir Dr Sahol Hamid Abu Bakar, FASc, PEng, Vice Chancellor, Universiti Teknologi MARA

    Prometheus: a generic e-commerce crawler for the study of business markets and other e-commerce problems

    Get PDF
    Master’s dissertation in Computer Science. Continuous social and economic development has led over time to an increase in consumption, as well as greater demand from consumers for better and cheaper products. Hence, the selling price of a product plays a fundamental role in the consumer’s purchase decision. In this context, online stores must carefully analyse and define the best price for each product, based on several factors such as production/acquisition cost, positioning of the product (e.g. anchor product), and the strategies of competing companies. The work done by market analysts has changed drastically over the last years. As the number of Web sites increases exponentially, the number of E-commerce Web sites has also grown. Web page classification is becoming more important in fields like Web mining and information retrieval. Traditional classifiers are usually hand-crafted and non-adaptive, which makes them inappropriate in a broader context. We introduce an ensemble of methods, and a posterior study of its results, to create a more generic and modular crawler and scraper for detection and information extraction on E-commerce Web pages. The collected information may then be processed and used in the pricing decision. This framework goes by the name Prometheus and has the goal of extracting knowledge from E-commerce Web sites. The process requires crawling an online store and gathering product pages. This implies that, given a Web page, the framework must be able to determine whether it is a product page. In order to achieve this, we classify pages into three categories: catalogue, product, and "spam". The page classification stage was addressed based on the HTML text as well as on the visual layout, featuring both traditional methods and Deep Learning approaches. Once a set of product pages has been identified, we proceed to the extraction of the pricing information. This is not a trivial task due to the disparity of approaches used to create a Web page. Furthermore, most product pages are dynamic in the sense that they are truly a page for a family of related products. For instance, when visiting a shoe store, a particular model is probably available in a number of sizes and colours. Such a model may be displayed in a single dynamic Web page, making it necessary for our framework to explore all the relevant combinations. This process is called scraping and is the last stage of the Prometheus framework.
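    As an illustration of the page classification stage described above, here is a minimal sketch assuming a traditional text-only classifier (TF-IDF plus logistic regression) and invented toy training snippets; the actual Prometheus framework also exploits visual layout features and Deep Learning models.

    # Sketch: classify crawled pages as catalogue, product, or "spam" from page text.
    # Toy training data is invented for illustration only.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    train_texts = [
        "add to cart size colour price in stock free shipping",   # product page
        "browse all shoes sort by price page 1 of 20 filters",    # catalogue page
        "login register terms of service cookie policy contact",  # "spam" / other
    ]
    train_labels = ["product", "catalogue", "spam"]

    page_classifier = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                                    LogisticRegression(max_iter=1000))
    page_classifier.fit(train_texts, train_labels)

    # During crawling, each fetched page's visible text would be classified;
    # only pages predicted as "product" are passed on to the price-scraping stage.
    print(page_classifier.predict(["select your size add to basket 49.99 EUR"]))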

    From information to imagination: multivalent logic and system creation in personal knowledge management

    Get PDF
    [Extract] What does personal knowledge management mean for the way that we think, create, write and muse? How does it impact and alter the process of creation? Inversely, how do the media of creation - intuition, pattern recognition, visualization, improvisation, paradoxical thought and synchronicity - shape the way that we manage personal digital libraries? The following explores the role of personal knowledge management in bridging between the shallows of our data streams and the depths of our creative imagination.

    A Survey on Extreme Multi-label Learning

    Full text link
    Multi-label learning has attracted significant attention from both academia and industry in recent decades. Although existing multi-label learning algorithms achieve good performance in various tasks, they implicitly assume that the size of the target label space is not huge, which can be restrictive in real-world scenarios. Moreover, it is infeasible to directly adapt them to extremely large label spaces because of the compute and memory overhead. Therefore, eXtreme Multi-label Learning (XML) is becoming an important task and many effective approaches have been proposed. To fully understand XML, we conduct a survey study in this paper. We first clarify a formal definition of XML from the perspective of supervised learning. Then, based on different model architectures and challenges of the problem, we provide a thorough discussion of the advantages and disadvantages of each category of methods. For the benefit of conducting empirical studies, we collect abundant resources regarding XML, including code implementations and useful tools. Lastly, we propose possible research directions in XML, such as new evaluation metrics, the tail label problem, and weakly supervised XML.
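    As a concrete illustration of how XML methods are commonly evaluated, below is a small sketch of precision@k, one of the standard metrics in this area; the arrays are toy data and the function is not taken from the paper.

    # Sketch: precision@k for extreme multi-label predictions, where only the
    # top-k labels out of a huge label space are scored per sample.
    import numpy as np

    def precision_at_k(scores: np.ndarray, true_labels: np.ndarray, k: int) -> float:
        """scores: (n_samples, n_labels) predicted relevance; true_labels: binary
        matrix of the same shape. Returns the mean fraction of relevant labels
        among each sample's top-k predictions."""
        topk = np.argsort(-scores, axis=1)[:, :k]
        hits = np.take_along_axis(true_labels, topk, axis=1)
        return float(hits.sum(axis=1).mean() / k)

    scores = np.array([[0.9, 0.1, 0.8, 0.3]])
    truth = np.array([[1, 0, 0, 1]])
    print(precision_at_k(scores, truth, k=2))  # top-2 = labels 0 and 2 -> 1 hit -> 0.5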

    Understanding comparative questions and retrieving argumentative answers

    Get PDF
    Making decisions is an integral part of everyday life, yet it can be a difficult and complex process. While people's wants and needs are unlimited, resources are often scarce, making it necessary to research the possible alternatives and weigh the pros and cons before making a decision. Nowadays, the Internet has become the main source of information when it comes to comparing alternatives, making search engines the primary means of collecting new information. However, relying only on term matching is not sufficient to adequately address requests for comparisons. Therefore, search systems should go beyond this approach to effectively address comparative information needs. In this dissertation, I explore from different perspectives how search systems can respond to comparative questions. First, I examine approaches to identifying comparative questions and study their underlying information needs. Second, I investigate a methodology to identify important constituents of comparative questions, such as the to-be-compared options, and to detect the stance of answers towards these comparison options. Then, I address ambiguous comparative search queries by studying an interactive clarification search interface. Finally, to address the answering of comparative questions, I investigate retrieval approaches that consider not only the topical relevance of potential answers but also account for the presence of arguments towards the comparison options mentioned in the questions. By addressing these facets, I aim to provide a comprehensive understanding of how to effectively satisfy the information needs of searchers seeking to compare different alternatives.
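    To illustrate the last point, here is a hedged sketch of argument-aware re-ranking for comparative answers: topical relevance is combined with a crude signal for the presence of both comparison options and of argumentative cue words. The cue list, weights, and scoring function are illustrative stand-ins, not the dissertation's models.

    # Sketch: re-rank candidate answers by topical relevance plus a simple
    # "argumentativeness" signal for the two comparison options.
    from typing import List, Tuple

    ARGUMENT_CUES = {"because", "better", "worse", "however", "therefore", "than"}

    def argument_score(answer: str, options: List[str]) -> float:
        tokens = set(answer.lower().split())
        option_coverage = sum(opt.lower() in tokens for opt in options) / len(options)
        cue_hits = len(tokens & ARGUMENT_CUES)
        return option_coverage + 0.1 * cue_hits

    def rerank(candidates: List[Tuple[str, float]], options: List[str], alpha: float = 0.5):
        """candidates: (answer_text, topical_relevance) pairs, e.g. scaled BM25 scores."""
        return sorted(candidates,
                      key=lambda c: alpha * c[1] + (1 - alpha) * argument_score(c[0], options),
                      reverse=True)

    candidates = [("Python is popular.", 0.9),
                  ("Python is better than Java because of its simpler syntax.", 0.7)]
    print(rerank(candidates, options=["Python", "Java"])[0][0])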

    Unsupervised video indexing on audiovisual characterization of persons

    Get PDF
    This thesis proposes a method for the unsupervised characterization of persons in audiovisual documents, exploiting data related to their physical appearance and their voice. In general, automatic identification methods, whether in video or audio, require a large amount of a priori knowledge about the content. In this work, the goal is to study the two modalities in a correlated way and to exploit their respective properties in a collaborative and robust manner, in order to produce a reliable result that is as independent as possible of any a priori knowledge. More particularly, we studied the characteristics of the audio stream and proposed several methods for speaker segmentation and clustering, which we evaluated in the context of an evaluation campaign. We then carried out an in-depth study of visual descriptors (face, clothing) that allowed us to propose new approaches for the detection, tracking, and clustering of persons. Finally, the work focused on the fusion of audio and video data, proposing an approach based on the computation of a co-occurrence matrix that allowed us to establish an association between the audio index and the video index and to correct them. We can thus produce a dynamic audiovisual model of the speakers.
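    To illustrate the fusion step, the following is a minimal sketch of building a co-occurrence matrix between an audio speaker index and a video person index over a common timeline and deriving an association between the two; the cluster labels are toy data, and the index-correction step mentioned in the abstract is omitted.

    # Sketch: co-occurrence-based association of audio speaker clusters and
    # video person clusters over a shared timeline (one label per time unit).
    import numpy as np

    audio_index = np.array([0, 0, 0, 1, 1, 2, 2, 2])   # speaker cluster per second
    video_index = np.array([0, 0, 1, 1, 1, 2, 2, 0])   # person cluster per second

    n_audio = audio_index.max() + 1
    n_video = video_index.max() + 1
    cooc = np.zeros((n_audio, n_video), dtype=int)
    for a, v in zip(audio_index, video_index):
        cooc[a, v] += 1

    # Each video person is linked to the audio speaker it co-occurs with most.
    association = {v: int(np.argmax(cooc[:, v])) for v in range(n_video)}
    print(cooc)
    print(association)  # here: {0: 0, 1: 1, 2: 2}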