332 research outputs found

    Attribute-based Image Retrieval: Towards Bridging the Semantic and Intention Gaps

    Get PDF
    Ph.DDOCTOR OF PHILOSOPH

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Get PDF
    Based on the information provided by European projects and national initiatives related to multimedia search as well as domains experts that participated in the CHORUS Think-thanks and workshops, this document reports on the state of the art related to multimedia content search from, a technical, and socio-economic perspective. The technical perspective includes an up to date view on content based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark inititiatives to measure the performance of multimedia search engines. From a socio-economic perspective we inventorize the impact and legal consequences of these technical advances and point out future directions of research

    Computing point-of-view : modeling and simulating judgments of taste

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2006.Includes bibliographical references (p. 153-163).People have rich points-of-view that afford them the ability to judge the aesthetics of people, things, and everyday happenstance; yet viewpoint has an ineffable quality that is hard to articulate in words, let alone capture in computer models. Inspired by cultural theories of taste and identity, this thesis explores end-to-end computational modeling of people's tastes-from model acquisition, to generalization, to application- under various realms. Five aesthetical realms are considered-cultural taste, attitudes, ways of perceiving, taste for food, and sense-of-humor. A person's model is acquired by reading her personal texts, such as a weblog diary, a social network profile, or emails. To generalize a person model, methods such as spreading activation, analogy, and imprimer supplementation are applied to semantic resources and search spaces mined from cultural corpora. Once a generalized model is achieved, a person's tastes are brought to life through perspective-based applications, which afford the exploration of someone else's perspective through interactivity and play. The thesis describes model acquisition systems implemented for each of the five aesthetical realms.(cont.) The techniques of 'reading for affective themes' (RATE), and 'culture mining' are described, along with their enabling technologies, which are commonsense reasoning and textual affect analysis. Finally, six perspective-based applications were implemented to illuminate a range of real-world beneficiaries to person modeling-virtual mentoring, self-reflection, and deep customization.by Xinyu Hugo Liu.Ph.D

    PERICLES Deliverable 4.3:Content Semantics and Use Context Analysis Techniques

    Get PDF
    The current deliverable summarises the work conducted within task T4.3 of WP4, focusing on the extraction and the subsequent analysis of semantic information from digital content, which is imperative for its preservability. More specifically, the deliverable defines content semantic information from a visual and textual perspective, explains how this information can be exploited in long-term digital preservation and proposes novel approaches for extracting this information in a scalable manner. Additionally, the deliverable discusses novel techniques for retrieving and analysing the context of use of digital objects. Although this topic has not been extensively studied by existing literature, we believe use context is vital in augmenting the semantic information and maintaining the usability and preservability of the digital objects, as well as their ability to be accurately interpreted as initially intended.PERICLE

    Artificial Intelligence Research Branch future plans

    Get PDF
    This report contains information on the activities of the Artificial Intelligence Research Branch (FIA) at NASA Ames Research Center (ARC) in 1992, as well as planned work in 1993. These activities span a range from basic scientific research through engineering development to fielded NASA applications, particularly those applications that are enabled by basic research carried out in FIA. Work is conducted in-house and through collaborative partners in academia and industry. All of our work has research themes with a dual commitment to technical excellence and applicability to NASA short, medium, and long-term problems. FIA acts as the Agency's lead organization for research aspects of artificial intelligence, working closely with a second research laboratory at the Jet Propulsion Laboratory (JPL) and AI applications groups throughout all NASA centers. This report is organized along three major research themes: (1) Planning and Scheduling: deciding on a sequence of actions to achieve a set of complex goals and determining when to execute those actions and how to allocate resources to carry them out; (2) Machine Learning: techniques for forming theories about natural and man-made phenomena; and for improving the problem-solving performance of computational systems over time; and (3) Research on the acquisition, representation, and utilization of knowledge in support of diagnosis design of engineered systems and analysis of actual systems

    Interactive content-based image retrieval using relevance feedback

    Full text link

    Feature based dynamic intra-video indexing

    Get PDF
    A thesis submitted in partial fulfillment for the degree of Doctor of PhilosophyWith the advent of digital imagery and its wide spread application in all vistas of life, it has become an important component in the world of communication. Video content ranging from broadcast news, sports, personal videos, surveillance, movies and entertainment and similar domains is increasing exponentially in quantity and it is becoming a challenge to retrieve content of interest from the corpora. This has led to an increased interest amongst the researchers to investigate concepts of video structure analysis, feature extraction, content annotation, tagging, video indexing, querying and retrieval to fulfil the requirements. However, most of the previous work is confined within specific domain and constrained by the quality, processing and storage capabilities. This thesis presents a novel framework agglomerating the established approaches from feature extraction to browsing in one system of content based video retrieval. The proposed framework significantly fills the gap identified while satisfying the imposed constraints of processing, storage, quality and retrieval times. The output entails a framework, methodology and prototype application to allow the user to efficiently and effectively retrieved content of interest such as age, gender and activity by specifying the relevant query. Experiments have shown plausible results with an average precision and recall of 0.91 and 0.92 respectively for face detection using Haar wavelets based approach. Precision of age ranges from 0.82 to 0.91 and recall from 0.78 to 0.84. The recognition of gender gives better precision with males (0.89) compared to females while recall gives a higher value with females (0.92). Activity of the subject has been detected using Hough transform and classified using Hiddell Markov Model. A comprehensive dataset to support similar studies has also been developed as part of the research process. A Graphical User Interface (GUI) providing a friendly and intuitive interface has been integrated into the developed system to facilitate the retrieval process. The comparison results of the intraclass correlation coefficient (ICC) shows that the performance of the system closely resembles with that of the human annotator. The performance has been optimised for time and error rate

    A picture is worth a thousand words : content-based image retrieval techniques

    Get PDF
    In my dissertation I investigate techniques for improving the state of the art in content-based image retrieval. To place my work into context, I highlight the current trends and challenges in my field by analyzing over 200 recent articles. Next, I propose a novel paradigm called __artificial imagination__, which gives the retrieval system the power to imagine and think along with the user in terms of what she is looking for. I then introduce a new user interface for visualizing and exploring image collections, empowering the user to navigate large collections based on her own needs and preferences, while simultaneously providing her with an accurate sense of what the database has to offer. In the later chapters I present work dealing with millions of images and focus in particular on high-performance techniques that minimize memory and computational use for both near-duplicate image detection and web search. Finally, I show early work on a scene completion-based image retrieval engine, which synthesizes realistic imagery that matches what the user has in mind.LEI Universiteit LeidenNWOImagin

    Theories of information and uncertainty for the modelling of information retrieval : an application of situation theory and Dempster-Shafer's theory of evidence

    Get PDF
    Current information retrieval models only offer simplistic and specific representations of information. Therefore, there is a need for the development of a new formalism able to model information retrieval systems in a more generic manner. In 1986, Van Rijsbergen suggested that such formalisms can be both appropriately and powerfully defined within a logic. The resulting formalism should capture information as it appears in an information retrieval system, and also in any of its inherent forms. The aim of this thesis is to understand the nature of information in information retrieval, and to propose a logic-based model of an information retrieval system that reflects this nature. The first objective of this thesis is to identify essential features of information in an information retrieval system. These are: 0 flow, 0 intensionality, 0 partiality, 0 structure, 0 significance, and o uncertainty. It is shown that the first four features are qualitative, whereas the last two are quantitative, and that their modelling requires different frameworks: a theory of information, and a theory of uncertainty, respectively. The second objective of this thesis is to determine the appropriate framework for each type of feature, and to develop a method to combine them in a consistent fashion. The combination is based on the Transformation Principle. Many specific attempts have been made to derive an adequate definition of information. The one adopted in this thesis is based on that of Dretske, Barwise, and Devlin who claimed that there is a primitive notion of information in terms of which a logic can be defined, and subsequently developed a theory of information, namely Situation Theory. Their approach was in accordance with Van Rijsbergen' s suggestion of a logic-based formalism for modelling an information retrieval system. This thesis shows that Situation Theory is best at representing all the qualitative features. Regarding the modelling of the quantitative features of information, this thesis shows that the framework that models them best is the Dempster-Shafer Theory of Evidence, together with the notion of refinement, later introduced by Shafer. The third objective of this thesis is to develop a model of an information retrieval system based on Situation Theory and the Dempster-Shafer Theory of Evidence. This is done in two steps. First, the unstructured model is defined in which the structure and the significance of information are not accounted for. Second, the unstructured model is extended into the structured model, which incorporates the structure and the significance of information. This strategy is adopted because it enables the careful representation of the flow of information to be performed first. The final objective of the thesis is to implement the model and to perform empirical evaluation to assess its validity. The unstructured and the structured models are implemented based on an existing on-line thesaurus, known as WordNet. The experiments performed to evaluate the two models use the National Physical Laboratory standard test collection. The experimental performance obtained was poor, because it was difficult to extract the flow of information from the document set. This was mainly due to the data used in the experimentation which was inappropriate for the test collection. However, this thesis shows that if more appropriate data, for example, indexing tools and thesauri, were available, better performances would be obtained. The conclusion of this work was that Situation Theory, combined with the Dempster-Shafer Theory of Evidence, allows the appropriate and powerful representation of several essential features of information in an information retrieval system. Although its implementation presents some difficulties, the model is the first of its kind to capture, in a general manner, these features within a uniform framework. As a result, it can be easily generalized to many types of information retrieval systems (e.g., interactive, multimedia systems), or many aspects of the retrieval process (e.g., user modelling)

    Semantic multimedia modelling & interpretation for search & retrieval

    Get PDF
    With the axiomatic revolutionary in the multimedia equip devices, culminated in the proverbial proliferation of the image and video data. Owing to this omnipresence and progression, these data become the part of our daily life. This devastating data production rate accompanies with a predicament of surpassing our potentials for acquiring this data. Perhaps one of the utmost prevailing problems of this digital era is an information plethora. Until now, progressions in image and video retrieval research reached restrained success owed to its interpretation of an image and video in terms of primitive features. Humans generally access multimedia assets in terms of semantic concepts. The retrieval of digital images and videos is impeded by the semantic gap. The semantic gap is the discrepancy between a user’s high-level interpretation of an image and the information that can be extracted from an image’s physical properties. Content- based image and video retrieval systems are explicitly assailable to the semantic gap due to their dependence on low-level visual features for describing image and content. The semantic gap can be narrowed by including high-level features. High-level descriptions of images and videos are more proficient of apprehending the semantic meaning of image and video content. It is generally understood that the problem of image and video retrieval is still far from being solved. This thesis proposes an approach for intelligent multimedia semantic extraction for search and retrieval. This thesis intends to bridge the gap between the visual features and semantics. This thesis proposes a Semantic query Interpreter for the images and the videos. The proposed Semantic Query Interpreter will select the pertinent terms from the user query and analyse it lexically and semantically. The proposed SQI reduces the semantic as well as the vocabulary gap between the users and the machine. This thesis also explored a novel ranking strategy for image search and retrieval. SemRank is the novel system that will incorporate the Semantic Intensity (SI) in exploring the semantic relevancy between the user query and the available data. The novel Semantic Intensity captures the concept dominancy factor of an image. As we are aware of the fact that the image is the combination of various concepts and among the list of concepts some of them are more dominant then the other. The SemRank will rank the retrieved images on the basis of Semantic Intensity. The investigations are made on the LabelMe image and LabelMe video dataset. Experiments show that the proposed approach is successful in bridging the semantic gap. The experiments reveal that our proposed system outperforms the traditional image retrieval systems
    corecore