815,459 research outputs found

    Multi modal multi-semantic image retrieval

    Get PDF
    PhDThe rapid growth in the volume of visual information, e.g. image, and video can overwhelm users’ ability to find and access the specific visual information of interest to them. In recent years, ontology knowledge-based (KB) image information retrieval techniques have been adopted into in order to attempt to extract knowledge from these images, enhancing the retrieval performance. A KB framework is presented to promote semi-automatic annotation and semantic image retrieval using multimodal cues (visual features and text captions). In addition, a hierarchical structure for the KB allows metadata to be shared that supports multi-semantics (polysemy) for concepts. The framework builds up an effective knowledge base pertaining to a domain specific image collection, e.g. sports, and is able to disambiguate and assign high level semantics to ‘unannotated’ images. Local feature analysis of visual content, namely using Scale Invariant Feature Transform (SIFT) descriptors, have been deployed in the ‘Bag of Visual Words’ model (BVW) as an effective method to represent visual content information and to enhance its classification and retrieval. Local features are more useful than global features, e.g. colour, shape or texture, as they are invariant to image scale, orientation and camera angle. An innovative approach is proposed for the representation, annotation and retrieval of visual content using a hybrid technique based upon the use of an unstructured visual word and upon a (structured) hierarchical ontology KB model. The structural model facilitates the disambiguation of unstructured visual words and a more effective classification of visual content, compared to a vector space model, through exploiting local conceptual structures and their relationships. The key contributions of this framework in using local features for image representation include: first, a method to generate visual words using the semantic local adaptive clustering (SLAC) algorithm which takes term weight and spatial locations of keypoints into account. Consequently, the semantic information is preserved. Second a technique is used to detect the domain specific ‘non-informative visual words’ which are ineffective at representing the content of visual data and degrade its categorisation ability. Third, a method to combine an ontology model with xi a visual word model to resolve synonym (visual heterogeneity) and polysemy problems, is proposed. The experimental results show that this approach can discover semantically meaningful visual content descriptions and recognise specific events, e.g., sports events, depicted in images efficiently. Since discovering the semantics of an image is an extremely challenging problem, one promising approach to enhance visual content interpretation is to use any associated textual information that accompanies an image, as a cue to predict the meaning of an image, by transforming this textual information into a structured annotation for an image e.g. using XML, RDF, OWL or MPEG-7. Although, text and image are distinct types of information representation and modality, there are some strong, invariant, implicit, connections between images and any accompanying text information. Semantic analysis of image captions can be used by image retrieval systems to retrieve selected images more precisely. To do this, a Natural Language Processing (NLP) is exploited firstly in order to extract concepts from image captions. Next, an ontology-based knowledge model is deployed in order to resolve natural language ambiguities. To deal with the accompanying text information, two methods to extract knowledge from textual information have been proposed. First, metadata can be extracted automatically from text captions and restructured with respect to a semantic model. Second, the use of LSI in relation to a domain-specific ontology-based knowledge model enables the combined framework to tolerate ambiguities and variations (incompleteness) of metadata. The use of the ontology-based knowledge model allows the system to find indirectly relevant concepts in image captions and thus leverage these to represent the semantics of images at a higher level. Experimental results show that the proposed framework significantly enhances image retrieval and leads to narrowing of the semantic gap between lower level machinederived and higher level human-understandable conceptualisation

    Research in interactive scene analysis

    Get PDF
    An interactive scene interpretation system (ISIS) was developed as a tool for constructing and experimenting with man-machine and automatic scene analysis methods tailored for particular image domains. A recently developed region analysis subsystem based on the paradigm of Brice and Fennema is described. Using this subsystem a series of experiments was conducted to determine good criteria for initially partitioning a scene into atomic regions and for merging these regions into a final partition of the scene along object boundaries. Semantic (problem-dependent) knowledge is essential for complete, correct partitions of complex real-world scenes. An interactive approach to semantic scene segmentation was developed and demonstrated on both landscape and indoor scenes. This approach provides a reasonable methodology for segmenting scenes that cannot be processed completely automatically, and is a promising basis for a future automatic system. A program is described that can automatically generate strategies for finding specific objects in a scene based on manually designated pictorial examples

    Automated interpretation of digital images of hydrographic charts.

    Get PDF
    Details of research into the automated generation of a digital database of hydrographic charts is presented. Low level processing of digital images of hydrographic charts provides image line feature segments which serve as input to a semi-automated feature extraction system, (SAFE). This system is able to perform a great deal of the building of chart features from the image segments simply on the basis of proximity of the segments. The system solicits user interaction when ambiguities arise. IThe creation of an intelligent knowledge based system (IKBS) implemented in the form of a backward chained production rule based system, which cooperates with the SAFE system, is described. The 1KBS attempts to resolve ambiguities using domain knowledge coded in the form of production rules. The two systems communicate by the passing of goals from SAFE to the IKBS and the return of a certainty factor by the IKBS for each goal submitted. The SAFE system can make additional feature building decisions on the basis of collected sets of certainty factors, thus reducing the need for user interaction. This thesis establishes that the cooperating IKBS approach to image interpretation offers an effective route to automated image understanding

    A Conceptual Framework for Motion Based Music Applications

    Get PDF
    Imaginary projections are the core of the framework for motion based music applications presented in this paper. Their design depends on the space covered by the motion tracking device, but also on the musical feature involved in the application. They can be considered a very powerful tool because they allow not only to project in the virtual environment the image of a traditional acoustic instrument, but also to express any spatially defined abstract concept. The system pipeline starts from the musical content and, through a geometrical interpretation, arrives to its projection in the physical space. Three case studies involving different motion tracking devices and different musical concepts will be analyzed. The three examined applications have been programmed and already tested by the authors. They aim respectively at musical expressive interaction (Disembodied Voices), tonal music knowledge (Harmonic Walk) and XX century music composition (Hand Composer)

    Explicit and persistent knowledge in engineering drawing analysis

    Get PDF
    technical reportDomain knowledge permeates all aspects of the engineering drawing analysis process, including understanding the physical processes operating on the medium (i.e., paper), the image analysis techniques, and the interpretation semantics of the structural layout and contents of the drawing. Additionally, an understanding of the broader reverse engineering context, within which the drawing analysis takes place, should be exploited. Thus as part of a wider project on the reverse engineering of legacy systems, we have developed an agent-based engineering analysis system called NDAS (nonDeterministic Agent System). In this paper, we discuss the nature of such a system and how knowledge can be made explicit (both for agents and humans) and how performance models can be de?ned, calibrated, monitored, and improved over time through the use of persistent knowledge. A framework is proposed that allows computational agents to: (1) explore the threshold space for an optimal analysis of the drawing, (2) control information gain through agent invocation, (3) incorporate and communicate knowledge, and (4) inform the software engineering and system development with deep knowledge of the relationships between modules and their parameters (at least in a statistical sense)
    • …
    corecore