Search CORE

815,459 research outputs found

Multi modal multi-semantic image retrieval

Author: Kesorn Kraisak
Publication venue
Publication date: 01/01/2010
Field of study

PhDThe rapid growth in the volume of visual information, e.g. image, and video can overwhelm users’ ability to find and access the specific visual information of interest to them. In recent years, ontology knowledge-based (KB) image information retrieval techniques have been adopted into in order to attempt to extract knowledge from these images, enhancing the retrieval performance. A KB framework is presented to promote semi-automatic annotation and semantic image retrieval using multimodal cues (visual features and text captions). In addition, a hierarchical structure for the KB allows metadata to be shared that supports multi-semantics (polysemy) for concepts. The framework builds up an effective knowledge base pertaining to a domain specific image collection, e.g. sports, and is able to disambiguate and assign high level semantics to ‘unannotated’ images. Local feature analysis of visual content, namely using Scale Invariant Feature Transform (SIFT) descriptors, have been deployed in the ‘Bag of Visual Words’ model (BVW) as an effective method to represent visual content information and to enhance its classification and retrieval. Local features are more useful than global features, e.g. colour, shape or texture, as they are invariant to image scale, orientation and camera angle. An innovative approach is proposed for the representation, annotation and retrieval of visual content using a hybrid technique based upon the use of an unstructured visual word and upon a (structured) hierarchical ontology KB model. The structural model facilitates the disambiguation of unstructured visual words and a more effective classification of visual content, compared to a vector space model, through exploiting local conceptual structures and their relationships. The key contributions of this framework in using local features for image representation include: first, a method to generate visual words using the semantic local adaptive clustering (SLAC) algorithm which takes term weight and spatial locations of keypoints into account. Consequently, the semantic information is preserved. Second a technique is used to detect the domain specific ‘non-informative visual words’ which are ineffective at representing the content of visual data and degrade its categorisation ability. Third, a method to combine an ontology model with xi a visual word model to resolve synonym (visual heterogeneity) and polysemy problems, is proposed. The experimental results show that this approach can discover semantically meaningful visual content descriptions and recognise specific events, e.g., sports events, depicted in images efficiently. Since discovering the semantics of an image is an extremely challenging problem, one promising approach to enhance visual content interpretation is to use any associated textual information that accompanies an image, as a cue to predict the meaning of an image, by transforming this textual information into a structured annotation for an image e.g. using XML, RDF, OWL or MPEG-7. Although, text and image are distinct types of information representation and modality, there are some strong, invariant, implicit, connections between images and any accompanying text information. Semantic analysis of image captions can be used by image retrieval systems to retrieve selected images more precisely. To do this, a Natural Language Processing (NLP) is exploited firstly in order to extract concepts from image captions. Next, an ontology-based knowledge model is deployed in order to resolve natural language ambiguities. To deal with the accompanying text information, two methods to extract knowledge from textual information have been proposed. First, metadata can be extracted automatically from text captions and restructured with respect to a semantic model. Second, the use of LSI in relation to a domain-specific ontology-based knowledge model enables the combined framework to tolerate ambiguities and variations (incompleteness) of metadata. The use of the ontology-based knowledge model allows the system to find indirectly relevant concepts in image captions and thus leverage these to represent the semantics of images at a higher level. Experimental results show that the proposed framework significantly enhances image retrieval and leads to narrowing of the semantic gap between lower level machinederived and higher level human-understandable conceptualisation

Queen Mary Research Online

Research in interactive scene analysis

Author: Garvey T. D.
Tenenbaum J. M.
Weyl S. A.
Wolf H. C.
Publication venue
Publication date
Field of study

An interactive scene interpretation system (ISIS) was developed as a tool for constructing and experimenting with man-machine and automatic scene analysis methods tailored for particular image domains. A recently developed region analysis subsystem based on the paradigm of Brice and Fennema is described. Using this subsystem a series of experiments was conducted to determine good criteria for initially partitioning a scene into atomic regions and for merging these regions into a final partition of the scene along object boundaries. Semantic (problem-dependent) knowledge is essential for complete, correct partitions of complex real-world scenes. An interactive approach to semantic scene segmentation was developed and demonstrated on both landscape and indoor scenes. This approach provides a reasonable methodology for segmenting scenes that cannot be processed completely automatically, and is a promising basis for a future automatic system. A program is described that can automatically generate strategies for finding specific objects in a scene based on manually designated pictorial examples

NASA Technical Reports Server

Automated interpretation of digital images of hydrographic charts.

Author: Goodson Kelvin J.
Publication venue
Publication date
Field of study

Details of research into the automated generation of a digital database of hydrographic charts is presented. Low level processing of digital images of hydrographic charts provides image line feature segments which serve as input to a semi-automated feature extraction system, (SAFE). This system is able to perform a great deal of the building of chart features from the image segments simply on the basis of proximity of the segments. The system solicits user interaction when ambiguities arise. IThe creation of an intelligent knowledge based system (IKBS) implemented in the form of a backward chained production rule based system, which cooperates with the SAFE system, is described. The 1KBS attempts to resolve ambiguities using domain knowledge coded in the form of production rules. The two systems communicate by the passing of goals from SAFE to the IKBS and the return of a certainty factor by the IKBS for each goal submitted. The SAFE system can make additional feature building decisions on the basis of collected sets of certainty factors, thus reducing the need for user interaction. This thesis establishes that the cooperating IKBS approach to image interpretation offers an effective route to automated image understanding

Bournemouth University Research Online

A Conceptual Framework for Motion Based Music Applications

Author: CANAZZA TARGON Sergio
Mandanici Marcella
Roda' Antonio
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015
Field of study

Imaginary projections are the core of the framework for motion based music applications presented in this paper. Their design depends on the space covered by the motion tracking device, but also on the musical feature involved in the application. They can be considered a very powerful tool because they allow not only to project in the virtual environment the image of a traditional acoustic instrument, but also to express any spatially defined abstract concept. The system pipeline starts from the musical content and, through a geometrical interpretation, arrives to its projection in the physical space. Three case studies involving different motion tracking devices and different musical concepts will be analyzed. The three examined applications have been programmed and already tested by the authors. They aim respectively at musical expressive interaction (Disembodied Voices), tonal music knowledge (Harmonic Walk) and XX century music composition (Hand Composer)

Crossref

Archivio istituzionale della ricerca - Università di Padova

Recommended from our members

A domain independent adaptive imaging system for visual inspection

Author: Panayiotou Stephen
Publication venue
Publication date: 01/11/1995
Field of study

Computer vision is a rapidly growing area. The range of applications is increasing very quickly, robotics, inspection, medicine, physics and document processing are all computer vision applications still in their infancy. All these applications are written with a specific task in mind and do not perform well unless there under a controlled environment. They do not deploy any knowledge to produce a meaningful description of the scene, or indeed aid in the analysis of the image. The construction of a symbolic description of a scene from a digitised image is a difficult problem. A symbolic interpretation of an image can be viewed as a mapping from the image pixels to an identification of the semantically relevant objects. Before symbolic reasoning can take place image processing and segmentation routines must produce the relevant information. This part of the imaging system inherently introduces many errors. The aim of this project is to reduce the error rate produced by such algorithms and make them adaptable to change in the manufacturing process. Thus a prior knowledge is needed about the image and the objects they contain as well as knowledge about how the image was acquired from the scene (image geometry, quality, object decomposition, lighting conditions etc,). Knowledge on algorithms must also be acquired. Such knowledge is collected by studying the algorithms and deciding in which areas of image analysis they work well in. In most existing image analysis systems, knowledge of this kind is implicitly embedded into the algorithms employed in the system. Such an approach assumes that all these parameters are invariant. However, in complex applications this may not be the case, so that adjustment must be made from time to time to ensure a satisfactory performance of the system. A system that allows for such adjustments to be made, must comprise the explicit representation of the knowledge utilised in the image analysis procedure. In addition to the use of a priori knowledge, rules are employed to improve the performance of the image processing and segmentation algorithms. These rules considerably enhance the correctness of the segmentation process. The most frequently given goal, if not the only one in industrial image analysis is to detect and locate objects of a given type in the image. That is, an image may contain objects of different types, and the goal is to identify parts of the image. The system developed here is driven by these goals, and thus by teaching the system a new object or fault in an object the system may adapt the algorithms to detect these new objects as well compromise for changes in the environment such as a change in lighting conditions. We have called this system the Visual Planner, this is due to the fact that we use techniques based on planning to achieve a given goal. As the Visual Planner learns the specific domain it is working in, appropriate algorithms are selected to segment the object. This makes the system domain independent, because different algorithms may be selected for different applications and objects under different environmental condition

Greenwich Academic Literature Archive

Explicit and persistent knowledge in engineering drawing analysis

Author: Henderson Thomas C.
Publication venue: University of Utah
Publication date: 10/10/2003
Field of study

technical reportDomain knowledge permeates all aspects of the engineering drawing analysis process, including understanding the physical processes operating on the medium (i.e., paper), the image analysis techniques, and the interpretation semantics of the structural layout and contents of the drawing. Additionally, an understanding of the broader reverse engineering context, within which the drawing analysis takes place, should be exploited. Thus as part of a wider project on the reverse engineering of legacy systems, we have developed an agent-based engineering analysis system called NDAS (nonDeterministic Agent System). In this paper, we discuss the nature of such a system and how knowledge can be made explicit (both for agents and humans) and how performance models can be de?ned, calibrated, monitored, and improved over time through the use of persistent knowledge. A framework is proposed that allows computational agents to: (1) explore the threshold space for an optimal analysis of the drawing, (2) control information gain through agent invocation, (3) incorporate and communicate knowledge, and (4) inform the software engineering and system development with deep knowledge of the relationships between modules and their parameters (at least in a statistical sense)

The University of Utah: J. Willard Marriott Digital Library