3,914 research outputs found

    A cooperative framework for molecular biology database integration using image object selection

    Get PDF
    The theme and the concept of 'Molecular Biology Database Integration' and the problems associated with this concept initiated the idea for this Ph.D research. The available technologies facilitate to analyse the data independently and discretely but it fails to integrate the data resources for more meaningful information. This along with the integration issues created the scope for this Ph.D research. The research has reviewed the 'database interoperability' problems and it has suggested a framework for integrating the molecular biology databases. The framework has proposed to develop a cooperative environment to share information on the basis of common purpose for the molecular biology databases. The research has also reviewed other implementation and interoperability issues for laboratory based, dedicated and target specific database. The research has addressed the following issues: diversity of molecular biology databases schemas, schema constructs and schema implementation multi-database query using image object keying, database integration technologies using context graph, automated navigation among these databases. This thesis has introduced a new approach for database implementation. It has introduced an interoperable component database concept to initiate multidatabase query on gene mutation data. A number of data models have been proposed for gene mutation data which is the basis for integrating the target specific component database to be integrated with the federated information system. The proposed data models are: data models for genetic trait analysis, classification of gene mutation data, pathological lesion data and laboratory data. The main feature of this component database is non-overlapping attributes and it will follow non-redundant integration approach as explained in the thesis. This will be achieved by storing attributes which will not have the union or intersection of any attributes that exist in public domain molecular biology databases. Unlike data warehousing technique, this feature is quite unique and novel. The component database will be integrated with other biological data sources for sharing information in a cooperative environment. This involves developing new tools. The thesis explains the role of these new tools which are: meta data extractor, mapping linker, query generator and result interpreter. These tools are used for a transparent integration without creating any global schema of the participating databases. The thesis has also established the concept of image object keying for multidatabase query and it has proposed a relevant algorithm for matching protein spot in gel electrophoresis image. An object spot in gel electrophoresis image will initiate the query when it is selected by the user. It matches the selected spot with other similar spots in other resource databases. This image object keying method is an alternative to conventional multidatabase query which requires writing complex SQL scripts. This method also resolve the semantic conflicts that exist among molecular biology databases. The research has proposed a new framework based on the context of the web data for interactions with different biological data resources. A formal description of the resource context is described in the thesis. The implementation of the context into Resource Document Framework (RDF) will be able to increase the interoperability by providing the description of the resources and the navigation plan for accessing the web based databases. A higher level construct is developed (has, provide and access) to implement the context into RDF for web interactions. The interactions within the resources are achieved by utilising an integration domain to extract the required information with a single instance and without writing any query scripts. The integration domain allows to navigate and to execute the query plan within the resource databases. An extractor module collects elements from different target webs and unify them as a whole object in a single page. The proposed framework is tested to find specific information e.g., information on Alzheimer's disease, from public domain biology resources, such as, Protein Data Bank, Genome Data Bank, Online Mendalian Inheritance in Man and local database. Finally, the thesis proposes further propositions and plans for future work

    Multi modal multi-semantic image retrieval

    Get PDF
    PhDThe rapid growth in the volume of visual information, e.g. image, and video can overwhelm users’ ability to find and access the specific visual information of interest to them. In recent years, ontology knowledge-based (KB) image information retrieval techniques have been adopted into in order to attempt to extract knowledge from these images, enhancing the retrieval performance. A KB framework is presented to promote semi-automatic annotation and semantic image retrieval using multimodal cues (visual features and text captions). In addition, a hierarchical structure for the KB allows metadata to be shared that supports multi-semantics (polysemy) for concepts. The framework builds up an effective knowledge base pertaining to a domain specific image collection, e.g. sports, and is able to disambiguate and assign high level semantics to ‘unannotated’ images. Local feature analysis of visual content, namely using Scale Invariant Feature Transform (SIFT) descriptors, have been deployed in the ‘Bag of Visual Words’ model (BVW) as an effective method to represent visual content information and to enhance its classification and retrieval. Local features are more useful than global features, e.g. colour, shape or texture, as they are invariant to image scale, orientation and camera angle. An innovative approach is proposed for the representation, annotation and retrieval of visual content using a hybrid technique based upon the use of an unstructured visual word and upon a (structured) hierarchical ontology KB model. The structural model facilitates the disambiguation of unstructured visual words and a more effective classification of visual content, compared to a vector space model, through exploiting local conceptual structures and their relationships. The key contributions of this framework in using local features for image representation include: first, a method to generate visual words using the semantic local adaptive clustering (SLAC) algorithm which takes term weight and spatial locations of keypoints into account. Consequently, the semantic information is preserved. Second a technique is used to detect the domain specific ‘non-informative visual words’ which are ineffective at representing the content of visual data and degrade its categorisation ability. Third, a method to combine an ontology model with xi a visual word model to resolve synonym (visual heterogeneity) and polysemy problems, is proposed. The experimental results show that this approach can discover semantically meaningful visual content descriptions and recognise specific events, e.g., sports events, depicted in images efficiently. Since discovering the semantics of an image is an extremely challenging problem, one promising approach to enhance visual content interpretation is to use any associated textual information that accompanies an image, as a cue to predict the meaning of an image, by transforming this textual information into a structured annotation for an image e.g. using XML, RDF, OWL or MPEG-7. Although, text and image are distinct types of information representation and modality, there are some strong, invariant, implicit, connections between images and any accompanying text information. Semantic analysis of image captions can be used by image retrieval systems to retrieve selected images more precisely. To do this, a Natural Language Processing (NLP) is exploited firstly in order to extract concepts from image captions. Next, an ontology-based knowledge model is deployed in order to resolve natural language ambiguities. To deal with the accompanying text information, two methods to extract knowledge from textual information have been proposed. First, metadata can be extracted automatically from text captions and restructured with respect to a semantic model. Second, the use of LSI in relation to a domain-specific ontology-based knowledge model enables the combined framework to tolerate ambiguities and variations (incompleteness) of metadata. The use of the ontology-based knowledge model allows the system to find indirectly relevant concepts in image captions and thus leverage these to represent the semantics of images at a higher level. Experimental results show that the proposed framework significantly enhances image retrieval and leads to narrowing of the semantic gap between lower level machinederived and higher level human-understandable conceptualisation

    Semantic multimedia modelling & interpretation for annotation

    Get PDF
    The emergence of multimedia enabled devices, particularly the incorporation of cameras in mobile phones, and the accelerated revolutions in the low cost storage devices, boosts the multimedia data production rate drastically. Witnessing such an iniquitousness of digital images and videos, the research community has been projecting the issue of its significant utilization and management. Stored in monumental multimedia corpora, digital data need to be retrieved and organized in an intelligent way, leaning on the rich semantics involved. The utilization of these image and video collections demands proficient image and video annotation and retrieval techniques. Recently, the multimedia research community is progressively veering its emphasis to the personalization of these media. The main impediment in the image and video analysis is the semantic gap, which is the discrepancy among a user’s high-level interpretation of an image and the video and the low level computational interpretation of it. Content-based image and video annotation systems are remarkably susceptible to the semantic gap due to their reliance on low-level visual features for delineating semantically rich image and video contents. However, the fact is that the visual similarity is not semantic similarity, so there is a demand to break through this dilemma through an alternative way. The semantic gap can be narrowed by counting high-level and user-generated information in the annotation. High-level descriptions of images and or videos are more proficient of capturing the semantic meaning of multimedia content, but it is not always applicable to collect this information. It is commonly agreed that the problem of high level semantic annotation of multimedia is still far from being answered. This dissertation puts forward approaches for intelligent multimedia semantic extraction for high level annotation. This dissertation intends to bridge the gap between the visual features and semantics. It proposes a framework for annotation enhancement and refinement for the object/concept annotated images and videos datasets. The entire theme is to first purify the datasets from noisy keyword and then expand the concepts lexically and commonsensical to fill the vocabulary and lexical gap to achieve high level semantics for the corpus. This dissertation also explored a novel approach for high level semantic (HLS) propagation through the images corpora. The HLS propagation takes the advantages of the semantic intensity (SI), which is the concept dominancy factor in the image and annotation based semantic similarity of the images. As we are aware of the fact that the image is the combination of various concepts and among the list of concepts some of them are more dominant then the other, while semantic similarity of the images are based on the SI and concept semantic similarity among the pair of images. Moreover, the HLS exploits the clustering techniques to group similar images, where a single effort of the human experts to assign high level semantic to a randomly selected image and propagate to other images through clustering. The investigation has been made on the LabelMe image and LabelMe video dataset. Experiments exhibit that the proposed approaches perform a noticeable improvement towards bridging the semantic gap and reveal that our proposed system outperforms the traditional systems

    Content Based Image Retrieval (CBIR) in Remote Clinical Diagnosis and Healthcare

    Full text link
    Content-Based Image Retrieval (CBIR) locates, retrieves and displays images alike to one given as a query, using a set of features. It demands accessible data in medical archives and from medical equipment, to infer meaning after some processing. A problem similar in some sense to the target image can aid clinicians. CBIR complements text-based retrieval and improves evidence-based diagnosis, administration, teaching, and research in healthcare. It facilitates visual/automatic diagnosis and decision-making in real-time remote consultation/screening, store-and-forward tests, home care assistance and overall patient surveillance. Metrics help comparing visual data and improve diagnostic. Specially designed architectures can benefit from the application scenario. CBIR use calls for file storage standardization, querying procedures, efficient image transmission, realistic databases, global availability, access simplicity, and Internet-based structures. This chapter recommends important and complex aspects required to handle visual content in healthcare.Comment: 28 pages, 6 figures, Book Chapter from "Encyclopedia of E-Health and Telemedicine

    Forum Session at the First International Conference on Service Oriented Computing (ICSOC03)

    Get PDF
    The First International Conference on Service Oriented Computing (ICSOC) was held in Trento, December 15-18, 2003. The focus of the conference ---Service Oriented Computing (SOC)--- is the new emerging paradigm for distributed computing and e-business processing that has evolved from object-oriented and component computing to enable building agile networks of collaborating business applications distributed within and across organizational boundaries. Of the 181 papers submitted to the ICSOC conference, 10 were selected for the forum session which took place on December the 16th, 2003. The papers were chosen based on their technical quality, originality, relevance to SOC and for their nature of being best suited for a poster presentation or a demonstration. This technical report contains the 10 papers presented during the forum session at the ICSOC conference. In particular, the last two papers in the report ere submitted as industrial papers

    LinkedScales : bases de dados em multiescala

    Get PDF
    Orientador: AndrĂ© SantanchĂšTese (doutorado) - Universidade Estadual de Campinas, Instituto de ComputaçãoResumo: As ciĂȘncias biolĂłgicas e mĂ©dicas precisam cada vez mais de abordagens unificadas para a anĂĄlise de dados, permitindo a exploração da rede de relacionamentos e interaçÔes entre elementos. No entanto, dados essenciais estĂŁo frequentemente espalhados por um conjunto cada vez maior de fontes com mĂșltiplos nĂ­veis de heterogeneidade entre si, tornando a integração cada vez mais complexa. Abordagens de integração existentes geralmente adotam estratĂ©gias especializadas e custosas, exigindo a produção de soluçÔes monolĂ­ticas para lidar com formatos e esquemas especĂ­ficos. Para resolver questĂ”es de complexidade, essas abordagens adotam soluçÔes pontuais que combinam ferramentas e algoritmos, exigindo adaptaçÔes manuais. Abordagens nĂŁo sistemĂĄticas dificultam a reutilização de tarefas comuns e resultados intermediĂĄrios, mesmo que esses possam ser Ășteis em anĂĄlises futuras. AlĂ©m disso, Ă© difĂ­cil o rastreamento de transformaçÔes e demais informaçÔes de proveniĂȘncia, que costumam ser negligenciadas. Este trabalho propĂ”e LinkedScales, um dataspace baseado em mĂșltiplos nĂ­veis, projetado para suportar a construção progressiva de visĂ”es unificadas de fontes heterogĂȘneas. LinkedScales sistematiza as mĂșltiplas etapas de integração em escalas, partindo de representaçÔes brutas (escalas mais baixas), indo gradualmente para estruturas semelhantes a ontologias (escalas mais altas). LinkedScales define um modelo de dados e um processo de integração sistemĂĄtico e sob demanda, atravĂ©s de transformaçÔes em um banco de dados de grafos. Resultados intermediĂĄrios sĂŁo encapsulados em escalas reutilizĂĄveis e transformaçÔes entre escalas sĂŁo rastreadas em um grafo de proveniĂȘncia ortogonal, que conecta objetos entre escalas. Posteriormente, consultas ao dataspace podem considerar objetos nas escalas e o grafo de proveniĂȘncia ortogonal. AplicaçÔes prĂĄticas de LinkedScales sĂŁo tratadas atravĂ©s de dois estudos de caso, um no domĂ­nio da biologia -- abordando um cenĂĄrio de anĂĄlise centrada em organismos -- e outro no domĂ­nio mĂ©dico -- com foco em dados de medicina baseada em evidĂȘnciasAbstract: Biological and medical sciences increasingly need a unified, network-driven approach for exploring relationships and interactions among data elements. Nevertheless, essential data is frequently scattered across sources with multiple levels of heterogeneity. Existing data integration approaches usually adopt specialized, heavyweight strategies, requiring a costly upfront effort to produce monolithic solutions for handling specific formats and schemas. Furthermore, such ad-hoc strategies hamper the reuse of intermediary integration tasks and outcomes. This work proposes LinkedScales, a multiscale-based dataspace designed to support the progressive construction of a unified view of heterogeneous sources. It departs from raw representations (lower scales) and goes towards ontology-like structures (higher scales). LinkedScales defines a data model and a systematic, gradual integration process via operations over a graph database. Intermediary outcomes are encapsulated as reusable scales, tracking the provenance of inter-scale operations. Later, queries can combine both scale data and orthogonal provenance information. Practical applications of LinkedScales are discussed through two case studies on the biology domain -- addressing an organism-centric analysis scenario -- and the medical domain -- focusing on evidence-based medicine dataDoutoradoCiĂȘncia da ComputaçãoDoutor em CiĂȘncia da Computação141353/2015-5CAPESCNP

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Get PDF
    Based on the information provided by European projects and national initiatives related to multimedia search as well as domains experts that participated in the CHORUS Think-thanks and workshops, this document reports on the state of the art related to multimedia content search from, a technical, and socio-economic perspective. The technical perspective includes an up to date view on content based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark inititiatives to measure the performance of multimedia search engines. From a socio-economic perspective we inventorize the impact and legal consequences of these technical advances and point out future directions of research

    Integrating case based reasoning and geographic information systems in a planing support system: Çeşme Peninsula study

    Get PDF
    Thesis (Doctoral)--Izmir Institute of Technology, City and Regional Planning, Izmir, 2009Includes bibliographical references (leaves: 110-121)Text in English; Abstract: Turkish and Englishxii, 140 leavesUrban and regional planning is experiencing fundamental changes on the use of of computer-based models in planning practice and education. However, with this increased use, .Geographic Information Systems. (GIS) or .Computer Aided Design.(CAD) alone cannot serve all of the needs of planning. Computational approaches should be modified to deal better with the imperatives of contemporary planning by using artificial intelligence techniques in city planning process.The main aim of this study is to develop an integrated .Planning Support System. (PSS) tool for supporting the planning process. In this research, .Case Based Reasoning. (CBR) .an artificial intelligence technique- and .Geographic Information Systems. (GIS) .geographic analysis, data management and visualization techniqueare used as a major PSS tools to build a .Case Based System. (CBS) for knowledge representation on an operational study. Other targets of the research are to discuss the benefits of CBR method in city planning domain and to demonstrate the feasibility and usefulness of this technique in a PSS. .Çeşme Peninsula. case study which applied under the desired methodology is presented as an experimental and operational stage of the thesis.This dissertation tried to find out whether an integrated model which employing CBR&GIS could support human decision making in a city planning task. While the CBS model met many of predefined goals of the thesis, both advantages and limitations have been realized from findings when applied to the complex domain such as city planning

    Developing tools and models for evaluating geospatial data integration of official and VGI data sources

    Get PDF
    PhD ThesisIn recent years, systems have been developed which enable users to produce, share and update information on the web effectively and freely as User Generated Content (UGC) data (including Volunteered Geographic Information (VGI)). Data quality assessment is a major concern for supporting the accurate and efficient spatial data integration required if VGI is to be used alongside official, formal, usually governmental datasets. This thesis aims to develop tools and models for the purpose of assessing such integration possibilities. Initially, in order to undertake this task, geometrical similarity of formal and informal data was examined. Geometrical analyses were performed by developing specific programme interfaces to assess the positional, linear and polygon shape similarity among reference field survey data (FS); official datasets such as data from Ordnance Survey (OS), UK and General Directorate for Survey (GDS), Iraq agencies; and VGI information such as OpenStreetMap (OSM) datasets. A discussion of the design and implementation of these tools and interfaces is presented. A methodology has been developed to assess such positional and shape similarity by applying different metrics and standard indices such as the National Standard for Spatial Data Accuracy (NSSDA) for positional quality; techniques such as buffering overlays for linear similarity; and application of moments invariant for polygon shape similarity evaluations. The results suggested that difficulties exist for any geometrical integration of OSM data with both bench mark FS and formal datasets, but that formal data is very close to reference datasets. An investigation was carried out into contributing factors such as data sources, feature types and number of data collectors that may affect the geometrical quality of OSM data and consequently affect the integration process of OSM datasets with FS, OS and GDS. Factorial designs were undertaken in this study in order to develop and implement an experiment to discover the effect of these factors individually and the interaction between each of them. The analysis found that data source is the most significant factor that affects the geometrical quality of OSM datasets, and that there are interactions among all these factors at different levels of interaction. This work also investigated the possibility of integrating feature classification of official datasets such as data from OS and GDS geospatial data agencies, and informal datasets such as OSM information. In this context, two different models were developed. The first set of analysis included the evaluation of semantic integration of corresponding feature classifications of compared datasets. The second model was concerned with assessing the ability of XML schema matching of feature classifications of tested datasets. This initially involved a tokenization process in order to split up into single words classifications that were composed of multiple words. Subsequently, encoding feature classifications as XML schema trees was undertaken. The semantic similarity, data type similarity and structural similarity were measured between the nodes of compared schema trees. Once these three similarities had been computed, a weighted combination technique has been adopted in order to obtain the overall similarity. The findings of both sets of analysis were not encouraging as far as the possibility of effectively integrating feature classifications of VGI datasets, such as OSM information, and formal datasets, such as OS and GDS datasets, is concerned.Ministry of Higher Education and Scientific Research, Republic of Iraq
    • 

    corecore