    Constraints preserving genetic algorithm for learning fuzzy measures with an application to ontology matching

    Abstract. Both the fuzzy measure and integral have been widely studied for multi-source information fusion. A number of researchers have proposed optimization techniques to learn a fuzzy measure from training data. In part, this task is difficult because the fuzzy measure can have a large number of free parameters (2^N − 2 for N sources) and many monotonicity constraints. In this paper, a new genetic algorithm approach to constraint-preserving optimization of the fuzzy measure is presented for the task of learning and fusing different ontology matching results. Preliminary results are presented to show the stability of the learning algorithm and its effectiveness compared to existing approaches.
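    As an illustration of the monotonicity constraints mentioned above, the sketch below checks them for a fuzzy measure stored as a Python dictionary keyed by frozensets of source indices (the empty set mapping to 0 and the full set to 1). The encoding and function name are assumptions for illustration only, not the paper's own representation.

        from itertools import combinations

        def is_monotone(g, n):
            # g maps each subset of the n sources (a frozenset of indices)
            # to a value in [0, 1]; monotonicity demands g(A) <= g(B)
            # whenever A is a subset of B.  It suffices to compare each
            # subset with the subsets obtained by adding one element.
            for size in range(n):
                for subset in combinations(range(n), size):
                    a = frozenset(subset)
                    for extra in set(range(n)) - a:
                        if g[a] > g[a | {extra}]:
                            return False
            return True

    A constraint-preserving genetic algorithm keeps every individual inside this feasible region, so a check like this should never fail after crossover or mutation.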

    ReShare: An Operational Ontology Framework for Research Modeling, Combining and Sharing

    Scientists always face difficulties dealing with disjointed information. There is a need for a standardized and robust way to represent and exchange knowledge. Ontology has been widely used for this purpose. However, since research involves both semantics and operations, we need to conceptualize both of them. In this thesis, we propose ReShare as a solution to this problem. Maximizing utilization while preserving semantics is one of the main challenges when heterogeneous knowledge is combined; therefore, operational annotations were designed to allow generic object modeling, binding and representation. Furthermore, a test bed was developed, and preliminary results are presented to show the usefulness and robustness of our approach. Moreover, two aggregation techniques for fusing ontology matchers are investigated as initial work towards an algorithm that converts descriptive ontologies into operational ones.

    New Fundamental Technologies in Data Mining

    The progress of data mining technology and its broad public popularity establish a need for a comprehensive text on the subject. The book series entitled "Data Mining" addresses this need by presenting in-depth descriptions of novel mining algorithms and many useful applications. Beyond an in-depth treatment of each topic, the two books present useful hints and strategies for solving the problems discussed in their chapters. The contributing authors have highlighted many future research directions that will foster multi-disciplinary collaborations and hence lead to significant developments in the field of data mining.

    Improving average ranking precision in user searches for biomedical research datasets

    The availability of research datasets is a keystone of reproducibility and scientific progress in the health and life sciences. Due to the heterogeneity and complexity of these data, a main challenge to be overcome by research data management systems is to provide users with the best answers for their search queries. In the context of the 2016 bioCADDIE Dataset Retrieval Challenge, we investigate a novel ranking pipeline to improve the search of datasets used in biomedical experiments. Our system comprises a query expansion model based on word embeddings, a similarity measure algorithm that takes into consideration the relevance of the query terms, and a dataset categorisation method that boosts the rank of datasets matching query constraints. The system was evaluated using a corpus of 800k datasets and 21 annotated user queries, and provides competitive results compared to the other challenge participants. In the official run, it achieved the highest infAP among the participants, +22.3% higher than the median infAP of the participants' best submissions. Overall, it ranks in the top 2 when an aggregated metric over the best official measures per participant is considered. The query expansion method had a positive impact on the system's performance, increasing our baseline by up to +5.0% and +3.4% for the infAP and infNDCG metrics, respectively. Our similarity measure algorithm appears robust; in particular, compared to the Divergence From Randomness framework, it shows smaller performance variations under different training conditions. Finally, the result categorisation did not have a significant impact on the system's performance. We believe that our solution could be used to enhance biomedical dataset management systems. In particular, data-driven query expansion methods could be an alternative to the complexity of biomedical terminologies.
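    The abstract does not give the exact expansion model, so the following is only a rough sketch of embedding-based query expansion: each query term is augmented with its nearest neighbours in a dictionary of unit-normalised word vectors. The dictionary layout, k, and threshold are illustrative assumptions.

        import numpy as np

        def expand_query(query_terms, embeddings, k=3, threshold=0.6):
            # embeddings: dict mapping each vocabulary term to a
            # unit-length numpy vector; for each query term, add its k
            # nearest neighbours whose cosine similarity clears the
            # threshold.
            vocab = list(embeddings)
            matrix = np.stack([embeddings[t] for t in vocab])  # (V, d)
            expanded = list(query_terms)
            for term in query_terms:
                if term not in embeddings:
                    continue
                # dot product equals cosine similarity for unit vectors
                sims = matrix @ embeddings[term]
                for idx in np.argsort(-sims)[: k + 1]:
                    candidate = vocab[idx]
                    if candidate != term and sims[idx] >= threshold:
                        expanded.append(candidate)
            return expanded

    Discarding low-similarity neighbours keeps the expansion from drifting away from the original query intent.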

    Monte Carlo Method with Heuristic Adjustment for Irregularly Shaped Food Product Volume Measurement

    Volume measurement plays an important role in the production and processing of food products. Various methods have been proposed to measure the volume of irregularly shaped food products based on 3D reconstruction. However, 3D reconstruction comes at a high computational cost, and some volume measurement methods based on it have low accuracy. Another approach is the Monte Carlo method, which measures volume using random points: it only requires knowing whether each random point falls inside or outside the object, and needs no 3D reconstruction. This paper proposes volume measurement using a computer vision system for irregularly shaped food products, without 3D reconstruction, based on the Monte Carlo method with heuristic adjustment. Five images of each food product were captured using five cameras and processed to produce binary images. Monte Carlo integration with heuristic adjustment was then performed to measure the volume from the information extracted from the binary images. The experimental results show that the proposed method provides high accuracy and precision compared to the water displacement method. In addition, the proposed method is more accurate and faster than the space carving method.
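    The core Monte Carlo estimate is simple to sketch; the version below omits the paper's heuristic adjustment, and the membership test, bounds, and sample count are illustrative assumptions. With silhouettes from the five cameras, inside(points) would project each sample into every view and test the binary images.

        import numpy as np

        def monte_carlo_volume(inside, bounds, n_samples=200_000, seed=0):
            # inside(points) -> boolean array marking which sampled 3-D
            # points lie within the object.  bounds is a (lo, hi) pair
            # of 3-vectors describing an axis-aligned bounding box.
            rng = np.random.default_rng(seed)
            lo = np.asarray(bounds[0], dtype=float)
            hi = np.asarray(bounds[1], dtype=float)
            points = rng.uniform(lo, hi, size=(n_samples, 3))
            hit_fraction = inside(points).mean()
            return hit_fraction * np.prod(hi - lo)

        # Sanity check against a known shape: a unit sphere in a 2x2x2 box.
        sphere = lambda p: (p ** 2).sum(axis=1) <= 1.0
        estimate = monte_carlo_volume(sphere, ([-1, -1, -1], [1, 1, 1]))
        # estimate approaches 4/3 * pi (about 4.19) as n_samples grows

    The estimate's error shrinks as the square root of the sample count, which is why the paper's heuristic adjustment matters for keeping the sample budget small.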

    A specification and discovery environment for software component reuse in distributed software development

    Our work aims to develop an effective solution for the discovery and reuse of software components in existing and commonly used development environments. We propose an ontology for describing and discovering atomic software components. The description covers both the functional and the non-functional properties of the components, the latter expressed as QoS parameters. Our search process is based on a function that calculates the semantic distance between a component's interface signature and the signature of a given query, thus achieving an appropriate comparison. We also use the notion of "subsumption" to compare the inputs/outputs of the query and of the components. After selecting the appropriate components, the non-functional properties are used as a distinguishing factor to refine the search result. We further propose an approach, based on a shared ontology, for discovering composite components when no atomic component is found. To integrate the resulting component into the project under development, we developed an integration ontology and the two services "input/output convertor" and "output matching".
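    The abstract does not give the thesis's actual distance function; as a hedged illustration of comparing a query signature against a component's interface signature, a toy measure might combine the overlap of their input and output concept sets (field names and weights below are assumptions).

        def signature_distance(query, component, w_in=0.5, w_out=0.5):
            # query and component are dicts with 'inputs' and 'outputs'
            # sets of concept names; the distance is one minus a
            # weighted Jaccard overlap of the two interfaces.
            def jaccard(a, b):
                return len(a & b) / len(a | b) if (a | b) else 1.0

            similarity = (w_in * jaccard(query['inputs'], component['inputs'])
                          + w_out * jaccard(query['outputs'], component['outputs']))
            return 1.0 - similarity

    A real implementation would replace the set overlap with an ontology-aware measure so that subsumed concepts (e.g. a component output more specific than the requested one) still count as matches.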

    A Multiobjective Evolutionary Conceptual Clustering Methodology for Gene Annotation Within Structural Databases: A Case of Study on the Gene Ontology Database

    Current tools and techniques devoted to examining the content of large databases are often hampered by their inability to support searches based on criteria that are meaningful to their users. These shortcomings are particularly evident in data banks storing representations of structural data such as biological networks. Conceptual clustering techniques have been demonstrated to be appropriate for uncovering relationships between the features that characterize objects in structural data. However, typical conceptual clustering approaches normally recover the most obvious relations but fail to discover the less frequent yet more informative underlying data associations. The combination of evolutionary algorithms with multiobjective and multimodal optimization techniques constitutes a suitable tool for solving this problem. We propose a novel conceptual clustering methodology termed evolutionary multiobjective conceptual clustering (EMO-CC), relying on the NSGA-II multiobjective (MO) genetic algorithm. We apply this methodology to identify conceptual models in structural databases generated from gene ontologies. These models can explain and predict phenotypes in the immunoinflammatory response problem, similar to those provided by gene expression or other genetic markers. The analysis of these results reveals that our approach uncovers cohesive clusters, even those comprising a small number of observations explained by several features, which allows describing objects and their interactions from different perspectives and at different levels of detail.
    Funding: Ministerio de Ciencia y Tecnología TIC-2003-00877; Ministerio de Ciencia y Tecnología BIO2004-0270E; Ministerio de Ciencia y Tecnología TIN2006-1287
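    EMO-CC relies on NSGA-II, whose core is the Pareto-dominance relation between objective vectors. The minimal sketch below shows only that relation and the extraction of the first non-dominated front; NSGA-II additionally ranks later fronts, applies crowding-distance sorting, and runs the usual genetic operators.

        def dominates(a, b):
            # a Pareto-dominates b (minimisation): no worse on every
            # objective and strictly better on at least one.
            return (all(x <= y for x, y in zip(a, b))
                    and any(x < y for x, y in zip(a, b)))

        def first_front(objectives):
            # Return the non-dominated subset of a list of objective vectors.
            return [p for p in objectives
                    if not any(dominates(q, p) for q in objectives if q is not p)]

    Keeping the whole first front, rather than a single best solution, is what lets the methodology report clusters that trade off the competing objectives in different ways.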

    Semantic Similarity of Spatial Scenes

    The formalization of similarity in spatial information systems can unleash their functionality and contribute technology that is not only useful but also desirable to broad groups of users. As a paradigm for information retrieval, similarity supersedes tedious querying techniques and unveils novel ways for user-system interaction by naturally supporting modalities such as speech and sketching. As a tool within the scope of a broader objective, it can facilitate such diverse tasks as data integration, landmark determination, and prediction making. This potential motivated the development of several similarity models within the geospatial and computer science communities. Despite the merit of these studies, their cognitive plausibility can be limited due to neglect of well-established psychological principles about the properties and behaviors of similarity. Moreover, such approaches are typically guided by experience, intuition, and observation, thereby often relying on narrower perspectives or restrictive assumptions that produce inflexible and incompatible measures. This thesis consolidates such fragmentary efforts and integrates them, along with novel formalisms, into a scalable, comprehensive, and cognitively-sensitive framework for similarity queries in spatial information systems. Three conceptually different similarity queries at the levels of attributes, objects, and scenes are distinguished. An analysis of the relationship between similarity and change provides a unifying basis for the approach and a theoretical foundation for measures satisfying important similarity properties such as asymmetry and context dependence. The classification of attributes into categories with common structural and cognitive characteristics drives the implementation of a small core of generic functions, able to perform any type of attribute value assessment. Appropriate techniques combine such atomic assessments to compute similarities at the object level and to handle more complex inquiries with multiple constraints. These techniques, along with a solid graph-theoretical methodology adapted to the particularities of the geospatial domain, provide the foundation for reasoning about scene similarity queries. Provisions are made so that all methods comply with major psychological findings about people's perceptions of similarity. An experimental evaluation supplies the main result of this thesis, which separates psychological findings with a major impact on the results from those that can be safely incorporated into the framework through computationally simpler alternatives.
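    One classic formalization of the asymmetry mentioned above is Tversky's ratio model over feature sets. The sketch below illustrates that psychological finding; it is not the thesis's actual measure, and the weights alpha and beta are assumed values.

        def tversky_similarity(a, b, alpha=0.8, beta=0.2):
            # Ratio-model similarity over feature sets a and b; with
            # alpha != beta the measure is asymmetric, so
            # sim(a, b) can differ from sim(b, a), matching the
            # asymmetry observed in human similarity judgements.
            common = len(a & b)
            denom = common + alpha * len(a - b) + beta * len(b - a)
            return common / denom if denom else 0.0

    For example, features distinctive to the subject of the comparison weigh more heavily than those distinctive to the referent, which is why people judge "a is like b" differently from "b is like a".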

    Memetic algorithms for ontology alignment

    2011 - 2012
    Semantic interoperability represents the capability of two or more systems to meaningfully and accurately interpret exchanged data so as to produce useful results. It is an essential feature of all distributed and open knowledge-based systems designed for both e-government and private businesses, since it enables machine interpretation, inferencing and computable logic. Unfortunately, achieving semantic interoperability is very difficult because it requires the meaning of any data to be specified in appropriate detail in order to resolve any potential ambiguity. Currently, the technology best recognized for achieving such a level of precision in the specification of meaning is ontologies. According to the most frequently referenced definition [1], an ontology is an explicit specification of a conceptualization, i.e., the formal specification of the objects, concepts, and other entities that are presumed to exist in some area of interest and the relationships that hold among them [2]. However, different tasks or different points of view lead ontology designers to produce different conceptualizations of the same domain of interest. This means that the subjectivity of ontology modeling results in the creation of heterogeneous ontologies characterized by terminological and conceptual discrepancies. Examples of these discrepancies are the use of different words to name the same concept, the use of the same word to name different concepts, the creation of hierarchies for a specific domain region with different levels of detail, and so on. The arising so-called semantic heterogeneity problem represents, in turn, an obstacle to achieving semantic interoperability... [edited by author]