
    Entity reconciliation in big data sources: A systematic mapping study

    The entity reconciliation (ER) problem has aroused much interest as a research topic in today's Big Data era, full of large, open, heterogeneous data sources. The problem arises when relevant information on a topic must be obtained using methods based on: (i) identifying records that represent the same real-world entity, and (ii) identifying records that are similar but do not correspond to the same real-world entity. ER is an operational intelligence process whereby organizations can unify different, heterogeneous data sources in order to relate possible matches of non-obvious entities. Besides the complexity introduced by the heterogeneity of the data sources, the large number of records and the differences among languages, for instance, must also be handled. This paper describes a Systematic Mapping Study (SMS) of journal articles, conference papers and workshop papers published from 2010 to 2017 that address this problem, first to understand the state of the art and then to identify gaps in current research. Eleven digital libraries were analyzed following a systematic, semi-automatic and rigorous process that resulted in 61 primary studies, representing a great variety of intelligent proposals aiming to solve ER. The conclusion is that most of the research addresses the operational phase rather than the design phase, and that most studies have been tested on real-world data sources, many of them heterogeneous, although only a few have been applied in industry. There is a clear trend towards techniques based on clustering/blocking and graphs, while the level of automation of the proposals is hardly ever mentioned.
    Funding: Ministerio de Economía y Competitividad TIN2013-46928-C3-3-R, TIN2016-76956-C3-2-R, TIN2015-71938-RED
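    The mapping identifies clustering/blocking as the dominant technique family. As a rough, illustrative sketch of why blocking matters for ER at scale (the record fields, blocking key, similarity measure and threshold below are assumptions, not taken from any primary study), candidate pairs are only scored inside cheap pre-computed blocks:

```python
# Illustrative blocking-plus-matching sketch for entity reconciliation.
# All fields, the blocking key and the 0.6 threshold are assumed for the
# example; real systems tune these per data source.
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

records = [
    {"id": 1, "name": "Acme Corp.", "city": "Seville"},
    {"id": 2, "name": "ACME Corporation", "city": "Seville"},
    {"id": 3, "name": "Beta Ltd.", "city": "Madrid"},
]

def blocking_key(record):
    # Cheap key: only records sharing it are ever compared.
    return (record["name"][:1].lower(), record["city"].lower())

def similarity(a, b):
    # String similarity on the name field, in [0, 1].
    return SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()

blocks = defaultdict(list)
for rec in records:
    blocks[blocking_key(rec)].append(rec)

matches = [
    (a["id"], b["id"])
    for block in blocks.values()
    for a, b in combinations(block, 2)
    if similarity(a, b) >= 0.6
]
print(matches)  # [(1, 2)]
```

    Blocking prunes the candidate space: here only one pair is scored instead of all three possible pairs, and on sources with millions of records this pruning is what keeps pairwise matching tractable.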

    An ontology matching approach for semantic modeling: A case study in smart cities

    This paper investigates the semantic modeling of smart cities and proposes two ontology matching frameworks, called Clustering for Ontology Matching-based Instances (COMI) and Pattern mining for Ontology Matching-based Instances (POMI). The goal is to discover relevant knowledge by investigating the correlations among smart-city data using clustering and pattern mining approaches. The COMI method first groups the highly correlated ontologies of smart-city data into similar clusters using the generic k-means algorithm. The key idea is that it clusters the instances of each ontology and then matches two ontologies by matching their clusters and the corresponding instances within the clusters. The POMI method studies the correlations among the data properties and selects the most relevant properties for the ontology matching process. To demonstrate the usefulness and accuracy of COMI and POMI, experiments were conducted on the DBpedia, Ontology Alignment Evaluation Initiative, and NOAA ontology databases. The results show that COMI and POMI outperform state-of-the-art ontology matching models in computational cost without losing quality during the matching process. Furthermore, these results confirm the ability of COMI and POMI to deal with heterogeneous, large-scale data in smart-city environments.
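    As a loose sketch of the cluster-then-match idea the abstract describes for COMI (cluster each ontology's instances with k-means, then match at the cluster level before the instance level), the snippet below uses synthetic embeddings; the feature vectors, the value of k, and the nearest-centroid pairing rule are illustrative assumptions, not COMI's actual procedure:

```python
# Cluster-then-match sketch in the spirit of COMI, on synthetic data.
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
instances_a = rng.normal(size=(100, 8))  # stand-in embeddings, ontology A
instances_b = rng.normal(size=(120, 8))  # stand-in embeddings, ontology B

k = 5  # number of clusters, chosen arbitrarily here
km_a = KMeans(n_clusters=k, n_init=10, random_state=0).fit(instances_a)
km_b = KMeans(n_clusters=k, n_init=10, random_state=0).fit(instances_b)

# Pair each cluster in A with the closest cluster centroid in B.
cluster_pairs = cdist(km_a.cluster_centers_, km_b.cluster_centers_).argmin(axis=1)

# Only compare instances inside paired clusters, never across all pairs.
for ca, cb in enumerate(cluster_pairs):
    members_a = instances_a[km_a.labels_ == ca]
    members_b = instances_b[km_b.labels_ == cb]
    if len(members_a) and len(members_b):
        nearest = cdist(members_a, members_b).argmin(axis=1)
        # nearest[i] is the candidate match in B for instance i of cluster ca.
```

    The payoff mirrors blocking: instance comparisons drop from |A| x |B| to a sum over paired clusters, which is what makes the approach viable on large-scale smart-city data.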

    A model-driven engineering approach for the uniquely identity reconciliation of heterogeneous data sources

    The objectives to be achieved with this Doctoral Thesis are:
    1. Perform a study of the state of the art of the existing solutions for the entity reconciliation of heterogeneous data sources, checking whether they are being used in real environments.
    2. Define and develop a framework for designing entity reconciliation models in a systematic way for the requirements, analysis and testing phases of a software methodology. This objective is divided into three sub-objectives:
       a. Define a set of activities, represented as a process, that can be added to any software development methodology to carry out the activities related to entity reconciliation in the requirements, analysis and testing phases of any software development life cycle.
       b. Define a metamodel that represents an abstract view of the model-based approach.
       c. Define a set of derivation mechanisms that establish the basis for automating the testing of solutions built with the proposed framework. Since the process is applied in the early stages of development, this proposal supports Early Testing.
    3. Provide a support tool for the framework. The tool allows a software engineer to define the analysis model of an entity reconciliation problem between different, heterogeneous data sources, and is realized as a Domain Specific Language (DSL).
    4. Evaluate the results obtained from applying the proposal in a real-world case study.
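    The thesis realizes its modeling support as a DSL. Purely as a hypothetical illustration of what an entity reconciliation analysis model might capture (none of these class or field names come from the thesis), a minimal Python sketch:

```python
# Hypothetical sketch of an entity-reconciliation analysis model; the
# thesis defines a metamodel and DSL whose actual concepts may differ.
from dataclasses import dataclass, field

@dataclass
class DataSource:
    name: str
    fields: list[str]

@dataclass
class MatchRule:
    left_field: str
    right_field: str
    comparator: str      # e.g. "exact" or "levenshtein"; assumed vocabulary
    threshold: float = 1.0

@dataclass
class ReconciliationModel:
    left: DataSource
    right: DataSource
    rules: list[MatchRule] = field(default_factory=list)

# Example model: reconcile customers across two hypothetical sources.
model = ReconciliationModel(
    left=DataSource("crm", ["full_name", "email"]),
    right=DataSource("billing", ["customer_name", "mail"]),
    rules=[MatchRule("full_name", "customer_name", "levenshtein", 0.85)],
)
```

    A declarative structure of this kind is what derivation mechanisms can work from: test cases can be generated from the model rather than written by hand, in line with the Early Testing goal above.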

    Mixing Methods: Practical Insights from the Humanities in the Digital Age

    The digital transformation is accompanied by two simultaneous processes: the digital humanities are challenging the humanities, their theories, methodologies and disciplinary identities, while pushing computer science to get involved in new fields. But how can qualitative and quantitative methods be usefully combined in one research project? What are the theoretical and methodological principles shared across disciplinary digital approaches? This volume focusses on driving innovation and conceptualising the humanities in the 21st century. Building on the results of 10 research projects, it serves as a useful tool for designing cutting-edge research that goes beyond conventional strategies.

    Computer Vision and Architectural History at Eye Level: Mixed Methods for Linking Research in the Humanities and in Information Technology

    Information on the history of architecture is embedded in our daily surroundings: in vernacular and heritage buildings and in physical objects, photographs and plans. Historians study these tangible and intangible artefacts and the communities that built and used them. Valuable insights are thus gained into the past and the present, and they also provide a foundation for designing the future. Given that our understanding of the past is limited by the inadequate availability of data, the article demonstrates that advanced computer tools can help gain more, well-linked data from the past. Computer vision can make a decisive contribution to the identification of image content in historical photographs. This application is particularly interesting for architectural history, where visual sources play an essential role in understanding the built environment of the past, yet a lack of reliable metadata often hinders the use of such materials. Automated recognition contributes to making a variety of image sources usable for research.
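    As a minimal sketch of the kind of automated image-content recognition described above, the snippet below runs an off-the-shelf ImageNet classifier over a photograph; the model choice, stack (PyTorch + torchvision) and file name are assumptions, since the article does not specify its pipeline:

```python
# Off-the-shelf image classification sketch; ImageNet labels only loosely
# cover architectural content, so treat the output as a starting point.
import torch
from PIL import Image
from torchvision.models import ResNet50_Weights, resnet50

weights = ResNet50_Weights.DEFAULT
model = resnet50(weights=weights).eval()
preprocess = weights.transforms()

# "historical_photo.jpg" is a hypothetical input file.
image = Image.open("historical_photo.jpg").convert("RGB")
batch = preprocess(image).unsqueeze(0)

with torch.no_grad():
    scores = model(batch).softmax(dim=1)

top = scores.topk(5)
labels = weights.meta["categories"]
for score, idx in zip(top.values[0], top.indices[0]):
    print(f"{labels[idx]}: {score:.2f}")
```

    For architectural history one would typically go further, fine-tuning on domain labels such as building types or facade elements, precisely because generic categories and missing metadata limit what historians can retrieve.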
