7 research outputs found

    Big Data Redux: New Issues and Challenges Moving Forward

    As of this writing, our HICSS-46 proceedings article has received over 520 Google Scholar citations. We have published several HICSS proceedings papers, articles, and a book on this subject, but none of them has generated this level of interest. To update our findings six years later, and to understand what is driving this interest, we downloaded the first 500 citations to our article and the corresponding citing articles, when available. We conducted an in-depth literature review of the articles published in top journals and leading conference proceedings, along with articles with a high volume of citations. This paper provides a brief summary of the key concepts in our original paper, reports on the key aspects of interest we found in our review, and updates our original paper with new directions for future practice and research in big data and analytics.

    Internship Report on data merging at the Bank of Portugal: Internship Experience at the Bank of Portugal: A Comprehensive Dive into Full Stack Development - Leveraging Modern Technology to Innovate Financial Infrastructure and Enhance User Experience

    Internship Report presented as a partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science. This report details my full-stack development internship at the Bank of Portugal, with particular emphasis on the creation of a website intended to increase operational effectiveness in the DAS Department. My main contributions met a clear need: the absence of a reliable platform that could manage and combine data from many sources. I was actively involved in building functionality for the applications Integrator and BAII using Django, a high-level Python web framework. The features I designed and programmed addressed several problems, including daily data extraction from several SQL databases, entity error detection, data merging, and user-friendly interfaces for data manipulation. A feature that enables the attribution of litigation to certain entities was also developed. The developed features have proven useful, giving the Institutional Intervention Area, the Sanctioning Action Area, the Illicit Financial Activity Investigation Area, and the Money Laundering Preventive Supervision Area for Capital and Financing of Terrorism tools to carry out their duties more effectively. This internship experience has contributed to the advancement and use of full-stack development approaches in the banking industry, notably in data management and web application development.

    A System for Converting and Recovering Texts Managed as Structured Information

    This paper introduces a system that incorporates several strategies based on scientific models of how the brain records and recovers memories. Methodologically, an incremental prototyping approach has been applied to develop a satisfactory architecture that can be adapted to any language. A special case is studied and tested for the Spanish language. The applications of this proposal are vast because, in general, information such as texts, reports, emails, and web content, among others, is considered unstructured and, hence, repositories based on SQL databases usually do not handle this kind of data correctly and efficiently. Converting unstructured textual information into a structured form can be useful in contexts such as Natural Language Generation, Data Mining, and the dynamic generation of theories, among others.

    Linkage scenarios of relational databases and ontologies: a systematic mapping

    Relational databases are one of the most widely used data sources. However, as a storage source, they present a number of shortcomings: in particular, it is complex to store semantic knowledge in relational databases. To address these deficiencies in knowledge representation, one trend has been to use ontologies. Ontologies possess richer semantics and are closer to the end user's vocabulary than relational database schemas. The objective of the present research was to carry out a systematic mapping of the scenarios where relational databases and ontologies are linked to provide better integration, querying, and visualization of stored data. The mapping was carried out by applying a methodological proposal established in the literature. The research found that the mapping of relational databases to ontologies and the use of ontologies for the integration of heterogeneous data sources were the most common scenarios. Likewise, trends and challenges were identified in each scenario, which might deserve further research efforts in the future.

    Knowledge hypergraph-based approach for multi-source data integration and querying: Application for Earth Observation domain

    Early warning against natural disasters, to save lives and reduce damage, has drawn increasing interest in developing systems that observe, monitor, and assess changes in the environment. In recent years, numerous environmental monitoring systems and Earth Observation (EO) programs have been implemented. However, these systems generate large amounts of EO data while using different vocabularies and different conceptual schemas. Consequently, data reside in many siloed systems and remain largely untapped for integrated operations, insights, and decision making. To overcome the insufficient exploitation of EO data, a data integration system is crucial to break down data silos and create a common information space where data are semantically linked. Within this context, we propose a semantic data integration and querying approach, which aims to semantically integrate EO data and provide enhanced query processing in terms of accuracy, completeness, and semantic richness of response. To do so, we defined three main objectives. The first objective is to capture the knowledge of the environmental monitoring domain. To this end, we propose MEMOn, a domain ontology that provides a common vocabulary of the environmental monitoring domain in order to support the semantic interoperability of heterogeneous EO data. While creating MEMOn, we adopted a development methodology based on three fundamental principles. First, we used a modularization approach. The idea is to create separate modules, one for each context of the environment domain, in order to ensure the clarity of the global ontology’s structure and guarantee the reusability of each module separately. Second, we used the upper-level ontology Basic Formal Ontology and the mid-level Common Core ontologies to facilitate the integration of the ontological modules into the global one. Third, we reused existing domain ontologies such as ENVO and SSN to avoid creating the ontology from scratch; this can also improve its quality, since the reused components have already been evaluated. MEMOn is then evaluated using real use case studies, according to the Sahara and Sahel Observatory experts’ requirements. The second objective of this work is to break down the data silos and provide a common environmental information space. Accordingly, we propose a knowledge hypergraph-based data integration approach to provide experts and software agents with a virtual, integrated, and linked view of data. This approach generates RML mappings between the developed ontology and metadata and then creates a knowledge hypergraph that semantically links these mappings to identify more complex relationships across data sources. One of the strengths of the proposed approach is that it goes beyond combining data retrieved from multiple independent sources and allows virtual data integration in a highly semantic and expressive way, using hypergraphs. The third objective of this thesis concerns the enhancement of query processing in terms of accuracy, completeness, and semantic richness of response, in order to make the returned results more relevant and richer in terms of relationships. Accordingly, we propose knowledge hypergraph-based query processing that improves the selection of sources contributing to the final result of an input query. Indeed, the proposed approach moves beyond the discovery of simple one-to-one equivalence matches and relies on the identification of more complex relationships across data sources by referring to the knowledge hypergraph. This enhancement significantly increases the completeness and semantic richness of answers. The proposed approach was implemented in an open-source tool and has proved its effectiveness through a real use case in the environmental monitoring domain.

    Proceedings of the 10th International Conference on Ecological Informatics: translating ecological data into knowledge and decisions in a rapidly changing world: ICEI 2018

    The Conference Proceedings are an impressive display of the current scope of Ecological Informatics. Whilst Data Management, Analysis, Synthesis and Forecasting have been enduringly popular themes over the past nine biannual ICEI conferences, ICEI 2018 addresses distinctively novel developments in Data Acquisition enabled by cutting-edge in situ and remote sensing technology. The ICEI 2018 abstracts presented here capture well the current trends and challenges of Ecological Informatics towards: • regional, continental and global sharing of ecological data, • thorough integration of complementary monitoring technologies including DNA-barcoding, • sophisticated pattern recognition by deep learning, • advanced exploration of valuable information in ‘big data’ by means of machine learning and process modelling, • decision-informing solutions for biodiversity conservation and sustainable ecosystem management in light of global changes.
