219,264 research outputs found

    From patterned response dependency to structured covariate dependency: categorical-pattern-matching

    Data generated from a system of interest typically consists of measurements from an ensemble of subjects across multiple response and covariate features, and is naturally represented by one response-matrix against one covariate-matrix. Likely each of these two matrices simultaneously embraces heterogeneous data types: continuous, discrete and categorical. Here a matrix is used as a practical platform to ideally keep hidden dependency among/between subjects and features intact on its lattice. Response and covariate dependency is individually computed and expressed through multiscale blocks via a newly developed computing paradigm named Data Mechanics. We propose a categorical pattern matching approach to establish causal linkages in the form of information flows from patterned response dependency to structured covariate dependency. The strength of an information flow is evaluated by applying combinatorial information theory. This unified platform for system knowledge discovery is illustrated through five data sets. In each illustrative case, an information flow is demonstrated as an organization of discovered knowledge loci via emergent visible and readable heterogeneity. This unified approach fundamentally resolves many long-standing issues in data analysis, including statistical modeling, multiple response, renormalization and feature selection, without involving man-made structures or distribution assumptions. The results reported here enhance the idea that linking patterns of response dependency to structures of covariate dependency is the true philosophical foundation underlying data-driven computing and learning in sciences. Comment: 32 pages, 10 figures, 3 box picture

    Towards A Novel Unified Framework for Developing Formal, Network and Validated Agent-Based Simulation Models of Complex Adaptive Systems

    Literature on the modeling and simulation of complex adaptive systems (cas) has primarily advanced vertically in different scientific domains with scientists developing a variety of domain-specific approaches and applications. However, while cas researchers are inherently interested in an interdisciplinary comparison of models, to the best of our knowledge, there is currently no single unified framework for facilitating the development, comparison, communication and validation of models across different scientific domains. In this thesis, we propose first steps towards such a unified framework using a combination of agent-based and complex network-based modeling approaches and guidelines formulated as a set of four levels of usage, which allow multidisciplinary researchers to adopt a suitable framework level on the basis of available data types, their research study objectives and expected outcomes, thus allowing them to better plan and conduct their respective research case studies. Firstly, the complex network modeling level of the proposed framework entails the development of appropriate complex network models for the case where interaction data of cas components is available, with the aim of detecting emergent patterns in the cas under study. Secondly, the exploratory agent-based modeling level allows for the development of proof-of-concept models for the cas, primarily for purposes of exploring the feasibility of further research. Thirdly, the descriptive agent-based modeling level allows for the use of a formal step-by-step approach for developing agent-based models coupled with a quantitative complex network and pseudocode-based specification of the model, which will, in turn, facilitate interdisciplinary cas model comparison and knowledge transfer. Finally, the validated agent-based modeling level is concerned with building in-simulation verification and validation of agent-based models using a proposed Virtual Overlay Multiagent System approach for use in a systematic team-oriented approach to developing models. The proposed framework is evaluated and validated using seven detailed case study examples selected from various scientific domains including ecology, social sciences and a range of complex adaptive communication networks. The successful case studies demonstrate the potential of the framework in appealing to multidisciplinary researchers as a methodological approach to the modeling and simulation of cas by facilitating effective communication and knowledge transfer across scientific disciplines without the requirement of extensive learning curves.

    Applying the UML and the Unified Process to the Design of Data Warehouses

    The design, development and deployment of a data warehouse (DW) is a complex, time-consuming and failure-prone task. This is mainly due to the different aspects taking part in a DW architecture such as data sources, processes responsible for Extracting, Transforming and Loading (ETL) data into the DW, the modeling of the DW itself, specifying data marts from the data warehouse or designing end user tools. In recent years, different models, methods and techniques have been proposed to provide partial solutions to cover the different aspects of a data warehouse. Nevertheless, none of these proposals addresses the whole development process of a data warehouse in an integrated and coherent manner providing the same notation for the modeling of the different parts of a DW. In this paper, we propose a data warehouse development method, based on the Unified Modeling Language (UML) and the Unified Process (UP), which addresses the design and development of both the data warehouse back-stage and front-end. We use the extension mechanisms (stereotypes, tagged values and constraints) provided by the UML and we properly extend it in order to accurately model the different parts of a data warehouse (such as the modeling of the data sources, ETL processes or the modeling of the DW itself) by using the same notation. To the best of our knowledge, our proposal provides a seamless method for developing data warehouses. Finally, we apply our approach to a case study to show its benefit. This work has been partially supported by the METASIGN project (TIN2004-00779) from the Spanish Ministry of Education and Science, by the DADASMECA project (GV05/220) from the Valencia Government, and by the DADS (PBC-05-012-2) project from the Regional Science and Technology Ministry of Castilla-La Mancha (Spain).

    Preference Dissemination by Sharing Viewpoints: Simulating Serendipity

    IC3K 2015 will be held in conjunction with IJCCI 2015. International audience. The Web currently stores two types of content: linked data from the semantic Web and user contributions from the social Web. Our aim is to represent simplified aspects of these contents within a unified topological model and to harvest the benefits of integrating both content types in order to prompt collective learning and knowledge discovery. In particular, we wish to capture the phenomenon of Serendipity (i.e., incidental learning) using a subjective knowledge representation formalism, in which several "viewpoints" are individually interpretable from a knowledge graph. We validate our Viewpoints approach by demonstrating the collective learning capacity it enables. To that effect, we build a simulation that disseminates knowledge with linked data and user contributions, similar to the way the Web is formed. Using a behavioral model configured to represent various Web navigation strategies, we seek to optimize the distribution of preference systems. Our results outline the most appropriate strategies for incidental learning, bringing us closer to understanding and modeling the processes involved in Serendipity. An implementation of the Viewpoints formalism kernel is available. The underlying Viewpoints model allows us to abstract and generalize our current proof of concept for the indexing of any type of data set.

    Exploiting data semantics to discover, extract, and model web sources

    We describe DEIMOS, a system that automatically discovers and models new sources of information. The system exploits four core technologies developed by our group that make an end-to-end solution to this problem possible. First, given an example source, DEIMOS finds other similar sources online. Second, it invokes and extracts data from these sources. Third, given the syntactic structure of a source, DEIMOS maps its inputs and outputs to semantic types. Finally, it infers the source’s semantic definition, i.e., the function that maps the inputs to the outputs. DEIMOS is able to successfully automate these steps by exploiting a combination of background knowledge and data semantics. We describe the challenges in integrating separate components into a unified approach to discovering, extracting and modeling new online sources. We provide an end-to-end validation of the system in two information domains to show that it can successfully discover and model new data sources in those domains.

    Design of a Multidimensional Model Using Object Oriented Features in UML

    A data warehouse is a single repository of data which includes data generated from various operational systems. Conceptual modeling is an important concept in the successful design of a data warehouse. The Unified Modeling Language (UML) has become a standard for object modeling during the analysis and design steps of software system development. The paper proposes an object-oriented approach to model the process of data warehouse design. The hierarchies of each data element can be explicitly defined, thus highlighting the data granularity. We propose a UML multidimensional model using various data sources based on UML schemas. We present a conceptual-level integration framework for diverse UML data sources on which OLAP operations can be performed. Our integration framework takes into account the benefits of UML (its concepts, relationships and extended features), which is closer to the real world and can model even complex problems easily and accurately. Two steps are involved in our integration framework. The first is to convert UML schemas into UML class diagrams. The second is to build a multidimensional model from the UML class diagrams. The paper focuses on the transformations used in the second step. We describe how to represent a multidimensional model using a UML star or snowflake diagram with the help of a case study. To the best of our knowledge, we are the first to represent a UML snowflake diagram that integrates heterogeneous UML data sources.

    Joint Causal Inference from Multiple Contexts

    The gold standard for discovering causal relations is by means of experimentation. Over the last decades, alternative methods have been proposed that can infer causal relations between variables from certain statistical patterns in purely observational data. We introduce Joint Causal Inference (JCI), a novel approach to causal discovery from multiple data sets from different contexts that elegantly unifies both approaches. JCI is a causal modeling framework rather than a specific algorithm, and it can be implemented using any causal discovery algorithm that can take into account certain background knowledge. JCI can deal with different types of interventions (e.g., perfect, imperfect, stochastic, etc.) in a unified fashion, and does not require knowledge of intervention targets or types in case of interventional data. We explain how several well-known causal discovery algorithms can be seen as addressing special cases of the JCI framework, and we also propose novel implementations that extend existing causal discovery methods for purely observational data to the JCI setting. We evaluate different JCI implementations on synthetic data and on flow cytometry protein expression data and conclude that JCI implementations can considerably outperform state-of-the-art causal discovery algorithms. Comment: Final version, as published by JML