128 research outputs found

    Towards exploratory hypothesis testing and analysis

    DOI: 10.1109/ICDE.2011.5767907. Proceedings - International Conference on Data Engineering, pp. 745-75

    Exploring and linking biomedical resources through multidimensional semantic spaces

    Background: The semantic integration of biomedical resources remains a challenging issue, yet it is required for effective information processing and data analysis. The availability of comprehensive knowledge resources such as biomedical ontologies and integrated thesauri greatly facilitates this integration effort by means of semantic annotation, which allows disparate data formats and contents to be expressed under a common semantic space. In this paper, we propose a multidimensional representation for such a semantic space, where dimensions correspond to the different perspectives in biomedical research (e.g., population, disease, anatomy and protein/genes). Results: This paper presents a novel method for building multidimensional semantic spaces from semantically annotated biomedical data collections. This method consists of two main processes: knowledge normalization and data normalization. The former arranges the concepts provided by a reference knowledge resource (e.g., biomedical ontologies and thesauri) into a set of hierarchical dimensions for analysis purposes. The latter reduces the annotation set associated with each collection item to a set of points of the multidimensional space. Additionally, we have developed a visual tool, called 3D-Browser, which implements OLAP-like operators over the generated multidimensional space. The method and the tool have been tested and evaluated in the context of the Health-e-Child (HeC) project. Automatic semantic annotation was applied to tag three collections of abstracts taken from PubMed, one for each target disease of the project, the Uniprot database, and the HeC patient record database. We adopted the UMLS Meta-thesaurus 2010AA as the reference knowledge resource. Conclusions: Current knowledge resources and semantic-aware technology make the integration of biomedical resources possible. This integration is performed through semantic annotation of the intended biomedical data resources. This paper shows how these annotations can be exploited for integration, exploration, and analysis tasks. Results over a real scenario demonstrate the viability and usefulness of the approach, as well as the quality of the generated multidimensional semantic spaces.

    SynopSys: Foundations for Multidimensional Graph Analytics

    The past few years have seen a tremendous increase in often irregularly structured data that can be represented most naturally and efficiently in the form of graphs. Making sense of incessantly growing graphs is not only a key requirement in applications like social media analysis or fraud detection but also a necessity in many traditional enterprise scenarios. Thus, a flexible approach for multidimensional analysis of graph data is needed. Whereas many existing technologies require up-front modelling of analytical scenarios and are difficult to adapt to changes, our approach allows for ad-hoc analytical queries of graph data. Extending our previous work on graph summarization, in this position paper we lay the foundation for large graph analytics to enable business intelligence on graph-structured data.

    Integration of business intelligence based on three-level ontology services

    Usually, business intelligence (BI) in real telecom enterprises is integrated by packaging data warehouse (DW), OLAP, data mining and reporting products from different vendors together. As a result, BI system users are left with a reporting system whose reports, data models, dimensions and measures are predefined by system designers. According to a survey, 85% of DW projects failed to meet their intended objectives. In this paper, we investigate how to integrate BI packages into an adaptive and flexible knowledge portal by constructing an internal link and communication channel from top-level business concepts to underlying enterprise information systems (EIS). An approach based on three-level ontology services is developed, which implements unified naming, directory and transport of ontology services, as well as ontology mapping and query parsing among the conceptual view, analytical view and physical view, from user interfaces through the DW to the EIS. Experiments on top of a real telecom EIS show that our solution for integrating BI supports operational decision making more powerfully, and in a more user-friendly and adaptive way, than simply combining presently available BI products. © 2004 IEEE

    Developing A New Decision Support System for University Student Recruitment

    This paper investigates the practical issues surrounding the development and implementation of Decision Support Systems (DSS). The paper describes the traditional development approaches, analyzes their drawbacks, and introduces a new DSS development methodology. The proposed DSS methodology is based upon four modules: needs analysis, data warehouse (DW), knowledge discovery in databases (KDD), and a DSS module. The proposed DSS methodology is applied to and evaluated using the admission and registration functions in Egyptian Universities. The paper investigates the organizational requirements needed to underpin these functions in Egyptian Universities. These requirements have been identified following an in-depth survey of the recruitment process in the Egyptian Universities. This survey employed a multi-part admission and registration DSS questionnaire (ARDSSQ) to identify the required data sources together with the likely users and their information needs. The questionnaire was sent to senior managers within the Egyptian Universities (both private and government) with responsibility for student recruitment, in particular admission and registration. Further, access to a large database has allowed the evaluation of the practical suitability of using a DW structure and knowledge management tools within the decision making framework. A total of 2000 records have been used to build and test the data mining techniques within the KDD process. The records were drawn from the Arab Academy for Science and Technology and Maritime Transport (AASTMT) students’ database (DB). Moreover, the paper has analyzed the key characteristics of DW and explored the advantages and disadvantages of such data structures. This evaluation has been used to build a DW for the Egyptian Universities that handles their admission and registration related archival data. The potential benefits of the DW for decision makers within the student recruitment process will be explored. The design of the proposed admission and registration DSS (ARDSS) will be developed and tested using Cool: Gen (5.0) CASE tools by Computer Associates (CA), connected to a MS-SQL Server (6.5), in a Windows NT (4.0) environment. Crystal Reports (4.6) by Seagate will be used as a report generation tool. CLUSTAN Graphics (5.0) by CLUSTAN software will also be used as a clustering package. The ARDSS software could be adjusted for usage in different countries for the same purpose; it is also scalable to handle new decision situations and can be integrated with other systems.

    Data Mining; A Conceptual Overview

    This tutorial provides an overview of the data mining process. The tutorial also provides a basic understanding of how to plan, evaluate and successfully refine a data mining project, particularly in terms of model building and model evaluation. Methodological considerations are discussed and illustrated. After explaining the nature of data mining and its importance in business, the tutorial describes the underlying machine learning and statistical techniques involved. It describes the CRISP-DM standard now being used in industry as the standard for a technology-neutral data mining process model. The paper concludes with a major illustration of the data mining process methodology and the unsolved problems that offer opportunities for research. The approach is both practical and conceptually sound in order to be useful to both academics and practitioners.

    A framework for automated association mining over multiple databases

    Literature on association mining, the data mining methodology that investigates associations between items, has primarily focused on efficiently mining larger databases. The motivation for association mining is to use the rules obtained from historical data to influence future transactions. However, associations in transactional processes change significantly over time, implying that rules extracted for a given time interval may not be applicable for a later time interval. Hence, an analysis framework is necessary to identify how associations change over time. This paper presents such a framework, reports the implementation of the framework as a tool, and demonstrates the applicability of and the necessity for the framework through a case study in the domain of finance.