
    A Methodology for integration of heterogeneous databases

    Reprint. Reprinted from IEEE Transactions on Knowledge and Data Engineering, Vol. 6, no. 6 (Dec. 1994). Includes bibliographical references (p. 932). Supported by the Productivity From Information Technology (PROFIT) Research Initiative at MIT. M.P. Reddy ... [et al.

    Fuzzy association rules for biological data analysis: A case study on yeast

    Background: The mapping of diverse genomes in recent years has generated huge amounts of biological data, which are currently dispersed across many databases. Integrating the information available in these databases is required to unveil possible associations among already known data. Biological data are often imprecise and noisy. Fuzzy set theory is especially suitable for modeling imprecise data, while association rules are well suited to integrating heterogeneous data. Results: In this work we propose a novel fuzzy methodology, based on a fuzzy association rule mining method, for biological knowledge extraction. We apply this methodology to a yeast genome dataset containing heterogeneous information on structural and functional genome features. A number of association rules have been found, many of them agreeing with previous research in the area. In addition, a comparison between crisp and fuzzy results shows the fuzzy associations to be more reliable than the crisp ones. Conclusion: An integrative approach such as the one carried out in this work can unveil significant knowledge that is currently hidden and dispersed across the existing biological databases. It is shown that fuzzy association rules can model this knowledge in an intuitive way by using linguistic labels and a few easily understandable parameters.
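
The abstract above does not spell out how a fuzzy association rule is scored, so the following is only a minimal sketch of one common formulation, not the authors' method. It assumes linear "ramp" membership functions, the minimum as the fuzzy AND, and support averaged over records; the attribute names `gene_length` and `expression` and all values are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# An item pairs an attribute with a linguistic label, e.g. ("gene_length", "high").
Item = Tuple[str, str]
Membership = Callable[[float], float]


def ramp_up(lo: float, hi: float) -> Membership:
    """Membership that rises linearly from 0 at `lo` to 1 at `hi` (assumed shape)."""
    def mu(x: float) -> float:
        if x <= lo:
            return 0.0
        if x >= hi:
            return 1.0
        return (x - lo) / (hi - lo)
    return mu


def fuzzy_support(records: Sequence[Dict[str, float]],
                  items: Sequence[Item],
                  memberships: Dict[Item, Membership]) -> float:
    """Average, over all records, of the minimum membership degree across `items`."""
    total = 0.0
    for record in records:
        # The minimum acts as the fuzzy AND of all items in the itemset.
        total += min(memberships[item](record[item[0]]) for item in items)
    return total / len(records)


def fuzzy_confidence(records: Sequence[Dict[str, float]],
                     antecedent: Sequence[Item],
                     consequent: Sequence[Item],
                     memberships: Dict[Item, Membership]) -> float:
    """support(antecedent and consequent) / support(antecedent); 0 if undefined."""
    support_a = fuzzy_support(records, antecedent, memberships)
    if support_a == 0.0:
        return 0.0
    both: List[Item] = list(antecedent) + list(consequent)
    return fuzzy_support(records, both, memberships) / support_a


# Hypothetical toy data: three genes with two numeric features each.
records = [
    {"gene_length": 0.0, "expression": 0.0},
    {"gene_length": 5.0, "expression": 2.0},
    {"gene_length": 10.0, "expression": 10.0},
]
memberships = {
    ("gene_length", "high"): ramp_up(0.0, 10.0),
    ("expression", "high"): ramp_up(0.0, 10.0),
}
rule_confidence = fuzzy_confidence(
    records, [("gene_length", "high")], [("expression", "high")], memberships
)
```

Unlike a crisp rule, which would need a hard cutoff for "high", each record here contributes to the rule in proportion to its degree of membership.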

    Integrating data from heterogeneous DNA microarray platforms

    DNA microarrays are one of the most used technologies for gene expression measurement. However, there are several distinct microarray platforms, from different manufacturers, each with its own measurement protocol, resulting in data that can hardly be compared or directly integrated. Data integration from multiple sources aims to improve the assertiveness of statistical tests, reducing the data dimensionality problem. The integration of heterogeneous DNA microarray platforms comprises a set of tasks that range from the re-annotation of the features used in gene expression to data normalization and batch effect elimination. In this work, a complete methodology for gene expression data integration and application is proposed, which comprises a transcript-based re-annotation process and several methods for batch effect attenuation. The integrated data will be used to select the best feature set and learning algorithm for a brain tumor classification case study. The integration will consider data from heterogeneous Agilent and Affymetrix platforms, collected from public gene expression databases such as The Cancer Genome Atlas and Gene Expression Omnibus. The authors thank the FCT Strategic Project of UID/BIO/04469/2013 unit, the project RECI/BBBEBI/0179/2012 (FCOMP-01-0124-FEDER-027462) and the project “BioInd - Biotechnology and Bioengineering for improved Industrial and Agro-Food processes”, REF. NORTE-07-0124FEDER-000028, co-funded by the Programa Operacional Regional do Norte (ON.2 O Novo Norte), QREN, FEDER.
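
The abstract mentions "several methods for batch effect attenuation" without naming them. As an illustration only, the sketch below shows one simple and widely used option, per-batch mean-centering of each gene; it is an assumption that this resembles any of the methods the authors evaluated. The function name `center_by_batch`, the platform labels, and the values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def center_by_batch(expression: Sequence[Sequence[float]],
                    batches: Sequence[str]) -> List[List[float]]:
    """Subtract, for every gene, the mean of that gene within each batch.

    expression[s][g] is the (already normalized) value of gene g in sample s;
    batches[s] is the batch or platform label of sample s.
    """
    if len(expression) != len(batches):
        raise ValueError("one batch label is required per sample")

    # Group sample indices by batch label.
    samples_in_batch: Dict[str, List[int]] = defaultdict(list)
    for index, label in enumerate(batches):
        samples_in_batch[label].append(index)

    n_genes = len(expression[0]) if expression else 0
    centered = [list(row) for row in expression]  # copy; input stays untouched

    for indices in samples_in_batch.values():
        for gene in range(n_genes):
            batch_mean = sum(expression[i][gene] for i in indices) / len(indices)
            for i in indices:
                centered[i][gene] = expression[i][gene] - batch_mean
    return centered


# Hypothetical toy matrix: 4 samples x 2 genes from two platforms.
expression = [
    [1.0, 10.0],
    [3.0, 14.0],
    [101.0, 50.0],
    [103.0, 54.0],
]
batches = ["agilent", "agilent", "affymetrix", "affymetrix"]
centered = center_by_batch(expression, batches)
```

Mean-centering removes a constant per-platform offset, but it also removes any real biological difference that happens to coincide with the batch, so it is only appropriate when the classes of interest are reasonably balanced across platforms.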

    Creating an Intelligent System for Bankruptcy Detection: Semantic data Analysis Integrating Graph Database and Financial Ontology

    In this paper, we propose a novel intelligent methodology for constructing a Bankruptcy Prediction Computation Model, which aims to analyze a company’s financial status accurately. Based on semantic data analysis and management, our methodology places the Semantic Database System at the core of the system. It comprises three layers: an Ontology of Bankruptcy Prediction, a Semantic Search Engine, and a Semantic Analysis Graph Database.

    Heterogeneous data source integration for smart grid ecosystems based on metadata mining

    The arrival of new technologies related to smart grids and the resulting ecosystem of applications and management systems pose many new problems. The databases of the traditional grid and the various initiatives related to new technologies have given rise to many different management systems with several formats and different architectures. A heterogeneous data source integration system is necessary to update these systems for the new smart grid reality. Additionally, it is necessary to take advantage of the information smart grids provide. In this paper, the authors propose a heterogeneous data source integration based on IEC standards and metadata mining. Additionally, an automatic data mining framework is applied to model the integrated information. Ministerio de Economía y Competitividad TEC2013-40767-

    Ontology-assisted database integration to support natural language processing and biomedical data-mining

    Successful biomedical data mining and information extraction require a complete picture of biological phenomena such as genes, biological processes, and diseases, as these exist on different levels of granularity. To realize this goal, several freely available heterogeneous databases as well as proprietary structured datasets have to be integrated into a single global customizable scheme. We present a tool that integrates different biological data sources by mapping them to a proprietary biomedical ontology developed to make computers understand medical natural language.