
    Multi-Information Source Fusion and Optimization to Realize ICME: Application to Dual Phase Materials

    Integrated Computational Materials Engineering (ICME) calls for the integration of computational tools into the materials and parts development cycle, while the Materials Genome Initiative (MGI) calls for the acceleration of the materials development cycle through the combination of experiments, simulation, and data. As they stand, neither ICME nor MGI prescribes how to achieve the necessary tool integration or how to efficiently exploit the computational tools, in combination with experiments, to accelerate the development of new materials and materials systems. This paper addresses the first issue by putting forward a framework for the fusion of information that exploits correlations among sources/models and between the sources and 'ground truth'. The second issue is addressed through a multi-information source optimization framework that identifies, given current knowledge, the next best information source to query and where in the input space to query it, via a novel value-gradient policy. The querying decision takes into account the ability to learn correlations between information sources, the resource cost of querying an information source, and the improvement a query is expected to provide over the current state. The framework is demonstrated on the optimization of a dual-phase steel to maximize its strength-normalized strain hardening rate. The ground truth is represented by a microstructure-based finite element model, while three low-fidelity information sources (reduced-order models) based on different homogenization assumptions (isostrain, isostress, and isowork) are used to efficiently and optimally query the materials design space. Comment: 19 pages, 11 figures, 5 tables
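    The querying decision lends itself to a compact illustration. Below is a minimal sketch, not the paper's value-gradient policy: it uses a simpler proxy, expected improvement per unit query cost, under a Gaussian surrogate. All source names, costs, and surrogate predictions are hypothetical.

```python
# Cost-aware information-source selection, sketched with an
# expected-improvement-per-cost heuristic (an assumption; the paper's
# value-gradient policy also exploits learned source correlations).
import math

def expected_improvement(mu, sigma, best):
    """Standard EI for maximization under a Gaussian surrogate."""
    if sigma <= 0.0:
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf

# Hypothetical sources at one candidate design point: three cheap
# reduced-order models and one expensive finite element "ground truth".
sources = {
    "isostrain": {"cost": 1.0,   "mu": 0.82, "sigma": 0.10},
    "isostress": {"cost": 1.0,   "mu": 0.75, "sigma": 0.12},
    "isowork":   {"cost": 2.0,   "mu": 0.80, "sigma": 0.08},
    "fe_model":  {"cost": 100.0, "mu": 0.79, "sigma": 0.02},
}

best_observed = 0.78  # best objective value seen so far (hypothetical)

def next_query(sources, best):
    """Pick the source with the highest expected improvement per unit cost."""
    return max(
        sources.items(),
        key=lambda kv: expected_improvement(kv[1]["mu"], kv[1]["sigma"], best)
        / kv[1]["cost"],
    )

name, info = next_query(sources, best_observed)
print(f"query {name} next (cost {info['cost']})")
```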

    Use of an object-based system with reasoning capabilities to integrate relational databases

    The integration of heterogeneous and autonomous information sources is a requirement for the new type of cooperative information systems. In this paper we show the advantages of using a terminological system for integrating pre-existing relational databases. From the point of view of the resulting integrated schema, a terminological system allows the definition of a semantically richer integrated schema. From the point of view of the schema generation process, a terminological system permits a more consistent, broad, and automatic process. Finally, from the query processing point of view, terminological systems provide interesting features for incorporating semantic and caching query optimization techniques. These advantages are presented in detail for each main step of the integration process: translation, integration, and query processing.
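    To make the idea concrete, here is a minimal sketch of wrapping relational tables in a terminological (description-logic-style) integrated schema. All table names, concepts, and role mappings are hypothetical; the paper's system is far richer.

```python
# Two autonomous relational schemas exported by local databases
# (hypothetical names).
db1_employee = {"table": "EMP",   "columns": ["id", "name", "dept", "salary"]}
db2_staff    = {"table": "STAFF", "columns": ["sid", "fullname", "unit"]}

# Integrated terminological schema: concepts with roles and subsumption.
# Each role maps onto (source, column) pairs in the local databases.
terminology = {
    "Person":   {"subsumes": ["Employee"], "roles": {}},
    "Employee": {
        "subsumes": [],
        "roles": {
            "name":       [("db1", "EMP.name"), ("db2", "STAFF.fullname")],
            "department": [("db1", "EMP.dept"), ("db2", "STAFF.unit")],
        },
    },
}

def sources_for(concept, role):
    """Resolve a role of a concept (or of a subsuming concept) to local
    columns -- the kind of lookup done when unfolding an integrated-schema
    query into source queries."""
    entry = terminology.get(concept)
    if entry is None:
        return []
    if role in entry["roles"]:
        return entry["roles"][role]
    # Climb to parent concepts that subsume this one.
    parents = [c for c, e in terminology.items() if concept in e["subsumes"]]
    for p in parents:
        hit = sources_for(p, role)
        if hit:
            return hit
    return []

print(sources_for("Employee", "name"))
# [('db1', 'EMP.name'), ('db2', 'STAFF.fullname')]
```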

    Completeness of Information Sources

    Information quality plays a crucial role in every application that integrates data from autonomous sources. However, information quality is hard to measure and complex to consider for information integration tasks, even if the integrated sources cooperate. We present a systematic and formal approach to measuring information quality and combining such measurements for information integration. Our approach is based on a value model that incorporates both the extensional value (coverage) and the intensional value (density) of information. For both aspects we provide merge functions for adequately scoring integrated results. We also combine the two criteria into an overall completeness criterion that formalizes the intuitive notion of completeness of query results. This completeness measure is a valuable tool for assessing source size and predicting result sizes of queries in integrated information systems. We propose this measure as an important step towards the use of information quality for source selection, query planning, query optimization, and quality feedback to users.
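    A minimal sketch of the coverage/density model follows: coverage measures how many real-world objects a source covers, density measures how many attribute values it supplies per object, and completeness combines the two. The union-merge function below is illustrative; the paper defines merge functions for several integration operators, and all sizes here are hypothetical.

```python
WORLD_SIZE = 1000  # assumed size of the real-world object set
NUM_ATTRS  = 5     # attributes in the global schema

def coverage(num_objects):
    return num_objects / WORLD_SIZE

def density(filled_values, num_objects):
    return filled_values / (num_objects * NUM_ATTRS)

def completeness(num_objects, filled_values):
    # Completeness as the product of coverage and density.
    return coverage(num_objects) * density(filled_values, num_objects)

# Two hypothetical sources and their overlap (objects present in both).
s1 = {"objects": 400, "filled": 400 * 4}  # ~4 of 5 attributes filled
s2 = {"objects": 300, "filled": 300 * 3}  # ~3 of 5 attributes filled
overlap = 100

# Union-merge: overlapping objects counted once; assume the merged tuple
# keeps the denser source's values, so s2's copies are discounted.
merged_objects = s1["objects"] + s2["objects"] - overlap
merged_filled  = s1["filled"] + s2["filled"] - overlap * 3

print(f"coverage={coverage(merged_objects):.2f}",
      f"completeness={completeness(merged_objects, merged_filled):.2f}")
```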

    Query optimization in XML based information integration for queries involving aggregation and group by.

    This thesis addresses the optimization and processing of queries involving aggregation and group-by in the semantic-model approach to information integration. The query processing algorithms (materialization, subqueries, and wrapper) have been extended to handle such aggregate queries. Algorithms are presented for two cases: in the first, the information at different sources is disjoint; in the second, information sources may contain overlapping information.
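    The distinction between the two cases can be illustrated with a short sketch: over disjoint sources, SUM/COUNT/AVG decompose into partial aggregates shipped from each source, while possibly overlapping sources require de-duplication before aggregating. The data and keys below are hypothetical, and this is a sketch of the general technique rather than the thesis's algorithms.

```python
def sum_avg_disjoint(partials):
    """partials: list of (partial_sum, partial_count) from each source."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total, total / count  # AVG = sum of sums / sum of counts

def sum_avg_overlapping(sources):
    """sources: list of dicts mapping a tuple key -> measure value.
    Overlapping keys are counted once."""
    merged = {}
    for src in sources:
        merged.update(src)  # later sources win on duplicate keys
    total = sum(merged.values())
    return total, total / len(merged)

# Disjoint case: only partial aggregates travel over the network.
print(sum_avg_disjoint([(120.0, 4), (80.0, 2)]))  # (200.0, 33.33...)

# Overlapping case: key 'b' appears at both sources but is counted once.
print(sum_avg_overlapping([{"a": 50.0, "b": 70.0}, {"b": 70.0, "c": 80.0}]))
```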

    A unified view of data-intensive flows in business intelligence systems : a survey

    Data-intensive flows are central processes in today's business intelligence (BI) systems, deploying different technologies to deliver data from a multitude of data sources in user-preferred and analysis-ready formats. To meet the complex requirements of next-generation BI systems, we often need an effective combination of the traditionally batched extract-transform-load (ETL) processes that populate a data warehouse (DW) from integrated data sources, and more real-time, operational data flows that integrate source data at runtime. Both academia and industry thus need a clear understanding of the foundations of data-intensive flows and of the challenges of moving towards next-generation BI environments. In this paper we present a survey of today's research on data-intensive flows and the related fundamental fields of database theory. The study is based on a proposed set of dimensions describing the important challenges of data-intensive flows in the next-generation BI setting. As a result of this survey, we envision an architecture of a system for managing the lifecycle of data-intensive flows. The results further provide a comprehensive understanding of data-intensive flows, identifying challenges that remain to be addressed and showing how current solutions can be applied to address them.

    Algebraic rewritings for optimizing regular path queries

    Rewriting queries using views is a powerful technique with applications in query optimization, data integration, data warehousing, etc. Query rewriting in relational databases is by now rather well investigated; in the framework of semistructured data, however, the problem has received much less attention. In this paper we focus on extracting as much information as possible from algebraic rewritings for the purpose of optimizing regular path queries. The cases in which we can find a complete, exact rewriting of a query using a set of views are ideal but rare; there is, however, always information available in the views, even if that information is only partial. We introduce "lower" and "possibility" partial rewritings and provide algorithms for computing them. These rewritings are algebraic in nature: we use only the algebraic view definitions to compute them, without consulting any pairs (tuples) of objects. This makes the rewritings a main-memory product, which can be used to reduce secondary-memory and remote access. After the main-memory algebraic computation of the rewritings, a second phase, with secondary-memory access, derives the pairs of objects in the query answer. We give two algorithms that utilize the partial lower and partial possibility rewritings to decrease the number of secondary-memory accesses.
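    The flavor of a partial rewriting can be shown with a minimal sketch: where a view's path expression matches a contiguous fragment of the query path, the fragment is replaced by the view name, so the pre-computed view answer substitutes for navigation over the data on that fragment. Real regular path queries are regular expressions over edge labels; this sketch handles only simple concatenations, and the greedy matching strategy and all labels are illustrative assumptions.

```python
def partial_rewrite(query_path, views):
    """query_path: list of edge labels; views: dict name -> list of labels."""
    result, i = [], 0
    while i < len(query_path):
        for name, body in views.items():
            if query_path[i:i + len(body)] == body:
                result.append(name)       # fragment answered from the view
                i += len(body)
                break
        else:
            result.append(query_path[i])  # label still evaluated on the data
            i += 1
    return result

views = {"V1": ["part", "supplier"], "V2": ["city"]}
print(partial_rewrite(["dept", "part", "supplier", "city"], views))
# ['dept', 'V1', 'V2'] -- only 'dept' still requires direct data access
```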

    Constraint-based Query Distribution Framework for an Integrated Global Schema

    Distributed heterogeneous data sources need to be queried uniformly using a global schema. A query on the global schema is reformulated so that it can be executed on the local data sources. Constraints in the global schema and the mappings are used for source selection, query optimization, and querying partitioned and replicated data sources. The system is entirely XML-based: it poses queries in XML form, and transforms and integrates local results into an XML document. Contributions include the use of constraints in our existing global schema, which help in source selection and query optimization, and a global query distribution framework for querying distributed heterogeneous data sources. Comment: The Proceedings of the 13th INMIC 2009, Dec. 14-15, 2009, Islamabad, Pakistan. Pages 1-6. Print ISBN: 978-1-4244-4872-2. INSPEC Accession Number: 11072575. Date of Current Version: 15 January 2010
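    Constraint-based source selection can be sketched briefly: each source of a horizontally partitioned relation carries a constraint on the partitioning attribute, and the mediator prunes sources whose constraint cannot intersect the query's predicate. The relation, attribute, and partition ranges below are hypothetical, and the sketch omits the XML machinery of the actual framework.

```python
# Hypothetical horizontal partitions of an 'orders' relation, each with a
# constraint on the partitioning attribute 'region'.
sources = {
    "orders_eu":   {"region_in": {"DE", "FR", "ES"}},
    "orders_asia": {"region_in": {"CN", "JP", "IN"}},
    "orders_us":   {"region_in": {"US", "CA"}},
}

def select_sources(query_regions, sources):
    """Keep only sources whose partition constraint intersects the query."""
    return [name for name, c in sources.items()
            if c["region_in"] & query_regions]

# A query restricted to region IN ('DE', 'JP') touches only two partitions.
print(select_sources({"DE", "JP"}, sources))  # ['orders_eu', 'orders_asia']
```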