8 research outputs found

    Kroak: A metadata collection system for long term microbial community monitoring

    Amplytica is a start-up company whose software, called the Amplytica Cloud Platform, helps organizations determine how microbes influence their bioprocesses. Examples of such bioprocesses include anaerobic digestion, wastewater treatment, and mine site reclamation. The Amplytica Cloud Platform does this by integrating and analyzing metagenomically derived microbial community data (species composition, diversity, and abundance) and industrial bioprocess data (e.g. temperature, pH, nutrients). To achieve data integration, industrial bioprocess data is treated as metadata to the microbial community information and describes the environmental conditions where the microbial community is found. The capture of this industrial metadata requires a robust metadata capture system. Kroak is a metadata capture system for the Amplytica Cloud Platform that facilitates tagging per-sample microbial community information with industrial environmental metadata. It uses a modern web interface for easy deployment, Office Open XML Workbook (XLSX) template files for easy metadata capture, and metadata classes to ensure data consistency and type identification for follow-on automated statistics and machine learning. Kroak is a functional metadata capture system which will be iteratively improved upon by Amplytica. Potential improvements include changes to Kroak’s data model, increased reliability of its metadata parsing, and expansion of its existing web application programming interface.
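    The abstract describes metadata classes that type-check per-sample values parsed from XLSX templates, but gives no code. As a minimal, hypothetical sketch of that idea (all class names, fields, and ranges below are invented for illustration), a metadata class could pair a variable name with an expected type and an allowed range, so each sample row is validated before any downstream statistics or machine learning:

```python
# Hypothetical sketch of Kroak-style metadata classes: each class names a
# bioprocess variable, its expected Python type, and an allowed range, so
# raw values parsed from an XLSX cell can be typed and validated.
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataClass:
    name: str
    dtype: type
    low: float
    high: float

    def validate(self, raw):
        value = self.dtype(raw)  # e.g. the string "7.2" -> the float 7.2
        if not (self.low <= value <= self.high):
            raise ValueError(f"{self.name}={value} outside [{self.low}, {self.high}]")
        return value


# Example classes for industrial variables the abstract mentions.
TEMPERATURE = MetadataClass("temperature_c", float, -50.0, 150.0)
PH = MetadataClass("ph", float, 0.0, 14.0)


def tag_sample(sample_id, raw_row, classes):
    """Attach typed, validated metadata to one microbial-community sample."""
    return {"sample": sample_id,
            **{c.name: c.validate(raw_row[c.name]) for c in classes}}


row = {"temperature_c": "37.5", "ph": "7.2"}  # as read from an XLSX template
print(tag_sample("S001", row, [TEMPERATURE, PH]))
```

    A row with an out-of-range value (e.g. pH 15) would raise an error at capture time rather than silently corrupting later analyses, which is the consistency guarantee the abstract attributes to metadata classes.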

    XML documents schema design

    The eXtensible Markup Language (XML) is fast emerging as the dominant standard for storing, describing, and interchanging data among various systems and databases on the internet. It offers schema languages such as the Document Type Definition (DTD) or XML Schema Definition (XSD) for defining the syntax and structure of XML documents. To enable efficient use of XML documents in any application in a large-scale electronic environment, it is necessary to avoid data redundancies and update anomalies. Redundancy and anomalies in XML documents can lead not only to higher data storage costs but also to increased costs for data transfer and data manipulation. To overcome this problem, this thesis proposes to establish a formal framework for XML document schema design. To achieve this aim, we propose a method to improve and simplify XML schema design by incorporating a conceptual model of the DTD with a theory of database normalization. A conceptual diagram, the Graph-Document Type Definition (G-DTD), is proposed to describe the structure of XML documents at the schema level. For the G-DTD itself, we define a structure which incorporates attributes, simple elements, complex elements, and the relationship types among them. Furthermore, semantic constraints are also precisely defined in order to capture semantic meanings among the defined XML objects. In addition, to provide a guideline to a well-designed schema for XML documents, we propose a set of normal forms for G-DTD on the basis of rules proposed by Arenas and Libkin and by Lv et al. The corresponding normalization rules to transform a G-DTD into a normal-form schema are also discussed. A case study is given to illustrate the applicability of the concept. As a result, we found that the new normal forms are more concise and practical, in particular as they allow the user to find an 'optimal' structure of XML elements/attributes at the schema level. 
To prove that our approach is applicable for the database designer, we developed a prototype of XML document schema design using the Z formal specification language. Finally, using the same case study, this formal specification is tested to check the correctness and consistency of the specification. This gives confidence that our prototype can be implemented successfully to generate an automatic XML schema design.
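    As a rough, hypothetical illustration of the redundancy this thesis normalizes away (the element and attribute names below are invented, not taken from the thesis), consider a flat XML document in which the same dependent value is repeated under every occurrence of its determining key. A schema-level normal form would factor that value out; the check itself amounts to detecting a functional dependency stored redundantly:

```python
# Hypothetical sketch: detect the kind of redundancy an XML normal form
# removes. If each course code always carries the same lecturer, yet the
# lecturer is stored once per enrolment, the schema is a normalization
# candidate (lecturer should be factored into its own element).
import xml.etree.ElementTree as ET
from collections import defaultdict

DOC = """<enrolments>
  <enrolment student="s1" course="DB101" lecturer="Smith"/>
  <enrolment student="s2" course="DB101" lecturer="Smith"/>
  <enrolment student="s3" course="ML200" lecturer="Jones"/>
</enrolments>"""


def redundant_keys(xml_text, key_attr, dep_attr):
    """Return key values whose dependent attribute is functionally
    determined by the key but stored repeatedly."""
    deps = defaultdict(set)
    counts = defaultdict(int)
    for elem in ET.fromstring(xml_text):
        deps[elem.get(key_attr)].add(elem.get(dep_attr))
        counts[elem.get(key_attr)] += 1
    # dep determined by key (one value) and key occurs more than once
    return [k for k, d in deps.items() if len(d) == 1 and counts[k] > 1]


print(redundant_keys(DOC, "course", "lecturer"))  # DB101 repeats Smith
```

    Here "DB101" is flagged because "Smith" is stored twice; a normalized schema would record the course-to-lecturer mapping once, which is the storage and update saving the abstract argues for.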

    A new formal and analytical process to product modeling (PPM) method and its application to the precast concrete industry

    The current standard product (data) modeling process relies on the experience and subjectivity of data modelers who use their experience to eliminate redundancies and identify omissions. As a result, product modeling becomes a social activity that involves iterative review processes of committees. This study aims to develop a new, formal method for deriving product models from data collected in process models of companies within an industry sector. The theoretical goals of this study are to provide a scientific foundation to bridge the requirements collection phase and the logical modeling phase of product modeling and to formalize the derivation and normalization of a product model from the processes it supports. To achieve these goals, a new and formal method, Georgia Tech Process to Product Modeling (GTPPM), has been proposed. GTPPM consists of two modules. The first module is called the Requirements Collection and Modeling (RCM) module. It provides semantics and a mechanism to define a process model, information items used by each activity, and information flow between activities. The logic to dynamically check the consistency of information flow within a process also has been developed. The second module is called the Logical Product Modeling (LPM) module. It integrates, decomposes, and normalizes information constructs collected from a process model into a preliminary product model. Nine design patterns are defined to resolve conflicts between information constructs (ICs) and to normalize the resultant model. These two modules have been implemented as a Microsoft Visio™ add-on. The tool has been registered and is also called GTPPM™. The method has been tested and evaluated in the precast concrete sector of the construction industry through several GTPPM modeling efforts. 
By using GTPPM, a complete set of information items required for product modeling for a medium or large industry can be collected without generalizing each company's unique process into one unified high-level model. However, the use of GTPPM is not limited to product modeling. It can be deployed in several other areas, including workflow management system or MIS (Management Information System) development, software specification development, and business process re-engineering.
Ph.D. Committee Chair: Eastman, Charles M.; Committee Co-Chair: Augenbroe, Godfried; Committee Co-Chair: Navathe, Shamkant B.; Committee Member: Hardwick, Martin; Committee Member: Sacks, Rafae
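    The abstract mentions logic that dynamically checks the consistency of information flow within a process model. A minimal sketch of one such check (the rule and the example activities below are assumptions, not GTPPM's actual semantics) is that every information item an activity uses must have been produced by some earlier activity in the ordered process:

```python
# Hypothetical sketch of an RCM-style information-flow consistency check:
# walk the activities in order, accumulate produced items, and flag any
# use of an item that no upstream activity has produced.
def flow_inconsistencies(process):
    """process: ordered list of (activity, produces, uses) tuples.
    Returns (activity, missing_item) pairs."""
    available = set()
    problems = []
    for activity, produces, uses in process:
        for item in uses:
            if item not in available:
                problems.append((activity, item))
        available |= set(produces)
    return problems


# Invented precast-concrete example: "erection_plan" is used but never produced.
model = [
    ("design",      {"piece_geometry"}, set()),
    ("detailing",   {"rebar_layout"},   {"piece_geometry"}),
    ("fabrication", set(),              {"rebar_layout", "erection_plan"}),
]
print(flow_inconsistencies(model))
```

    The check reports that "fabrication" consumes "erection_plan" with no upstream producer, the kind of omission the abstract says modelers otherwise catch only through experience and committee review.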

    Canonical queries as a query answering device (Information Science)

    Issued as Annual reports [nos. 1-2] and Final report, Project no. G-36-60.

    The semantic database model as a basis for an automated database design tool

    Bibliography: p. 257-80. The automatic database design system is a design aid for network database creation. It obtains a requirements specification from a user and generates a prototype database. This database is compatible with the Data Definition Language of DMS 1100, the database system on the Univac 1108 at the University of Cape Town. The user interface has been constructed in such a way that a computer-naive user can submit a description of his organisation to the system. Thus it constitutes a powerful database design tool, which should greatly alleviate the designer's tasks of communicating with users and of creating an initial database definition. The requirements are formulated using the semantic database model, and semantic information in this model is incorporated into the database as integrity constraints. A relation scheme is also generated from the specification. As a result of this research, insight has been gained into the advantages and shortcomings of the semantic database model, and some principles for 'good' data models and database design methodologies have emerged.
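    The core translation this abstract describes, from a requirements specification to a generated schema, can be sketched in miniature. The example below is an invented illustration (it emits generic SQL rather than the DMS 1100 Data Definition Language the thesis actually targets, and the entity and attribute names are hypothetical):

```python
# Hypothetical sketch of automated schema generation: a tiny "semantic
# model" entry (entity, typed attributes, key) is translated into a
# relation scheme, emitted here as generic SQL DDL for illustration.
def to_ddl(entity, attributes, key):
    """attributes: list of (column_name, sql_type) pairs."""
    cols = ",\n  ".join(f"{name} {sqltype}" for name, sqltype in attributes)
    return (f"CREATE TABLE {entity} (\n"
            f"  {cols},\n"
            f"  PRIMARY KEY ({key})\n"
            f");")


ddl = to_ddl("employee",
             [("emp_no", "INTEGER"),
              ("name", "VARCHAR(40)"),
              ("dept", "VARCHAR(20)")],
             "emp_no")
print(ddl)
```

    A real tool of the kind described would also emit the semantic model's constraints (e.g. cardinalities, subtype relationships) as integrity constraints alongside the relation scheme, which is the step the abstract highlights as the model's main advantage.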