39 research outputs found

    Biclustering via optimal re-ordering of data matrices in systems biology: rigorous methods and comparative studies

    Background: The analysis of large-scale data sets via clustering techniques is utilized in a number of applications. Biclustering in particular has emerged as an important problem in the analysis of gene expression data since genes may only jointly respond over a subset of conditions. Biclustering algorithms also have important applications in sample classification where, for instance, tissue samples can be classified as cancerous or normal. Many of the methods for biclustering, and clustering algorithms in general, utilize simplified models or heuristic strategies for identifying the "best" grouping of elements according to some metric and cluster definition and thus result in suboptimal clusters.
    Results: In this article, we present a rigorous approach to biclustering, OREO, which is based on the Optimal RE-Ordering of the rows and columns of a data matrix so as to globally minimize the dissimilarity metric. The physical permutations of the rows and columns of the data matrix can be modeled as either a network flow problem or a traveling salesman problem. Cluster boundaries in one dimension are used to partition and re-order the other dimensions of the corresponding submatrices to generate biclusters. The performance of OREO is tested on (a) metabolite concentration data, (b) an image reconstruction matrix, (c) synthetic data with implanted biclusters, and gene expression data for (d) colon cancer data, (e) breast cancer data, as well as (f) yeast segregant data to validate the ability of the proposed method and compare it to existing biclustering and clustering methods.
    Conclusion: We demonstrate that this rigorous global optimization method for biclustering produces clusters with more insightful groupings of similar entities, such as genes or metabolites sharing common functions, than other clustering and biclustering algorithms and can reconstruct underlying fundamental patterns in the data for several distinct sets of data matrices arising in important biological applications.
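
The re-ordering idea in the abstract above can be illustrated on a toy matrix. The sketch below is not the OREO formulation (the abstract describes network flow and traveling salesman models); it swaps in an exhaustive search over row permutations, which only works for very small matrices. The squared-difference metric, the function names, and the data are assumptions made for illustration.

```python
from itertools import permutations


def adjacent_dissimilarity(matrix, order):
    """Sum of squared differences between consecutive rows in the given order."""
    total = 0.0
    for a, b in zip(order, order[1:]):
        total += sum((x - y) ** 2 for x, y in zip(matrix[a], matrix[b]))
    return total


def best_row_order(matrix):
    """Try every row permutation and keep the one with the lowest total.

    Exhaustive search is an assumption of this sketch; it is feasible only
    for a handful of rows.
    """
    rows = range(len(matrix))
    return min(permutations(rows),
               key=lambda order: adjacent_dissimilarity(matrix, order))


# Hypothetical data: rows 0 and 2 are similar, rows 1 and 3 are similar.
data = [
    [9.0, 9.1, 0.2],
    [0.1, 0.0, 5.0],
    [9.2, 8.9, 0.1],
    [0.0, 0.2, 5.1],
]

order = best_row_order(data)
cost = adjacent_dissimilarity(data, order)
print(order, cost)
```

In the re-ordered matrix, similar rows end up next to each other, so a large jump in dissimilarity between two consecutive rows marks a candidate cluster boundary.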

    A decision support platform for the configuration of biomass based production systems

    The depletion of fossil resources and the ever-increasing demand for reductions in GHG emissions have driven the process industry towards incorporating renewable feedstock, i.e. biomass, for producing energy, chemicals, and materials. Given the vast number of possibilities in selecting the types of biomass feedstock, conversion technologies, and target products, a systematic approach is highly desirable to support the decision-making process. Along this direction, tools including simulation analysis and process synthesis and integration have already been applied to the investigation and optimisation of individual plants or processes for biomass conversion. As opposed to the study of individual processes, the UK Department for Environment, Food and Rural Affairs (DEFRA) has supported the development of a systematic approach to screen biomass feedstocks and design options that may involve multiple interconnected processes. In this work, a Unit of Synthesis (UoS), typically representing a complete conversion process, e.g. that of bioethanol or biodiesel production, can be connected with other UoS to form a production chain, which is referred to as a processing route. An example of a two-step processing route is the production of ethanol by fermentation (UoS 1) followed by a dehydration process (UoS 2) to eventually produce ethylene. Furthermore, multiple feasible routes, often involving common processing steps or UoS, may be formulated into a superstructure to represent all the possibilities for converting a set of feedstocks to a set of products. Optimising the superstructure against economic, energy, and/or GHG objectives allows promising configurations of the production system to be identified systematically, each represented by a combination of chosen feedstocks, processing steps, and products. The above functionalities are currently being implemented in a systems platform to meet different levels of industrial needs. The platform consists of a central spreadsheet-based database, a spreadsheet-based simulation tool to allow the evaluation of pre-defined processing routes, and a GAMS-based synthesis tool to perform superstructure-based optimal screening/configuration. The central database holds data on individual Units of Synthesis and other common data such as prices of chemicals/utilities. This database is shared by both the simulation tool and the synthesis tool. The DEFRA-funded project aims to study eight chemicals, which include two key chemicals, namely ethylene and propylene glycol. A number of routes have been identified for producing each of the two key chemicals. The systems platform will be tested in the first place by case studies addressing these two chemicals separately or jointly.
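
The notion of chaining Units of Synthesis into processing routes and screening them against an objective can be sketched as follows. This is not the platform's GAMS model: it enumerates routes by brute force and ignores yields, energy, and GHG objectives. Apart from the fermentation and dehydration steps named in the abstract, every unit, feedstock, and cost figure below is hypothetical.

```python
# Hypothetical Units of Synthesis: name -> (input, output, processing cost).
# Only "fermentation" and "dehydration" come from the abstract; all numbers
# are invented for illustration.
UNITS = {
    "fermentation": ("sugar", "ethanol", 300.0),
    "dehydration": ("ethanol", "ethylene", 150.0),
    "gasification": ("wood", "syngas", 200.0),
    "syngas_to_ethanol": ("syngas", "ethanol", 250.0),
}

# Hypothetical feedstock costs.
FEED_COST = {"sugar": 100.0, "wood": 40.0}


def routes(material, target, used=()):
    """Yield every chain of units that converts `material` into `target`."""
    if material == target:
        yield used
        return
    for name, (inp, out, _cost) in UNITS.items():
        if inp == material and name not in used:
            yield from routes(out, target, used + (name,))


def cheapest_route(target):
    """Return (total cost, feedstock, route) for the cheapest route to `target`."""
    candidates = []
    for feed, feed_cost in FEED_COST.items():
        for route in routes(feed, target):
            cost = feed_cost + sum(UNITS[unit][2] for unit in route)
            candidates.append((cost, feed, route))
    return min(candidates)


best_cost, best_feed, best_route = cheapest_route("ethylene")
print(best_cost, best_feed, best_route)
```

Both routes here share the dehydration step, which is the kind of common UoS a superstructure would represent once rather than per route.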

    Jockeys T and J. Mortimer at the Ascot Races, 7 July 1934 [picture].

    Title devised from accompanying information where available.; Part of the: Fairfax archive of glass plate negatives.; Fairfax number: 2369.; Also available online at: http://nla.gov.au/nla.pic-vn6216793; Acquired from Fairfax Media, 2012