41 research outputs found

    Genomic data analysis using grid-based computing

    Microarray experiments generate a plethora of genomic data, so techniques and architectures are needed to analyze these data more quickly. This thesis presents a solution for reducing the computation time of a highly computationally intensive data analysis component of a genomic application, the Stanford Microarray Database (SMD). SMD's implementation, operation, and analysis features are described. The reasons for choosing the computationally intensive problems of the SMD, and the background importance of these problems, are presented. This thesis presents an effective parallel solution to the computational problem, including the difficulties faced in parallelizing the problem and the results achieved. Finally, future research directions for achieving even greater speedups are presented.

    Dynamics of domain coverage of the protein sequence universe

    Background: The currently known protein sequence space consists of millions of sequences in public databases and is rapidly expanding. Assigning sequences to families leads to a better understanding of protein function and the nature of the protein universe. However, a large portion of the current protein space remains unassigned and is referred to as its “dark matter”. Results: Here we suggest that the true size of the “dark matter” is much larger than stated by current definitions. We propose an approach to reducing the size of the “dark matter” by identifying and subtracting regions in protein sequences that are not likely to contain any domain. Conclusions: Recent improvements in computational domain modeling result in a decrease, albeit a slow one, in the relative size of the “dark matter”; however, its absolute size increases substantially with the growth of sequence data.

    Visualization of the modeled degradation of building flooring systems in building maintenance

    The development of a maintenance programme for construction projects is a highly complex and data-intensive undertaking. This exercise is characterised by a lack of relevant data on the one hand and an overwhelming amount of extraneous data on the other. These uncertainties and complexities have resulted in increased conservatism in the lifecycle evaluation of building maintenance programmes. As a result, these programmes tend either to prescribe uneconomical maintenance actions or to fall short of providing an appropriate service to the users of the building. The current research project is based on the premise that a visual approach will facilitate a just-in-time solution to maintenance scheduling; hence, the use of virtual simulation of the building is proposed. The broader aim of this research is to develop a complete building maintenance programme through visualisation of buildings as they degrade over time. Here, the focus is on flooring systems and the manner in which they degrade over time. This requires a better understanding of their pattern and rate of usage. To this end, Anthroposophy and Anthropocentric descriptions of human movement patterns have been used to describe the behaviour of 'subjects' and subsequently represent the pattern and density of the degradation of flooring systems. The mathematics representing this behaviour has been developed, which enables it to be embedded into the proposed overall visual building maintenance model.

    Accelerated Profile HMM Searches

    Profile hidden Markov models (profile HMMs) and probabilistic inference methods have made important contributions to the theory of sequence database homology search. However, practical use of profile HMM methods has been hindered by the computational expense of existing software implementations. Here I describe an acceleration heuristic for profile HMMs, the “multiple segment Viterbi” (MSV) algorithm. The MSV algorithm computes an optimal sum of multiple ungapped local alignment segments using a striped vector-parallel approach previously described for fast Smith/Waterman alignment. MSV scores follow the same statistical distribution as gapped optimal local alignment scores, allowing rapid evaluation of significance of an MSV score and thus facilitating its use as a heuristic filter. I also describe a 20-fold acceleration of the standard profile HMM Forward/Backward algorithms using a method I call “sparse rescaling”. These methods are assembled in a pipeline in which high-scoring MSV hits are passed on for reanalysis with the full HMM Forward/Backward algorithm. This accelerated pipeline is implemented in the freely available HMMER3 software package. Performance benchmarks show that the use of the heuristic MSV filter sacrifices negligible sensitivity compared to unaccelerated profile HMM searches. HMMER3 is substantially more sensitive and 100- to 1000-fold faster than HMMER2. HMMER3 is now about as fast as BLAST for protein searches
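
The idea of scoring ungapped local alignment segments, mentioned in the abstract above, can be illustrated with a much simpler toy. The sketch below is not the MSV algorithm and is not taken from HMMER3: it finds only the single best ungapped segment (not an optimal sum of multiple segments), uses plain Python loops rather than striped vector parallelism, and assumes a hypothetical profile format (one dict of residue scores per position) with a made-up default mismatch score.

```python
def best_ungapped_segment(profile, seq, default_score=-1.0):
    """Return the best single ungapped local segment score.

    profile: list of dicts, one per profile position, mapping a residue
             to its match score (hypothetical format, for illustration).
    seq:     target sequence as a string.
    Residues missing from a position's dict receive default_score.
    """
    m, n = len(profile), len(seq)
    best = 0.0
    # Each diagonal (offset = j - i) is one possible ungapped alignment.
    for offset in range(-(m - 1), n):
        i = max(0, -offset)
        j = i + offset
        running = 0.0
        while i < m and j < n:
            # Local alignment: restart the segment when the score drops below zero.
            running = max(0.0, running + profile[i].get(seq[j], default_score))
            best = max(best, running)
            i += 1
            j += 1
    return best


# Toy profile that favours the motif "ACD" (scores are invented).
toy_profile = [{"A": 2.0}, {"C": 2.0}, {"D": 2.0}]
print(best_ungapped_segment(toy_profile, "XXACDXX"))
```

A filter built on such a score would pass only high-scoring targets on to a slower, more sensitive method, which is the role the abstract assigns to MSV in the HMMER3 pipeline.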

    The Protein-Protein Interaction tasks of BioCreative III: classification/ranking of articles and linking bio-ontology concepts to full text

    BACKGROUND: Determining the usefulness of biomedical text mining systems requires realistic task definition and data selection criteria without artificial constraints, measuring performance aspects that go beyond traditional metrics. The BioCreative III Protein-Protein Interaction (PPI) tasks were motivated by such considerations, trying to address aspects including how the end user would oversee the generated output, for instance by providing ranked results, textual evidence for human interpretation or measuring time savings by using automated systems. Detecting articles describing complex biological events like PPIs was addressed in the Article Classification Task (ACT), where participants were asked to implement tools for detecting PPI-describing abstracts. For this purpose the BCIII-ACT corpus was provided, which includes training, development and test sets of over 12,000 PPI relevant and non-relevant PubMed abstracts labeled manually by domain experts, and which also records the human classification times. The Interaction Method Task (IMT) went beyond abstracts and required mining for associations between more than 3,500 full text articles and interaction detection method ontology concepts that had been applied to detect the PPIs reported in them. RESULTS: A total of 11 teams participated in at least one of the two PPI tasks (10 in ACT and 8 in the IMT) and a total of 62 persons were involved either as participants or in preparing data sets/evaluating these tasks. Per task, each team was allowed to submit five runs offline and another five online via the BioCreative Meta-Server. From the 52 runs submitted for the ACT, the highest Matthews Correlation Coefficient (MCC) score measured was 0.55 at an accuracy of 89 and the best AUC iP/R was 68. Most ACT teams explored machine learning methods, some of them also used lexical resources like MeSH terms, PSI-MI concepts or particular lists of verbs and nouns, and some integrated NER approaches. For the IMT, a total of 42 runs were evaluated by comparing systems against manually generated annotations done by curators from the BioGRID and MINT databases. The highest AUC iP/R achieved by any run was 53, the best MCC score 0.55. In the case of competitive systems with an acceptable recall (above 35) the macro-averaged precision ranged between 50 and 80, with a maximum F-Score of 55. CONCLUSIONS: The results of the ACT task of BioCreative III indicate that classification of large unbalanced article collections reflecting the real class imbalance is still challenging. Nevertheless, text-mining tools that report ranked lists of relevant articles for manual selection can potentially reduce the time needed to identify half of the relevant articles to less than 1/4 of the time when compared to unranked results. Detecting associations between full text articles and interaction detection method PSI-MI terms (IMT) is more difficult than might be anticipated. This is due to the variability of method term mentions, errors resulting from pre-processing of articles provided as PDF files, and the heterogeneity and different granularity of method term concepts encountered in the ontology. However, combining the sophisticated techniques developed by the participants with supporting evidence strings derived from the articles for human interpretation could result in practical modules for biological annotation workflows.
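
For readers unfamiliar with the MCC metric quoted above, the following is a minimal sketch of the standard binary-classification definition, computed from confusion-matrix counts. The function names and the example counts are illustrative assumptions; they are not BioCreative III data and not the organizers' evaluation code.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return numerator / denominator


def accuracy(tp, tn, fp, fn):
    """Fraction of all items that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Invented counts for an unbalanced collection (few relevant articles).
tp, tn, fp, fn = 60, 830, 50, 60
print("accuracy:", accuracy(tp, tn, fp, fn))
print("MCC:", matthews_corrcoef(tp, tn, fp, fn))
```

On unbalanced collections like the one described, accuracy can look high while MCC stays moderate, because MCC also accounts for errors on the minority class.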

    Characterizing the Role of Lamin B1 Acetylation in Regulating the Integrity of the Nuclear Periphery

    The beta-herpesvirus human cytomegalovirus (HCMV) is a ubiquitous pathogen that causes critical diseases in pregnant women and immunocompromised patients, yet effective vaccines and antiviral treatments remain elusive. HCMV relies on the spatiotemporal remodeling of host cell organization and the rewiring of organelle and protein functions for virus replication and spread. A key subcellular location for these processes is the nucleus, where viral genome replication, gene expression, DNA packaging, and capsid assembly all occur. Evidence points to the acetylation of host immune proteins as an important post-translational modification in regulating their subcellular localization and/or protein-protein interactions and functions during viral infection. Additionally, our lab recently showed that acetylation of nuclear laminar proteins inhibits viral replication through the maintenance of nuclear integrity. This was primarily mediated by a site-specific acetylation of lamin B1 (LMNB1), a core component of laminar structure, at lysine 134 (K134). In my thesis, I utilize molecular virology, mass spectrometry-based proteomics, and confocal microscopy to further characterize the function of LMNB1 acetylation during HCMV infection. I find that acetylation at K134 increases LMNB1 association with nuclear periphery proteins, impairs viral genome replication, and inhibits the production of viral proteins needed for DNA packaging. I also expand our understanding of acetylated LMNB1 functions by demonstrating that site-specific acetylations can toggle LMNB1 between an anti-viral (residues K261, K389) and pro-viral (K102) state. Finally, I show that the role of LMNB1 acetylation in stabilizing nuclear integrity broadly impacts cell biology by characterizing a novel function in regulating the G1/S checkpoint of the cell cycle. Taken together, this thesis further demonstrates novel, critical roles for LMNB1 acetylation in the tug-of-war between viruses and their hosts and uncovers LMNB1 acetylation as a regulatory mechanism in cell cycle progression.

    Discrete-event simulation based virtual reality environments for construction operations

    Discrete-event simulation (DES) is a quantitative technique that can significantly improve the analysis and design of construction operations. As the complexity of an operation increases, so does the utility of modeling it using DES and the need for enhanced capabilities in the information technology that is required. At the outset of this research, the state of the art allowed engineers to (a) model very complex construction operations using DES, and (b) photo-realistically animate previously simulated operations with temporal and spatial accuracy in 3D. It was not possible for an engineer who was experiencing an animation to interact with it in ways that could affect the course of the remaining events in the simulation. As a consequence, the engineer could not test the response of a DES model to the spontaneous curiosity that often comes about while viewing an animation. This limited the ability of engineers to validate complex construction operations and, in turn, the credibility of DES studies. The research presented in this dissertation advanced the state of the art so that engineers can now (a) run animations and their simulations concurrently, and (b) interact with animations to change the course of events in their simulations. These advances make it possible to create what is essentially a Virtual Reality environment whose underlying logic is defined by a sophisticated DES model, in which engineers can study and visualize the model's reaction to events they introduce while experiencing the concurrently run animation. This can provide construction engineers with an increased understanding of the underlying DES model, of the simulation experiments that can or should be conducted, and possibly insights about the underlying operation itself. In short, these advances allow experts and decision-makers to get a clear understanding of what is and what is not modeled, a necessary condition for achieving credibility. The tangible product from this research, in combination with DES and 3D animation, is similar to a 3D gaming toolkit, with the important difference being the serious nature of the underlying logic supported. This makes it possible to develop high-impact 'serious games' of construction that can be used for teaching and training at all levels.