107 research outputs found

    The Firegoose: two-way integration of diverse data from different bioinformatics web resources with desktop applications

    Background: Information resources on the World Wide Web play an indispensable role in modern biology. But integrating data from multiple sources is often encumbered by the need to reformat data files, convert between naming systems, or perform ongoing maintenance of local copies of public databases. Opportunities for new ways of combining and re-using data are arising as a result of the increasing use of web protocols to transmit structured data. Results: The Firegoose, an extension to the Mozilla Firefox web browser, enables data transfer between web sites and desktop tools. As a component of the Gaggle integration framework, Firegoose can also exchange data with Cytoscape, the R statistical package, Multiexperiment Viewer (MeV), and several other popular desktop software tools. Firegoose adds the capability to easily use local data to query KEGG, EMBL STRING, DAVID, and other widely-used bioinformatics web sites. Query results from these web sites can be transferred to desktop tools for further analysis with a few clicks. Firegoose acquires data from the web by screen scraping, microformats, embedded XML, or web services. We define a microformat, which allows structured information compatible with the Gaggle to be embedded in HTML documents. We demonstrate the capabilities of this software by performing an analysis of the genes activated in the microbe Halobacterium salinarum NRC-1 in response to anaerobic environments. Starting with microarray data, we explore functions of differentially expressed genes by combining data from several public web resources and construct an integrated view of the cellular processes involved. Conclusion: The Firegoose incorporates Mozilla Firefox into the Gaggle environment and enables interactive sharing of data between diverse web resources and desktop software tools without maintaining local copies. Additional web sites can be incorporated easily into the framework using the scripting platform of the Firefox browser. Performing data integration in the browser allows the excellent search and navigation capabilities of the browser to be used in combination with powerful desktop tools.

    Integration and visualization of systems biology data in context of the genome

    Background: High-density tiling arrays and new sequencing technologies are generating rapidly increasing volumes of transcriptome and protein-DNA interaction data. Visualization and exploration of this data is critical to understanding the regulatory logic encoded in the genome by which the cell dynamically affects its physiology and interacts with its environment. Results: The Gaggle Genome Browser is a cross-platform desktop program for interactively visualizing high-throughput data in the context of the genome. Important features include dynamic panning and zooming, keyword search and open interoperability through the Gaggle framework. Users may bookmark locations on the genome with descriptive annotations and share these bookmarks with other users. The program handles large sets of user-generated data using an in-process database and leverages the facilities of SQL and the R environment for importing and manipulating data. A key aspect of the Gaggle Genome Browser is interoperability. By connecting to the Gaggle framework, the genome browser joins a suite of interconnected bioinformatics tools for analysis and visualization with connectivity to major public repositories of sequences, interactions and pathways. To this flexible environment for exploring and combining data, the Gaggle Genome Browser adds the ability to visualize diverse types of data in relation to its coordinates on the genome. Conclusions: Genomic coordinates function as a common key by which disparate biological data types can be related to one another. In the Gaggle Genome Browser, heterogeneous data are joined by their location on the genome to create information-rich visualizations yielding insight into genome organization, transcription and its regulation and, ultimately, a better understanding of the mechanisms that enable the cell to dynamically respond to its environment.

    Prevalence of transcription promoters within archaeal operons and coding sequences

    Despite the knowledge of complex prokaryotic-transcription mechanisms, generalized rules, such as the simplified organization of genes into operons with well-defined promoters and terminators, have had a significant role in systems analysis of regulatory logic in both bacteria and archaea. Here, we have investigated the prevalence of alternate regulatory mechanisms through genome-wide characterization of transcript structures of ∼64% of all genes, including putative non-coding RNAs in Halobacterium salinarum NRC-1. Our integrative analysis of transcriptome dynamics and protein–DNA interaction data sets showed widespread environment-dependent modulation of operon architectures, transcription initiation and termination inside coding sequences, and extensive overlap in 3′ ends of transcripts for many convergently transcribed genes. A significant fraction of these alternate transcriptional events correlate to binding locations of 11 transcription factors and regulators (TFs) inside operons and annotated genes, events usually considered spurious or non-functional. Using experimental validation, we illustrate the prevalence of overlapping genomic signals in archaeal transcription, casting doubt on the general perception of rigid boundaries between coding sequences and regulatory elements.

    Niche adaptation by expansion and reprogramming of general transcription factors

    Experimental analysis of TFB family proteins in a halophilic archaeon reveals complex environment-dependent fitness contributions. Gene conversion events among these proteins can generate novel niche adaptation capabilities, a process that may have contributed to archaeal adaptation to extreme environments.

    Modeling the early stage of DNA sequence recognition within RecA nucleoprotein filaments

    Homologous recombination is a fundamental process enabling the repair of double-strand breaks with a high degree of fidelity. In prokaryotes, it is carried out by RecA nucleofilaments formed on single-stranded DNA (ssDNA). These filaments incorporate genomic sequences that are homologous to the ssDNA and exchange the homologous strands. Due to the highly dynamic character of this process and its rapid propagation along the filament, the sequence recognition and strand exchange mechanism remains unknown at the structural level. The recently published structure of the RecA/DNA filament active for recombination (Chen et al., Mechanism of homologous recombination from the RecA-ssDNA/dsDNA structure, Nature 2008, 453, 489) provides a starting point for new exploration of the system. Here, we investigate the possible geometries of association of the early encounter complex between RecA/ssDNA filament and double-stranded DNA (dsDNA). Due to the huge size of the system and its dense packing, we use a reduced representation for protein and DNA together with state-of-the-art molecular modeling methods, including systematic docking and virtual reality simulations. The results indicate that it is possible for the double-stranded DNA to access the RecA-bound ssDNA while initially retaining its Watson–Crick pairing. They emphasize the importance of RecA L2 loop mobility for both recognition and strand exchange.

    Effects of antiplatelet therapy on stroke risk by brain imaging features of intracerebral haemorrhage and cerebral small vessel diseases: subgroup analyses of the RESTART randomised, open-label trial

    Background: Findings from the RESTART trial suggest that starting antiplatelet therapy might reduce the risk of recurrent symptomatic intracerebral haemorrhage compared with avoiding antiplatelet therapy. Brain imaging features of intracerebral haemorrhage and cerebral small vessel diseases (such as cerebral microbleeds) are associated with greater risks of recurrent intracerebral haemorrhage. We did subgroup analyses of the RESTART trial to explore whether these brain imaging features modify the effects of antiplatelet therapy.

    Scleroderma and related disorders: 223. Long Term Outcome in a Contemporary Systemic Sclerosis Cohort

    Background: We have previously compared outcome in two groups of systemic sclerosis (SSc) patients with disease onset a decade apart and we reported data on 5 year survival and cumulative incidence of organ disease in a contemporary SSc cohort. The present study examines longer term outcome in an additional cohort of SSc followed for 10 years. Methods: We have examined patients with disease onset between years 1995 and 1999 allowing for at least 10 years of follow-up in a group that has characteristics representative for the patients we see in contemporary clinical practice. Results: Of the 398 patients included in the study, 252 (63.3%) had limited cutaneous (lc) SSc and 146 (36.7%) had diffuse cutaneous (dc) SSc. The proportion of male patients was higher among the dcSSc group (17.1% v 9.9%, p = 0.037) while the mean age of onset was significantly higher among lcSSc patients (50 ± 13 v 46 ± 13 years ± SD, p = 0.003). During a 10 year follow-up from disease onset, 45% of the dcSSc and 21% of the lcSSc subjects developed clinically significant pulmonary fibrosis, p < 0.001. Among them approximately half reached the endpoint within the first 3 years (23% of dcSSc and 10% of lcSSc) and over three quarters within the first 5 years (34% and 16% respectively). There was a similar incidence of pulmonary hypertension (PH) in the two subsets with a steady rate of increase over time. At 10 years 13% of dcSSc and 15% of lcSSc subjects had developed PH (p=0.558), with the earliest cases observed within the first 2 years of disease. Comparison between subjects who developed PH in the first and second 5 years from disease onset demonstrated no difference in demographic or clinical characteristics, but 5-year survival from PH onset was better among those who developed this complication later in their disease (49% v 24%), with a strong trend towards statistical significance (p = 0.058). 
Incidence of SSc renal crisis (SRC) was significantly higher among the dcSSc patients (12% v 4% in lcSSc, p = 0.002). As previously observed, the rate of development of SRC was highest in the first 3 years of disease: 10% in dcSSc and 3% in lcSSc. All incidences of clinically important cardiac disease developed in the first 5 years from disease onset (7% in dcSSc v 1% in lcSSc, p < 0.001) and remained unchanged at 10 years. As expected, 10-year survival among lcSSc subjects was significantly higher (81%) compared to that of dcSSc patients (70%, p = 0.006). Interestingly, although over the first 5 years the death rate was much higher in the dcSSc cohort (16% v 6% in lcSSc), over the following years it became very similar for both subsets (14% and 13% between years 5 and 10, and 18% and 17% between years 10 and 15 for dcSSc and lcSSc respectively). Conclusions: Even though dcSSc patients have a higher incidence of most organ complications compared to lcSSc subjects, their worse survival is mainly due to a higher early mortality rate. The mortality rate after the first 5 years of disease becomes comparable in the two disease subsets. Disclosure statement: The authors have declared no conflicts of interest.