CoVDB: a comprehensive database for comparative analysis of coronavirus genes and genomes
The recent SARS epidemic has boosted interest in the discovery of novel human and animal coronaviruses. By July 2007, more than 3000 coronavirus sequence records, including 264 complete genomes, were available in GenBank. The number of coronavirus species with complete genomes available increased from 9 in 2003 to 25 in 2007, of which six, including coronavirus HKU1, bat SARS coronavirus, group 1 bat coronavirus HKU2, and groups 2c and 2d coronaviruses, were sequenced by our laboratory. To overcome the problems we encountered in the existing databases during comparative sequence analysis, we built a comprehensive database, CoVDB (http://covdb.microbiology.hku.hk), of annotated coronavirus genes and genomes. CoVDB provides a convenient platform for rapid and accurate batch sequence retrieval, the cornerstone and bottleneck of comparative gene or genome analysis. Sequences can be downloaded directly from the website in FASTA format. CoVDB also provides detailed annotation of all coronavirus sequences using a standardized nomenclature system, and overcomes the problems of duplicated and identical sequences in other databases. For complete genomes, a single representative sequence for each species is available for comparative analysis such as phylogenetic studies. With the annotated sequences in CoVDB, more specific BLAST search results can be generated for efficient downstream analysis.
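The duplicate and identical sequences mentioned in the abstract can be illustrated with a small sketch. This is not CoVDB's method, which the abstract does not describe. It only shows one plausible way to group FASTA records that carry the same sequence. The record names and sequences are hypothetical.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Normalize case so identical sequences compare equal.
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def group_identical(records: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each distinct sequence to the headers of all records sharing it."""
    groups: Dict[str, List[str]] = {}
    for header, seq in records:
        groups.setdefault(seq, []).append(header)
    return groups


# Hypothetical records: rec1 and rec2 are identical apart from case and wrapping.
example = """>rec1 hypothetical strain A
ATGGCTAGC
TTAA
>rec2 hypothetical strain B
atggctagcttaa
>rec3 hypothetical strain C
ATGGCTAGCTTAG
"""

groups = group_identical(parse_fasta(example))
for seq, headers in groups.items():
    print(headers[0], "length", len(seq), "duplicates:", headers[1:])
```

Keeping the first header of each group as the representative is an arbitrary choice made here for illustration. A curated database would presumably pick representatives on biological grounds.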
Coronavirus Genomics and Bioinformatics Analysis
The drastic increase in the number of coronaviruses discovered and coronavirus genomes sequenced has given us an unprecedented opportunity to perform genomics and bioinformatics analysis on this family of viruses. Coronaviruses possess the largest genomes (26.4 to 31.7 kb) among all known RNA viruses, with G + C contents varying from 32% to 43%. Variable numbers of small ORFs are present between the various conserved genes (ORF1ab, spike, envelope, membrane and nucleocapsid) and downstream of the nucleocapsid gene in different coronavirus lineages. Phylogenetically, three genera exist: Alphacoronavirus, Betacoronavirus and Gammacoronavirus, with Betacoronavirus consisting of subgroups A, B, C and D. A fourth genus, Deltacoronavirus, which includes bulbul coronavirus HKU11, thrush coronavirus HKU12 and munia coronavirus HKU13, is emerging. Molecular clock analysis using various gene loci estimated the time of the most recent common ancestor of human/civet SARS-related coronavirus to be 1999–2002, with an estimated substitution rate of 4×10⁻⁴ to 2×10⁻² substitutions per site per year. Recombination in coronaviruses was most notable between different strains of murine hepatitis virus (MHV), between different strains of infectious bronchitis virus, between MHV and bovine coronavirus, between feline coronavirus (FCoV) type I and canine coronavirus generating FCoV type II, and between the three genotypes of human coronavirus HKU1 (HCoV-HKU1). Codon usage bias in coronaviruses was observed, with HCoV-HKU1 showing the most extreme bias; cytosine deamination and selection of CpG-suppressed clones are the two major independent biological forces that shape such codon usage bias in coronaviruses.
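The G + C content quoted above is a simple proportion, and a minimal sketch of the calculation follows. The function name and the toy sequence are illustrative assumptions and do not come from the paper. Ambiguous bases such as N are left out of the denominator, which is one common convention among several.

```python
def gc_content(sequence: str) -> float:
    """Return the G + C percentage of a nucleotide sequence.

    Only unambiguous bases (A, C, G, T, U) are counted, so ambiguity
    codes such as N do not dilute the result.
    """
    counted = [base for base in sequence.upper() if base in "ACGTU"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Toy fragment, not a real coronavirus sequence.
toy = "ATGTTTATTTTCTTATTATTTCTTACTCTCACTAGTGGTAGT"
print(f"{gc_content(toy):.1f}% G+C")
```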
PacBio But Not Illumina Technology Can Achieve Fast, Accurate and Complete Closure of the High GC, Complex Burkholderia pseudomallei Two-Chromosome Genome
Although PacBio third-generation sequencers have improved the read lengths of genome sequencing which facilitates the assembly of complete genomes, no study has reported success in using PacBio data alone to completely sequence a two-chromosome bacterial genome from a single library in a single run. Previous studies using earlier versions of sequencing chemistries have at most been able to finish bacterial genomes containing only one chromosome with de novo assembly. In this study, we compared the robustness of PacBio RS II, using one SMRT cell and the latest P6-C4 chemistry, with Illumina HiSeq 1500 in sequencing the genome of Burkholderia pseudomallei, a bacterium which contains two large circular chromosomes, very high G+C content of 68–69%, highly repetitive regions and substantial genomic diversity, and represents one of the largest and most complex bacterial genomes sequenced, using a reference genome generated by hybrid assembly using PacBio and Illumina datasets with subsequent manual validation. Results showed that PacBio data with de novo assembly, but not Illumina, was able to completely sequence the B. pseudomallei genome without any gaps or mis-assemblies. The two large contigs of the PacBio assembly aligned unambiguously to the reference genome, sharing >99.9% nucleotide identities. Conversely, Illumina data assembled using three different assemblers resulted in fragmented assemblies (201–366 contigs), sharing only 92.2–100% and 92.0–100% nucleotide identities to chromosomes I and II reference sequences, respectively, with no indication that the B. pseudomallei genome consisted of two chromosomes with four copies of ribosomal operons. Among all assemblies, the PacBio assembly recovered the highest number of core and virulence proteins, and housekeeping genes based on whole-genome multilocus sequence typing (wgMLST). Most notably, assembly solely based on PacBio outperformed even hybrid assembly using both PacBio and Illumina datasets. 
The hybrid approach generated only 74 contigs, while the PacBio data alone with de novo assembly achieved complete closure of the two-chromosome B. pseudomallei genome without additional costly bench work and further sequencing. PacBio RS II using P6-C4 chemistry is highly robust and cost-effective and should be the platform of choice for sequencing bacterial genomes, particularly those that are well known to be difficult to sequence.
Modeling of Exoplanet Atmospheres
Spectrally characterizing exoplanet atmospheres will be one of the fastest moving astronomical disciplines in the years to come. In particular, the upcoming James Webb Space Telescope (JWST) will provide spectral measurements from the near- to mid-infrared of unprecedented precision. With other next-generation instruments on the horizon, it is crucial to possess the tools necessary for interpreting observations. To this end I wrote the petitCODE, which solves for the self-consistent atmospheric structures of exoplanets, assuming chemical and radiative-convective equilibrium. The code includes scattering, models clouds, and outputs the planet’s observable emission and transmission spectra. In addition, I constructed a spectral retrieval code, which derives the full posterior probability distribution of atmospheric parameters from observations. I used petitCODE to systematically study the atmospheres of hot Jupiters and found, e.g., that their structures depend strongly on the type of their host stars. Moreover, I found that C/O ratios around unity can lead to atmospheric inversions. Next, I produced synthetic observations of prime exoplanet targets for JWST, and studied how well we will be able to distinguish various atmospheric scenarios. Finally, I verified the implementation of my retrieval code using mock JWST observations.
Unraveling the Molecular Basis of Temperature-Dependent Genetic Regulation in Penicillium marneffei
Penicillium marneffei is an opportunistic fungal pathogen endemic in Southeast Asia, causing lethal systemic infections in immunocompromised patients. P. marneffei grows in a mycelial form at the ambient temperature of 25°C and transitions to a yeast form at 37°C. The ability to alternate between the mycelial and yeast forms at different temperatures, namely, thermal dimorphism, has long been considered critical for the pathogenicity of P. marneffei, yet the underlying genetic mechanisms remain elusive. Here we employed high-throughput sequencing to unravel global transcriptional profiles of P. marneffei PM1 grown at 25 and 37°C. Among ∼11,000 protein-coding genes, 1,447 were overexpressed and 1,414 were underexpressed at 37°C. Counterintuitively, heat-responsive genes, predicted in P. marneffei through sequence comparison, did not tend to be overexpressed at 37°C. These results suggest that P. marneffei may take a distinct strategy of genetic regulation at the elevated temperature; the current knowledge concerning fungal heat response, based on studies of model fungal organisms, may not be applicable to P. marneffei. Our results further showed that the tandem repeat sequences (TRSs) are overrepresented in coding regions of P. marneffei genes, and TRS-containing genes tend to be overexpressed at 37°C. Furthermore, genomic sequences and expression data were integrated to characterize gene clusters, multigene families, and species-specific genes of P. marneffei. In sum, we present an integrated analysis and a comprehensive resource toward a better understanding of temperature-dependent genetic regulation in P. marneffei
Discovery and Genomic Characterization of a Novel Ovine Partetravirus and a New Genotype of Bovine Partetravirus
Partetravirus is a recently described group of animal parvoviruses that includes the human partetravirus, bovine partetravirus and porcine partetravirus (previously known as human parvovirus 4, bovine hokovirus and porcine hokovirus, respectively). In this report, we describe the discovery and genomic characterization of partetraviruses in bovine and ovine samples from China. These partetraviruses were detected by PCR in 1.8% of bovine liver samples, 66.7% of ovine liver samples and 71.4% of ovine spleen samples. One of the bovine partetraviruses detected in the present samples is phylogenetically distinct from previously reported bovine partetraviruses and likely represents a novel genotype. The ovine partetravirus is a novel partetravirus and is phylogenetically most closely related to the bovine partetraviruses. The genome organization is conserved amongst these viruses, including the presence of a putative transmembrane protein encoded by an overlapping reading frame in ORF2. Results from the present study provide further support for the classification of partetraviruses as a separate genus in Parvovirinae.
Immunoassays Based on Penicillium marneffei Mp1p Derived from Pichia pastoris Expression System for Diagnosis of Penicilliosis
BACKGROUND: Penicillium marneffei is a dimorphic fungus endemic in Southeast Asia. It can cause fatal penicilliosis in humans, particularly in HIV-infected people. Diagnosis of this infection is difficult because its clinical manifestations are not distinctive. Specialized laboratory tests are necessary to establish a definitive diagnosis for successful management. We have demonstrated previously that a cell wall mannoprotein Mp1p, abundant in P. marneffei, is a potential biomarker for diagnosis of P. marneffei infections. In the present study, we describe immunoassays based on Mp1p derived from the yeast Pichia pastoris expression system. METHODOLOGY/PRINCIPAL FINDINGS: We generated monoclonal antibodies (MAbs) and rabbit polyclonal antibodies (PAbs) against Mp1p expressed in P. pastoris. Subsequently, we developed two Mp1p antigen capture ELISAs which employed MAbs for both the capture and detecting antibodies (MAb-MAb pair) or PAbs and MAbs as the capture and detecting antibodies (PAbs-MAb pair) respectively. The two Mp1p antigen ELISAs detected Mp1p specifically in cultures of P. marneffei yeast phase at 37–40°C and had no cross-reaction with other tested pathogenic fungi. The sensitivities and specificities of the two antigen assays were found to be 55% (11/20) and 99.6% (538/540) for MAb-MAb Mp1p ELISA, and 75% (15/20) and 99.4% (537/540) for PAbs-MAb Mp1p ELISA performed using 20 sera with culture-confirmed penicilliosis, and 540 control sera from 15 other mycosis patients and 525 healthy donors. Meanwhile, we also developed an anti-Mp1p IgG antibody ELISA with an evaluated sensitivity of 30% (6/20) and a specificity of 98.5% (532/540) using the same sera. Furthermore, combining the results of Mp1p antigen and antibody detection improved the sensitivity of diagnosis to 100% (20/20). CONCLUSIONS/SIGNIFICANCE: Simultaneous detection of antigen and antibody using the immunoassays based on Mp1p derived from P. pastoris greatly improves detection sensitivity. The procedures should be useful for the routine diagnosis of penicilliosis.
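The reported sensitivities and specificities follow directly from the counts given in the abstract, and the short sketch below recomputes them as a check. The counts are taken from the text. The function and variable names are illustrative assumptions. The combined 100% (20/20) sensitivity is not recomputed, because the abstract does not give per-serum results.

```python
def percentage(hits: int, total: int) -> float:
    """Return hits / total as a percentage."""
    return 100.0 * hits / total


# (true positives, patient sera, true negatives, control sera), from the abstract.
assays = {
    "MAb-MAb Mp1p antigen ELISA": (11, 20, 538, 540),
    "PAbs-MAb Mp1p antigen ELISA": (15, 20, 537, 540),
    "anti-Mp1p IgG antibody ELISA": (6, 20, 532, 540),
}

results = {}
for name, (tp, patients, tn, controls) in assays.items():
    sensitivity = percentage(tp, patients)   # positives among confirmed cases
    specificity = percentage(tn, controls)   # negatives among control sera
    results[name] = (round(sensitivity, 1), round(specificity, 1))
    print(f"{name}: sensitivity {sensitivity:.1f}%, specificity {specificity:.1f}%")
```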
Integrating Communities of Practice in Technology Development Projects
Technology development projects usually benefit when knowledge and expertise are drawn from a variety of sources, including potential users. Orchestrating the involvement of people from disparate groups is a crucial task for project managers. It requires finding a balance between differentiation, when teams work in isolation, and integration, when groups come together to exchange knowledge. This article argues that a “community of practice” perspective can help project managers to achieve this balance, by drawing attention to the assumptions, interests, skills, and formal and tacit knowledge of the different groups involved. Successful integration can be achieved by ensuring that the developing technology is comprehensible to all the groups concerned, and making sure that it satisfies their various interests
Mutational processes molding the genomes of 21 breast cancers
All cancers carry somatic mutations. The patterns of mutation in cancer genomes reflect the DNA damage and repair processes to which cancer cells and their precursors have been exposed. To explore these mechanisms further, we generated catalogs of somatic mutation from 21 breast cancers and applied mathematical methods to extract mutational signatures of the underlying processes. Multiple distinct single- and double-nucleotide substitution signatures were discernible. Cancers with BRCA1 or BRCA2 mutations exhibited a characteristic combination of substitution mutation signatures and a distinctive profile of deletions. Complex relationships between somatic mutation prevalence and transcription were detected. A remarkable phenomenon of localized hypermutation, termed "kataegis," was observed. Regions of kataegis differed between cancers but usually colocalized with somatic rearrangements. Base substitutions in these regions were almost exclusively of cytosine at TpC dinucleotides. The mechanisms underlying most of these mutational signatures are unknown. However, a role for the APOBEC family of cytidine deaminases is proposed