
    A comprehensive functional map of the hepatitis C virus genome provides a resource for probing viral proteins.

    Pairing high-throughput sequencing technologies with high-throughput mutagenesis enables genome-wide investigations of pathogenic organisms. Knowledge of the specific functions of protein domains encoded by the genome of the hepatitis C virus (HCV), a major human pathogen that contributes to liver disease worldwide, remains limited to insight from small-scale studies. To enhance the capabilities of HCV researchers, we have obtained a high-resolution functional map of the entire viral genome by combining transposon-based insertional mutagenesis with next-generation sequencing. We generated a library of 8,398 mutagenized HCV clones, each containing one 15-nucleotide sequence inserted at a unique genomic position. We passaged this library in hepatic cells, recovered virus pools, and simultaneously assayed the abundance of mutant viruses in each pool by next-generation sequencing. To illustrate the validity of the functional profile, we compared the genetic footprints of viral proteins with previously solved protein structures. Moreover, we show the utility of these genetic footprints in the identification of candidate regions for epitope tag insertion. In a second application, we screened the genetic footprints for phenotypes that reflected defects in later steps of the viral life cycle. We confirmed that viruses with insertions in a region of the nonstructural protein NS4B had a defect in infectivity while maintaining genome replication. Overall, our genome-wide HCV mutant library and the genetic footprints obtained by high-resolution profiling represent valuable new resources for the research community that can direct the attention of investigators toward unidentified roles of individual protein domains.
    Importance: Our insertional mutagenesis library provides a resource that illustrates the effects of relatively small insertions on local protein structure and HCV viability. We have also generated complementary resources, including a website (http://hangfei.bol.ucla.edu) and a panel of epitope-tagged mutant viruses that should enhance the research capabilities of investigators studying HCV. Researchers can now detect epitope-tagged viral proteins by established antibodies, which will allow biochemical studies of HCV proteins for which antibodies are not readily available. Furthermore, researchers can now quickly look up genotype-phenotype relationships and base further mechanistic studies on the residue-by-residue information from the functional profile. More broadly, this approach offers a general strategy for the systematic functional characterization of viruses on the genome scale.
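
The abstract does not say how mutant abundance was turned into a functional profile. As a minimal sketch of one common way to score such a screen, the Python below compares each insertion position's read frequency before and after passaging and reports a log2 ratio. The function name, the pseudocount, and the toy read counts are all assumptions for illustration, not the authors' method or data.

```python
import math


def log2_enrichment(input_counts, output_counts, pseudocount=1.0):
    """Log2 ratio of post-passage to pre-passage read frequency per position.

    input_counts / output_counts map an insertion position to a read count.
    The pseudocount (an assumed smoothing choice) avoids division by zero
    for mutants that vanish from the passaged pool.
    """
    positions = set(input_counts) | set(output_counts)
    n_in = sum(input_counts.values()) + pseudocount * len(positions)
    n_out = sum(output_counts.values()) + pseudocount * len(positions)
    scores = {}
    for pos in positions:
        f_in = (input_counts.get(pos, 0) + pseudocount) / n_in
        f_out = (output_counts.get(pos, 0) + pseudocount) / n_out
        scores[pos] = math.log2(f_out / f_in)
    return scores


# Hypothetical read counts keyed by insertion position (not real data).
input_counts = {100: 50, 200: 50, 300: 50}
output_counts = {100: 100, 200: 50, 300: 0}
scores = log2_enrichment(input_counts, output_counts)
for pos in sorted(scores):
    print(pos, round(scores[pos], 2))
```

Under this scoring, a strongly negative value marks a position where an insertion is poorly tolerated, and a value near zero marks a position where the mutant keeps pace with the pool.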

    OREST: the online resource for EST analysis

    The generation of expressed sequence tag (EST) libraries offers an affordable approach to investigating organisms for which no genome sequence is available. OREST (http://mips.gsf.de/genre/proj/orest/index.html) is a server-based EST analysis pipeline that allows the rapid analysis of large numbers of ESTs or cDNAs from mammals and fungi. To assign the ESTs to genes or proteins, OREST maps DNA sequences to reference datasets of gene products and, in a second step, to complete genome sequences. Mapping against genome sequences recovers an additional 13% of EST data, which would otherwise escape further analysis. To enable functional analysis of the datasets, ESTs are functionally annotated using the hierarchical FunCat annotation scheme as well as GO annotation terms. OREST also allows the association of gene products with diseases to be predicted by Morbid Map (OMIM) classification. A statistical analysis of the results is possible with the included PROMPT software, which provides information about enrichment and depletion of functional and disease annotation terms. OREST was successfully applied to the identification and functional characterization of more than 3000 EST sequences of the common marmoset monkey (Callithrix jacchus) as part of an international collaboration.
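
The abstract does not state which statistical test PROMPT applies. As a sketch of how enrichment or depletion of an annotation term is often assessed, the Python below uses a hypergeometric tail probability. The function names and the toy numbers are assumptions for illustration and do not describe PROMPT's actual implementation.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(X = k): k annotated items in a sample of n, drawn from N items
    of which K carry the annotation term."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def enrichment_p(k, N, K, n):
    """Upper tail P(X >= k): small values suggest the term is enriched."""
    upper = min(K, n)
    return sum(hypergeom_pmf(i, N, K, n) for i in range(k, upper + 1))


def depletion_p(k, N, K, n):
    """Lower tail P(X <= k): small values suggest the term is depleted."""
    lower = max(0, n - (N - K))
    return sum(hypergeom_pmf(i, N, K, n) for i in range(lower, k + 1))


# Toy example: 10 reference genes, 5 carry the term, and all 5 genes in
# the EST-derived sample carry it.
print(enrichment_p(5, 10, 5, 5))
print(depletion_p(5, 10, 5, 5))
```

In practice such p-values would also need correction for testing many terms at once; that step is omitted here.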

    DCU 250 Arabic dependency bank: an LFG gold standard resource for the Arabic Penn treebank

    This paper describes the construction of a dependency bank gold standard for Arabic, the DCU 250 Arabic Dependency Bank (DCU 250), based on the Arabic Penn Treebank Corpus (ATB) (Bies and Maamouri, 2003; Maamouri and Bies, 2004) within the theoretical framework of Lexical Functional Grammar (LFG). For parsing and automatically extracting grammatical and lexical resources from treebanks, it is necessary to evaluate against established gold standard resources. Gold standards for various languages have been developed, but to our knowledge, such a resource has not yet been constructed for Arabic. The construction of the DCU 250 marks the first step towards the creation of an automatic LFG f-structure annotation algorithm for the ATB, and towards the extraction of Arabic grammatical and lexical resources.

    Morphological annotation of Korean with Directly Maintainable Resources

    This article describes an exclusively resource-based method of morphological annotation of written Korean text. Korean is an agglutinative language. Our annotator is designed to process text before the operation of a syntactic parser. In its present state, it annotates one-stem words only. The output is a graph of morphemes annotated with accurate linguistic information. The granularity of the tagset is 3 to 5 times higher than that of usual tagsets. A comparison with a reference annotated corpus showed that it achieves 89% recall without any corpus training. The language resources used by the system are lexicons of stems, transducers of suffixes, and transducers for the generation of allomorphs. All can be easily updated, which allows users to control how the performance of the system evolves. It has been claimed that morphological annotation of Korean text could only be performed by a morphological analysis module accessing a lexicon of morphemes. We show that it can also be performed directly with a lexicon of words and without applying morphological rules at annotation time, which speeds up annotation to 1,210 words/s. The lexicon of words is obtained from the maintainable language resources through a fully automated compilation process.
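
The idea of compiling maintainable resources into a lexicon of full words, so that annotation is a plain lookup with no rules applied at annotation time, can be sketched as follows. This is a toy illustration only: the stem list, the suffix table, the tag names, and both function names are hypothetical, and the system described above uses transducers rather than Python dictionaries.

```python
# Hypothetical toy resources: a stem lexicon and, per category, the suffix
# sequences that may follow a stem, each with its morpheme analysis.
STEMS = {"학교": "N", "먹": "V"}
SUFFIXES = {
    "N": [("", []), ("에", ["에/LOC"])],
    "V": [("다", ["다/DECL"]), ("었다", ["었/PAST", "다/DECL"])],
}


def compile_word_lexicon(stems, suffixes):
    """Expand stems and suffix sequences into a full-form word lexicon."""
    lexicon = {}
    for stem, category in stems.items():
        for surface, tags in suffixes[category]:
            analysis = [f"{stem}/{category}"] + tags
            lexicon.setdefault(stem + surface, []).append(analysis)
    return lexicon


def annotate(words, lexicon):
    """Annotate by dictionary lookup; unknown words get an empty list."""
    return [(word, lexicon.get(word, [])) for word in words]


lexicon = compile_word_lexicon(STEMS, SUFFIXES)
for word, analyses in annotate(["학교에", "먹었다"], lexicon):
    print(word, analyses)
```

The compilation step is run once, offline; updating a stem or suffix entry and recompiling is what keeps the resources directly maintainable while the annotator itself stays a lookup.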

    Automatic treebank-based acquisition of Arabic LFG dependency structures

    A number of papers have reported on methods for the automatic acquisition of large-scale, probabilistic LFG-based grammatical resources from treebanks for English (Cahill et al., 2002; Cahill et al., 2004), German (Cahill et al., 2003), Chinese (Burke, 2004; Guo et al., 2007), Spanish (O’Donovan, 2004; Chrupala and van Genabith, 2006) and French (Schluter and van Genabith, 2008). Here, we extend the LFG grammar acquisition approach to Arabic and the Penn Arabic Treebank (ATB) (Maamouri and Bies, 2004), adapting and extending the methodology of Cahill et al. (2004), originally developed for English. Arabic is challenging because of its morphological richness and syntactic complexity. Currently 98% of ATB trees (without FRAG and X) produce a covering and connected f-structure. We conduct a qualitative evaluation of our annotation against a gold standard and achieve an f-score of 95%.
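
The abstract reports an f-score against a gold standard without giving the formula. A minimal sketch follows, assuming the usual practice of flattening each f-structure into dependency triples and comparing the predicted set with the gold set. The triple format, the function name, and the toy example are assumptions for illustration, not the paper's evaluation data.

```python
def precision_recall_f(gold, predicted):
    """Precision, recall, and balanced f-score over sets of triples."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Hypothetical (relation, head, dependent) triples for one sentence.
gold = {
    ("subj", "read", "student"),
    ("obj", "read", "book"),
    ("det", "book", "the"),
    ("tense", "read", "past"),
}
predicted = {
    ("subj", "read", "student"),
    ("obj", "read", "book"),
    ("tense", "read", "past"),
}
p, r, f = precision_recall_f(gold, predicted)
print(round(p, 3), round(r, 3), round(f, 3))
```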

    A Domain Analysis to Specify Design Defects and Generate Detection Algorithms

    Quality experts often need to identify design defects in software systems: recurring design problems that hinder development and maintenance. Consequently, several defect detection approaches and tools have been proposed in the literature. However, we are not aware of any approach that defines and reifies the process of generating detection algorithms from the existing textual descriptions of defects. In this paper, we introduce an approach to automate the generation of detection algorithms from specifications written using a domain-specific language. The domain-specific language is defined from a thorough domain analysis. We specify several design defects, automatically generate detection algorithms using templates, and validate the generated detection algorithms in terms of precision and recall on Xerces v2.7.0, an open-source object-oriented system.

    Antibody fragments as probe in biosensor development

    Today's proteomic analyses are generating increasing numbers of biomarkers, making it essential to possess highly specific probes able to recognize those targets. Antibodies are considered the first choice as molecular recognition units because of their target specificity and affinity, which make them excellent probes in biosensor development. However, several problems, such as difficult directional immobilization, unstable behavior, loss of specificity, and steric hindrance, may arise from using these large molecules. Fortunately, protein engineering techniques offer designed antibody formats suitable for biomarker analysis. Strategies for minimizing antibodies into Fab fragments, scFv, or even single-domain antibody fragments such as VH, VL, or VHHs are reviewed. Not only the size of the probe but also other issues, such as the choice of immobilization tag, the type of solid support, and probe stability, are of critical importance in assay development for biosensing. In this respect, multiple approaches to specifically orient and couple antibody fragments in a generic one-step procedure directly on a biosensor substrate are discussed.