
    preAssemble: a tool for automatic sequencer trace data processing

    BACKGROUND: Trace or chromatogram files (raw data) are produced by automatic nucleic acid sequencing equipment, or sequencers. Each file contains information that specialised software can interpret to reveal the sequence (base calling). This is done by the sequencer's proprietary software or by publicly available programs. Depending on the size of a sequencing project, the number of trace files can vary from just a few to thousands. Sequencing quality assessment on various criteria is important at the stage preceding clustering and contig assembly. Two major publicly available packages, Phred and Staden, are used by preAssemble to perform sequence quality processing. RESULTS: The preAssemble pre-assembly sequence processing pipeline has been developed for small- to large-scale automatic processing of DNA sequencer chromatogram (trace) data. The Staden Package Pregap4 module and the base-calling program Phred are utilized in the pipeline, which produces detailed and self-explanatory output that can be displayed with a web browser. preAssemble can be used successfully with very little previous experience; however, options for parameter tuning are provided for advanced users. preAssemble runs under UNIX and Linux operating systems. It is available for download and runs as stand-alone software. It can also be accessed on the Norwegian Salmon Genome Project web site, where preAssemble jobs can be run on the project server. CONCLUSION: preAssemble is a tool for quality assessment of sequences generated by automatic sequencing equipment. preAssemble is flexible, since both interactive jobs on the preAssemble server and the stand-alone downloadable version are available. Virtually no previous experience is necessary to run a default preAssemble job, while options for parameter tuning are provided for advanced users. Consequently, preAssemble can be used as efficiently for just a few trace files as for large-scale sequence processing.

    An EST-based approach for identifying genes expressed in the intestine and gills of pre-smolt Atlantic salmon (Salmo salar)

    BACKGROUND: The Atlantic salmon is an important aquaculture species and a biologically interesting one, since it spawns in fresh water and develops through several stages before becoming a smolt, the stage at which it migrates to the sea to feed. The dramatic change of habitat requires physiological, morphological and behavioural changes to prepare the salmon for its new environment. These changes are called the parr-smolt transformation or smoltification, and pre-adapt the salmon for survival and growth in the marine environment. The development of hypo-osmotic regulatory ability plays an important part in facilitating the transition from rivers to the sea. The physiological mechanisms behind the developmental changes are largely unknown. An understanding of the transformation process will be vital to the future of the aquaculture industry. Knowledge of which genes are expressed prior to the smoltification process is an important basis for further studies. RESULTS: In all, 2974 unique sequences, consisting of 779 contigs and 2195 singlets, were generated for Atlantic salmon from two cDNA libraries constructed from the gills and the intestine, accession numbers [Genbank: CK877169-CK879929, CK884015-CK886537 and CN181112-CN181464]. Nearly 50% of the sequences were assigned putative functions because they showed similarity to known genes, mostly from other species, in one or more of the databases used. The Swiss-Prot database returned significant hits for 1005 sequences. These could be assigned predicted gene products, and 967 were annotated using Gene Ontology (GO) terms for molecular function, biological process and/or cellular component, employing an annotation transfer procedure. CONCLUSION: This paper describes the construction of two cDNA libraries from pre-smolt Atlantic salmon (Salmo salar) and the subsequent EST sequencing, clustering and assignment of putative function to 1005 genes expressed in the gills and/or intestine.

    PARALIGN: rapid and sensitive sequence similarity searches powered by parallel computing technology

    PARALIGN is a rapid and sensitive similarity search tool for the identification of distantly related sequences in both nucleotide and amino acid sequence databases. Two algorithms are implemented: accelerated Smith–Waterman and ParAlign. The ParAlign algorithm is similar to Smith–Waterman in sensitivity, while being as quick as BLAST for protein searches. A form of parallel computing technology known as multimedia technology, which is available in modern processors but rarely used by other bioinformatics software, has been exploited to achieve the high speed. The software is also designed to run efficiently on computer clusters using the message-passing interface standard. A public search service powered by a large computer cluster has been set up and is freely available at , where the major public databases can be searched. The software can also be downloaded free of charge for academic use.

    Parity-violating macroscopic force between chiral molecules and source mass

    A theory concerning a non-zero macroscopic chirality-dependent force between a source mass and homochiral molecules due to the exchange of light particles is presented in this paper. This force is proposed to have opposite sign for molecules with opposite chirality. Using the central field approximation, we calculate this force between a copper block and a vessel of chiral molecules (methyl phenyl carbinol nitrite). The magnitude of the force is estimated with the published limits of the scalar and pseudo-scalar coupling constants. Based on our theoretical model, this force may violate the equivalence principle when the homochiral molecules are used as the test masses. Comment: 10 pages, 1 figure

    Two-step method for precise calculation of core properties in molecules

    Precise calculations of core properties in heavy-atom systems, which are described by operators heavily concentrated in atomic cores, such as hyperfine structure and P,T-parity nonconservation effects, usually require accounting for relativistic effects. Unfortunately, a completely relativistic treatment of molecules containing heavy elements is very demanding already at the stages of calculation and transformation of two-electron integrals with a basis set of four-component spinors. In turn, relativistic effective core potential (RECP) calculations of valence (spectroscopic, chemical etc.) properties of molecules are very popular because the RECP method allows one to treat the correlation and relativistic effects for the valence electrons of a molecule quite satisfactorily and to reduce the computational effort significantly. The valence molecular spinors are usually smoothed in atomic cores and, as a result, direct calculation of electronic densities near heavy nuclei is impossible. In the paper, the methods of nonvariational and variational one-center restoration of the correct shapes of four-component spinors in atomic cores after a two-component RECP calculation of a molecule are discussed. Their efficiency is illustrated in correlation calculations of hyperfine structure and parity nonconservation effects in the heavy-atom molecules YbF, BaF, TlF, and PbO. Comment: 20 pages, 3 tables, lecture at the Fock school-conference (Novgorod-the-Great, Russia, April 2004)

    Structural insight into repair of alkylated DNA by a new superfamily of DNA glycosylases comprising HEAT-like repeats

    3-methyladenine DNA glycosylases initiate repair of cytotoxic and promutagenic alkylated bases in DNA. We demonstrate by comparative modelling that Bacillus cereus AlkD belongs to a new, fifth, structural superfamily of DNA glycosylases with an alpha–alpha superhelix fold comprising six HEAT-like repeats. The structure reveals a wide, positively charged groove, including a putative base recognition pocket. This groove appears to be suitable for the accommodation of double-stranded DNA with a flipped-out alkylated base. Site-specific mutagenesis within the recognition pocket identified several residues essential for enzyme activity. The results suggest that the aromatic side chain of a tryptophan residue recognizes electron-deficient alkylated bases through stacking interactions, while an interacting aspartate–arginine pair is essential for removal of the damaged base. A structural model of AlkD bound to DNA with a flipped-out purine moiety gives insight into the catalytic machinery for this new class of DNA glycosylases

    The DIRAC code for relativistic molecular calculations

    DIRAC is a freely distributed general-purpose program system for one-, two-, and four-component relativistic molecular calculations at the level of Hartree–Fock, Kohn–Sham (including range-separated theory), multiconfigurational self-consistent-field, multireference configuration interaction, electron propagator, and various flavors of coupled cluster theory. At the self-consistent-field level, a highly original scheme, based on quaternion algebra, is implemented for the treatment of both spatial and time reversal symmetry. DIRAC features a very general module for the calculation of molecular properties that to a large extent may be defined by the user and further analyzed through a powerful visualization module. It allows for the inclusion of environmental effects through three different classes of increasingly sophisticated embedding approaches: the implicit solvation polarizable continuum model, the explicit polarizable embedding model, and the frozen density embedding model.

    Author affiliations:
    Saue, Trond. Université Paul Sabatier, France; Centre National de la Recherche Scientifique, France
    Bast, Radovan. UiT The Arctic University of Norway, Norway
    Gomes, André Severo Pereira. University of Lille, France; Centre National de la Recherche Scientifique, France
    Jensen, Hans Jorgen Aa. University of Southern Denmark, Denmark
    Visscher, Lucas. Vrije Universiteit Amsterdam, Netherlands
    Aucar, Ignacio Agustín. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Nordeste. Instituto de Modelado e Innovación Tecnológica. Universidad Nacional del Nordeste. Facultad de Ciencias Exactas Naturales y Agrimensura. Instituto de Modelado e Innovación Tecnológica, Argentina; Universidad Nacional del Nordeste. Facultad de Ciencias Exactas y Naturales y Agrimensura. Departamento de Física, Argentina
    Di Remigio, Roberto. UiT The Arctic University of Norway, Norway
    Dyall, Kenneth G. Dirac Solutions, United States
    Eliav, Ephraim. Universitat Tel Aviv, Israel
    Fasshauer, Elke. Aarhus University. Department of Bioscience, Denmark
    Fleig, Timo. Université Paul Sabatier, France; Centre National de la Recherche Scientifique, France
    Halbert, Loïc. Centre National de la Recherche Scientifique, France; University of Lille, France
    Hedegård, Erik Donovan. Lund University, Sweden
    Helmich-Paris, Benjamin. Max-Planck-Institut für Kohlenforschung, Germany
    Ilias, Miroslav. Matej Bel University, Slovakia
    Jacob, Christoph R. Technische Universität Braunschweig, Germany
    Knecht, Stefan. ETH Zürich, Laboratorium für Physikalische Chemie, Switzerland
    Laerdahl, Jon K. Oslo University Hospital, Norway
    Vidal, Marta L. Department of Chemistry, Denmark
    Nayak, Malaya K. Bhabha Atomic Research Centre, India
    Olejniczak, Malgorzata. University of Warsaw, Poland
    Olsen, Jógvan Magnus Haugaard. UiT The Arctic University of Norway, Norway
    Pernpointner, Markus. Kybeidos GmbH, Germany
    Senjean, Bruno. Universiteit Leiden, Netherlands
    Shee, Avijit. Department of Chemistry, United States
    Sunaga, Ayaki. Tokyo Metropolitan University, Japan
    van Stralen, Joost N. P. Vrije Universiteit Amsterdam, Netherlands

    Extended analysis of a genome-wide association study in primary sclerosing cholangitis detects multiple novel risk loci.

    A limited number of genetic risk factors have been reported in primary sclerosing cholangitis (PSC). To discover further genetic susceptibility factors for PSC, we followed up on a second tier of single nucleotide polymorphisms (SNPs) from a genome-wide association study (GWAS). We analyzed 45 SNPs in 1221 PSC cases and 3508 controls. The association results from the replication analysis and the original GWAS (715 PSC cases and 2962 controls) were combined in a meta-analysis comprising 1936 PSC cases and 6470 controls. We performed an analysis of bile microbial community composition in 39 PSC patients by 16S rRNA sequencing. Seventeen SNPs representing 12 distinct genetic loci achieved nominal significance (p(replication) < 0.05) in the replication. The most robust novel association was detected at chromosome 1p36 (rs3748816; p(combined) = 2.1 × 10^-8), where the MMEL1 and TNFRSF14 genes represent potential disease genes. Eight additional novel loci showed suggestive evidence of association (p(repl) < 0.05). FUT2 at chromosome 19q13 (rs602662, p(comb) = 1.9 × 10^-6; rs281377, p(comb) = 2.1 × 10^-6; and rs601338, p(comb) = 2.7 × 10^-6) is notable due to its implication in altered susceptibility to infectious agents. We found that FUT2 secretor status and genotype defined by rs601338 significantly influence biliary microbial community composition in PSC patients. We identify multiple new PSC risk loci by extended analysis of a PSC GWAS. FUT2 genotype needs to be taken into account when assessing the influence of microbiota on biliary pathology in PSC. Funding: Norwegian PSC Research Center; German Ministry of Education and Research (BMBF) through the National Genome Research Network (NGFN); Integrated Research and Treatment Center - Transplantation 01EO0802; PopGen biobank; NIH DK 8496

    Mutational Characterization of the Bile Acid Receptor TGR5 in Primary Sclerosing Cholangitis

    TGR5, the G protein-coupled bile acid receptor 1 (GPBAR1), has been linked to inflammatory pathways as well as bile homeostasis, and could therefore be involved in primary sclerosing cholangitis (PSC), a chronic inflammatory bile duct disease. We aimed to extensively investigate TGR5 sequence variation in PSC, as well as functionally characterize detected variants. Complete resequencing of TGR5 was performed in 267 PSC patients and 274 healthy controls. Six nonsynonymous mutations were identified in addition to 16 other novel single-nucleotide polymorphisms. To investigate the impact of the nonsynonymous variants on TGR5, we created a receptor model and introduced mutated TGR5 constructs into human epithelial cell lines. By using confocal microscopy, flow cytometry and a cAMP-sensitive luciferase assay, five of the nonsynonymous mutations (W83R, V178M, A217P, S272G and Q296X) were found to reduce or abolish TGR5 function. Fine-mapping of the previously reported PSC- and UC-associated locus at chromosome 2q35 in large patient panels revealed an overall association between the TGR5 single-nucleotide polymorphism rs11554825 and PSC (odds ratio = 1.14, 95% confidence interval: 1.03-1.26, p = 0.010) and UC (odds ratio = 1.19, 95% confidence interval: 1.11-1.27, p = 8.5 × 10^-7), but strong linkage disequilibrium precluded demarcation of TGR5 from neighboring genes. Resequencing of TGR5 along with functional investigations of novel variants provided unique insight into an important candidate gene for several inflammatory and metabolic conditions. While significant TGR5 associations were detected in both UC and PSC, further studies are needed to conclusively define the role of TGR5 variation in these diseases.