26,021 research outputs found

    Semantic Integration of Cervical Cancer Data Repositories to Facilitate Multicenter Association Studies: The ASSIST Approach

    The current work addresses the unification of Electronic Health Records related to cervical cancer into a single medical knowledge source, in the context of the EU-funded ASSIST research project. The project aims to facilitate research on cervical precancer and cancer through a system that virtually unifies multiple patient record repositories, physically located in different medical centers/hospitals, thus increasing flexibility by allowing the formation of study groups “on demand” and by recycling patient records in new studies. To this end, ASSIST uses semantic technologies to translate all medical entities (such as patient examination results, history, habits, genetic profile) and represent them in a common form, encoded in the ASSIST Cervical Cancer Ontology. The current paper presents the knowledge elicitation approach followed towards the definition and representation of the disease’s medical concepts and rules that constitute the basis for the ASSIST Cervical Cancer Ontology. The proposed approach constitutes a paradigm for semantic integration of heterogeneous clinical data that may be applicable to other biomedical application domains.

    The Drosophila genome nexus: a population genomic resource of 623 Drosophila melanogaster genomes, including 197 from a single ancestral range population.

    Hundreds of wild-derived Drosophila melanogaster genomes have been published, but rigorous comparisons across data sets are precluded by differences in alignment methodology. The most common approach to reference-based genome assembly is a single round of alignment followed by quality filtering and variant detection. We evaluated variations and extensions of this approach and settled on an assembly strategy that utilizes two alignment programs and incorporates both substitutions and short indels to construct an updated reference for a second round of mapping prior to final variant detection. Utilizing this approach, we reassembled published D. melanogaster population genomic data sets and added unpublished genomes from several sub-Saharan populations. Most notably, we present aligned data from phase 3 of the Drosophila Population Genomics Project (DPGP3), which provides 197 genomes from a single ancestral range population of D. melanogaster (from Zambia). The large sample size, high genetic diversity, and potentially simpler demographic history of the DPGP3 sample will make this a highly valuable resource for fundamental population genetic research. The complete set of assemblies described here, termed the Drosophila Genome Nexus, presently comprises 623 consistently aligned genomes and is publicly available in multiple formats with supporting documentation and bioinformatic tools. This resource will greatly facilitate population genomic analysis in this model species by reducing the methodological differences between data sets

    De novo transcriptome sequencing and SSR markers development for Cedrela balansae C. DC., a native tree species of northwest Argentina

    The endangered Cedrela balansae C.DC. (Meliaceae) is a high-value timber species with great potential for forest plantations that inhabits the tropical forests in Northwestern Argentina. Research on this species is scarce because of the limited genetic and genomic information available. Here, we explored the transcriptome of C. balansae using 454 GS FLX Titanium next-generation sequencing (NGS) technology. Following de novo assembly, we identified 27,111 non-redundant unigenes longer than 200 bp, and considered these transcripts for further downstream analysis. The functional annotation was performed by searching the 27,111 unigenes against the NR-Protein and the Interproscan databases. This analysis revealed 26,977 genes with homology in at least one of the databases analyzed. Furthermore, 7,774 unigenes in 142 different active biological pathways in C. balansae were identified with the KEGG database. Moreover, after in silico analyses, we detected 2,663 simple sequence repeat (SSR) markers. A subset of 70 SSRs related to important “stress tolerance” traits, based on functional annotation evidence, was selected for wet-lab PCR validation in C. balansae and other Cedrela species inhabiting the northwest and northeast of Argentina (C. fissilis, C. saltensis and C. angustifolia). Successful transferability ranged between 77% and 93%, and thanks to this study, 32 polymorphic functional SSRs for all analyzed Cedrela species are now available. The gene catalog and molecular markers obtained here represent a starting point for further research, which will assist genetic breeding programs in the Cedrela genus and will contribute to identifying key populations for its preservation.

    Affiliation: Torales, Susana. Instituto Nacional de Tecnología Agropecuaria. Centro de Investigación de Recursos Naturales. Instituto de Recursos Biológicos; Argentina
    Affiliation: Rivarola, Maximo Lisandro. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Instituto Nacional de Tecnología Agropecuaria. Centro de Investigación en Ciencias Veterinarias y Agronómicas. Instituto de Biotecnología; Argentina
    Affiliation: Gonzalez, Sergio. Instituto Nacional de Tecnología Agropecuaria. Centro de Investigación en Ciencias Veterinarias y Agronómicas. Instituto de Biotecnología; Argentina
    Affiliation: Inza, María Virginia. Instituto Nacional de Tecnología Agropecuaria. Centro de Investigación de Recursos Naturales. Instituto de Recursos Biológicos; Argentina
    Affiliation: Pomponio, María Florencia. Instituto Nacional de Tecnología Agropecuaria. Centro de Investigación de Recursos Naturales. Instituto de Recursos Biológicos; Argentina
    Affiliation: Fernández, Paula. Instituto Nacional de Tecnología Agropecuaria. Centro de Investigación en Ciencias Veterinarias y Agronómicas. Instituto de Biotecnología; Argentina
    Affiliation: Acuña, Cintia Vanesa. Instituto Nacional de Tecnología Agropecuaria. Centro de Investigación en Ciencias Veterinarias y Agronómicas. Instituto de Biotecnología; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
    Affiliation: Zelener, Noga. Instituto Nacional de Tecnología Agropecuaria. Centro de Investigación de Recursos Naturales. Instituto de Recursos Biológicos; Argentina
    Affiliation: Fornes, Luis Fernando. Instituto Nacional de Tecnología Agropecuaria. Centro Regional Tucuman-Santiago del Estero; Argentina
    Affiliation: Hopp, Horacio Esteban. Universidad de Belgrano. Facultad de Ciencias Exactas y Naturales; Argentina. Instituto Nacional de Tecnología Agropecuaria. Centro de Investigación en Ciencias Veterinarias y Agronómicas. Instituto de Biotecnología; Argentina
    Affiliation: Paniego, Norma Beatriz. Instituto Nacional de Tecnología Agropecuaria. Centro de Investigación en Ciencias Veterinarias y Agronómicas. Instituto de Biotecnología; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
    Affiliation: Marcucci Poltri, Susana Noemí. Instituto Nacional de Tecnología Agropecuaria. Centro de Investigación en Ciencias Veterinarias y Agronómicas. Instituto de Biotecnología; Argentin
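
    The in silico SSR detection mentioned in the abstract above can be illustrated with a small regular-expression scan. This is only a sketch of the general idea: the function name `find_ssrs` and the minimum-repeat thresholds are hypothetical choices for illustration, not the tool or parameters the authors used.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    if min_repeats is None:
        # Hypothetical thresholds: motif length -> minimum number of tandem copies.
        min_repeats = {2: 6, 3: 5, 4: 4}
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures a motif; \1{n-1,} demands at least n-1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit (e.g. "AA", "ATAT").
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

if __name__ == "__main__":
    example = "GGC" + "AT" * 7 + "CCG" + "GAA" * 5 + "TC"
    for start, motif, count in find_ssrs(example):
        print(start, motif, count)
```

    A real marker-discovery pipeline would also handle imperfect and compound repeats and design flanking primers; none of that is attempted here.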

    Does the catechol-O-methyltransferase (COMT) Val158Met human polymorphism influence procrastination?

    Genetic studies are shedding light on how the expression of several genes influences neuronal activity and all facets of normal and abnormal human behaviour. Among these, a growing body of information shows that a few key genes regulating the activity of central neurotransmitters have specific roles in cognitive and/or emotional processes, such as ‘procrastination’. We investigated the association of the 5-HTTLPR and COMT Val158Met polymorphisms with students’ procrastination in an academic writing task. Results showed no relationship between procrastination and the 5-HTT polymorphism, but they revealed an association with the COMT Val158Met one. In particular, the presence of the Met158 allele was found to be significantly associated with the tendency to initiate and complete the assigned task. We hypothesize that the role of central monoamines and of dopamine, already identified in impulsive behaviour, extends to procrastination. Since the 158Met allele provides neurons with significantly higher basal dopamine levels when compared to the 158Val allele, our observation suggests that under normal conditions the 158Met allele provides carriers with increased inhibitory control, resulting in an increased tendency to adhere to a planned schedule and therefore reducing procrastination. On the other hand, the Val158 allele may prove more effective in increasing carriers’ performance under stress conditions, namely when the schedule deadline is approaching and dopamine release is increased. This would result in a higher tendency to procrastinate. This hypothesis can readily be tested by applying the experimental approach employed here to various samples of subjects belonging to different categories and extending the analysis to other putative neuron-expressed gene

    Genome-wide association study of male sexual orientation


    Diversity, genetic mapping, and signatures of domestication in the carrot (Daucus carota L.) genome, as revealed by Diversity Arrays Technology (DArT) markers

    Carrot is one of the most economically important vegetables worldwide, but genetic and genomic resources supporting carrot breeding remain limited. We developed a Diversity Arrays Technology (DArT) platform for wild and cultivated carrot and used it to investigate genetic diversity and to develop a saturated genetic linkage map of carrot. We analyzed a set of 900 DArT markers in a collection of plant materials comprising 94 cultivated and 65 wild carrot accessions. The accessions were attributed to three separate groups: wild, Eastern cultivated and Western cultivated. Twenty-seven markers showing signatures for selection were identified. They showed a directional shift in frequency from the wild to the cultivated, likely reflecting diversifying selection imposed in the course of domestication. A genetic linkage map constructed using 188 F2 plants comprised 431 markers with an average distance of 1.1 cM, divided into nine linkage groups. Using previously anchored single nucleotide polymorphisms, the linkage groups were physically attributed to the nine carrot chromosomes. A cluster of markers mapping to chromosome 8 showed significant segregation distortion. Two of the 27 DArT markers with signatures for selection were segregating in the mapping population and were localized on chromosomes 2 and 6. Chromosome 2 was previously shown to carry the Vrn1 gene governing the biennial growth habit essential for cultivated carrot. The results reported here provide background for further research on the history of carrot domestication and identify genomic regions potentially important for modern carrot breeding

    Characterization of the transcriptome, nucleotide sequence polymorphism, and natural selection in the desert adapted mouse Peromyscus eremicus

    As a direct result of intense heat and aridity, deserts are thought to be among the most harsh of environments, particularly for their mammalian inhabitants. Given that osmoregulation can be challenging for these animals, with failure resulting in death, strong selection should be observed on genes related to the maintenance of water and solute balance. One such animal, Peromyscus eremicus, is native to the desert regions of the southwest United States and may live its entire life without oral fluid intake. As a first step toward understanding the genetics that underlie this phenotype, we present a characterization of the P. eremicus transcriptome. We assay four tissues (kidney, liver, brain, testes) from a single individual and supplement this with population level renal transcriptome sequencing from 15 additional animals. We identified a set of transcripts undergoing both purifying and balancing selection based on estimates of Tajima’s D. In addition, we used the branch-site test to identify a transcript—Slc2a9, likely related to desert osmoregulation—undergoing enhanced selection in P. eremicus relative to a set of related non-desert rodents
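
    The Tajima's D statistic mentioned in the abstract above compares two estimates of diversity: the mean number of pairwise differences and the number of segregating sites scaled by a harmonic-number term. The sketch below follows the standard textbook formula for a set of aligned, equal-length sequences; the function name `tajimas_d` and the toy input are assumptions made for illustration, not the authors' pipeline.

```python
from itertools import combinations
from math import sqrt

def tajimas_d(seqs):
    """Tajima's D for aligned, equal-length sequences (no gaps or missing data)."""
    n = len(seqs)
    if n < 4 or len({len(s) for s in seqs}) != 1:
        raise ValueError("need at least 4 aligned sequences of equal length")
    # S: number of alignment columns with more than one allele.
    S = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if S == 0:
        return None  # D is undefined without segregating sites
    # pi: mean number of differences over all pairs of sequences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants from the standard variance derivation.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))

if __name__ == "__main__":
    # Toy data: every variant is a singleton, so D comes out negative.
    print(tajimas_d(["AAAA", "TAAA", "ATAA", "AATA", "AAAT"]))
```

    Negative values indicate an excess of rare variants, as expected under purifying selection or population expansion, while positive values indicate an excess of intermediate-frequency variants, as expected under balancing selection.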

    De novo assembly and characterization of leaf transcriptome for the development of functional molecular markers of the extremophile multipurpose tree species Prosopis alba

    Background: Prosopis alba (Fabaceae) is an important native tree adapted to arid and semiarid regions of north-western Argentina and is of great value as a multipurpose species. Despite its importance, the genomic resources currently available for the entire Prosopis genus are still limited. Here we describe the development of a leaf transcriptome and the identification of new molecular markers that could support functional genetic studies in natural and domesticated populations of this genus. Results: Next generation DNA pyrosequencing technology applied to P. alba transcripts produced a total of 1,103,231 raw reads with an average length of 421 bp. De novo assembly generated a set of 15,814 isotigs and 71,101 non-assembled sequences (singletons) with average lengths of 991 bp and 288 bp, respectively. A total of 39,000 unique singletons were identified after clustering natural and artificial duplicates from pyrosequencing reads. Regarding the non-redundant sequences or unigenes, 22,095 out of 54,814 were successfully annotated with Gene Ontology terms. Moreover, simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs) were searched, resulting in 5,992 and 6,236 markers, respectively, throughout the genome. For the validation of the predicted SSR markers, a subset of 87 SSRs selected through functional annotation evidence was successfully amplified from six DNA samples of seedlings. From this analysis, 11 of these 87 SSRs were identified as polymorphic. Additionally, another set of 123 nuclear polymorphic SSRs was determined in silico, of which 50% have the probability of being effectively polymorphic. Conclusions: This study generated a successful global analysis of the P. alba leaf transcriptome after bioinformatic and wet laboratory validations of RNA-Seq data. The limited set of molecular markers currently available will be significantly increased with the thousands of new markers that were identified in this study. This information will strongly contribute to genomic resources for P. alba functional analysis and genetics. Finally, it will also potentially contribute to the development of population-based genome studies in the genus.

    Affiliation: Torales, Susana. Instituto Nacional de Tecnología Agropecuaria. Centro de Investigación de Recursos Naturales. Instituto de Recursos Biológicos; Argentina
    Affiliation: Rivarola, Maximo Lisandro. Instituto Nacional de Tecnología Agropecuaria. Centro de Investigación en Ciencias Veterinarias y Agronómicas. Instituto de Biotecnología; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
    Affiliation: Pomponio, María Florencia. Instituto Nacional de Tecnología Agropecuaria. Centro de Investigación de Recursos Naturales. Instituto de Recursos Biológicos; Argentina
    Affiliation: González, Sergio Alberto. Instituto Nacional de Tecnología Agropecuaria. Centro de Investigación en Ciencias Veterinarias y Agronómicas. Instituto de Biotecnología; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
    Affiliation: Acuña, Cintia Vanesa. Instituto Nacional de Tecnología Agropecuaria. Centro de Investigación en Ciencias Veterinarias y Agronómicas. Instituto de Biotecnología; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
    Affiliation: Fernández, Paula del Carmen. Instituto Nacional de Tecnología Agropecuaria. Centro de Investigación en Ciencias Veterinarias y Agronómicas. Instituto de Biotecnología; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
    Affiliation: López Lauenstein, Diego. Instituto Nacional de Tecnología Agropecuaria. Centro de Investigaciones Agropecuarias. Instituto de Fisiología y Recursos Genéticos Vegetales; Argentina
    Affiliation: Verga, Aníbal Ramón. Instituto Nacional de Tecnología Agropecuaria. Centro de Investigaciones Agropecuarias. Instituto de Fisiología y Recursos Genéticos Vegetales; Argentina
    Affiliation: Hopp, Horacio Esteban. Instituto Nacional de Tecnología Agropecuaria. Centro de Investigación en Ciencias Veterinarias y Agronómicas. Instituto de Biotecnología; Argentina. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales; Argentina
    Affiliation: Paniego, Norma Beatriz. Instituto Nacional de Tecnología Agropecuaria. Centro de Investigación en Ciencias Veterinarias y Agronómicas. Instituto de Biotecnología; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
    Affiliation: Marcucci Poltri, Susana Noemí. Instituto Nacional de Tecnología Agropecuaria. Centro de Investigación en Ciencias Veterinarias y Agronómicas. Instituto de Biotecnología; Argentin

    FAST: FAST Analysis of Sequences Toolbox.

    FAST (FAST Analysis of Sequences Toolbox) provides simple, powerful open source command-line tools to filter, transform, annotate and analyze biological sequence data. Modeled after the GNU (GNU's Not Unix) Textutils such as grep, cut, and tr, FAST tools such as fasgrep, fascut, and fastr make it easy to rapidly prototype expressive bioinformatic workflows in a compact and generic command vocabulary. Compact combinatorial encoding of data workflows with FAST commands can simplify the documentation and reproducibility of bioinformatic protocols, supporting better transparency in biological data science. Interface self-consistency and conformity with conventions of GNU, Matlab, Perl, BioPerl, R, and GenBank help make FAST easy and rewarding to learn. FAST automates numerical, taxonomic, and text-based sorting, selection and transformation of sequence records and alignment sites based on content, index ranges, descriptive tags, annotated features, and in-line calculated analytics, including composition and codon usage. Automated content- and feature-based extraction of sites and support for molecular population genetic statistics make FAST useful for molecular evolutionary analysis. FAST is portable, easy to install and secure thanks to the relative maturity of its Perl and BioPerl foundations, with stable releases posted to CPAN. Development as well as a publicly accessible Cookbook and Wiki are available on the FAST GitHub repository at https://github.com/tlawrence3/FAST. The default data exchange format in FAST is Multi-FastA (specifically, a restriction of BioPerl FastA format). Sanger and Illumina 1.8+ FastQ formatted files are also supported. FAST makes it easier for non-programmer biologists to interactively investigate and control biological data at the speed of thought
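
    The grep-style record filtering that the abstract above attributes to tools such as fasgrep can be illustrated in a few lines of plain Python. This is a hedged analogue of the idea only: the function names `read_fasta` and `grep_records` are hypothetical, and the code neither uses nor reproduces FAST's own interface or options.

```python
import re

def read_fasta(lines):
    """Yield (description, sequence) pairs from an iterable of Multi-FastA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may be wrapped over several lines
    if header is not None:
        yield header, "".join(chunks)

def grep_records(records, pattern, on_sequence=False):
    """Keep records whose description (or sequence) matches a regular expression."""
    rx = re.compile(pattern)
    for header, seq in records:
        if rx.search(seq if on_sequence else header):
            yield header, seq

if __name__ == "__main__":
    text = ">seq1 kidney\nACGT\nAC\n>seq2 liver\nGGGG\n"
    for header, seq in grep_records(read_fasta(text.splitlines()), "liver"):
        print(">" + header)
        print(seq)
```

    Because both functions are generators, they can be chained in the same spirit as a Unix pipeline, which is the composability the abstract emphasizes.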