
199,093 research outputs found

In vitro identification and in silico utilization of interspecies sequence similarities using GeneChip® technology

Author: Garcia Joe GN
Grigoryev Dmitry N
Irizarry Rafael A
Ma Shwu-Fan
Simon Brett A
Ye Shui Q
Publication venue: BioMed Central
Publication date: 01/05/2005

BACKGROUND: Genomic approaches in large animal models (canine, ovine, etc.) are challenging due to insufficient genomic information for these species and the lack of corresponding microarray platforms. To address this problem, we speculated that conserved interspecies genetic sequences can be experimentally detected by cross-species hybridization. The probe redundancy of the Affymetrix platform offers flexibility in selecting individual probes with high sequence similarity between related species for gene expression analysis. RESULTS: Gene expression profiles of 40 canine samples were generated using the human HG-U133A GeneChip (U133A). Due to interspecies genetic differences, only 14 ± 2% of canine transcripts were detected by U133A probe sets, whereas profiling of 40 human samples detected 49 ± 6% of human transcripts. However, when we deconstructed these probe sets into individual probes and examined the performance of each probe, we found that 47% of human probes were able to find their targets in canine tissues and generate a detectable hybridization signal. Therefore, we restricted gene expression analysis to these probes and observed a 60% increase in the number of identified canine transcripts. These results were validated by comparing the transcripts identified by our restricted analysis of cross-species hybridization with transcripts identified by hybridization of total canine lung mRNA to the new Affymetrix Canine GeneChip®. CONCLUSION: Experimentally identifying probes with a detectable hybridization signal and restricting gene expression analysis to them drastically increases transcript detection in canine-human hybridization, suggesting the possibility of broad utilization of cross-hybridization between related species using GeneChip technology.
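
The probe-level restriction described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the detection rule (signal above a background value in a minimum fraction of samples), the function names, and the `min_fraction` parameter are all assumptions made for illustration.

```python
import numpy as np

def detectable_probe_mask(intensity, background, min_fraction=0.5):
    """Flag probes whose signal exceeds background in enough samples.

    intensity: array of shape (n_probes, n_samples).
    background: scalar or per-sample array (hypothetical detection threshold).
    min_fraction: assumed cutoff on the fraction of samples with detection.
    """
    intensity = np.asarray(intensity, dtype=float)
    detected = intensity > np.asarray(background, dtype=float)
    return detected.mean(axis=1) >= min_fraction

def restricted_probe_set_signal(intensity, probe_set_ids, mask):
    """Average, per probe set, only the probes retained by the mask.

    Probe sets with no retained probe are omitted from the result.
    """
    intensity = np.asarray(intensity, dtype=float)
    probe_set_ids = np.asarray(probe_set_ids)
    signals = {}
    for set_id in np.unique(probe_set_ids[mask]):
        rows = mask & (probe_set_ids == set_id)
        signals[str(set_id)] = intensity[rows].mean(axis=0)
    return signals

# Toy data: 4 probes in 2 probe sets, measured in 2 samples.
intensity = [[10, 12], [1, 2], [8, 1], [0, 0]]
probe_set_ids = ["a", "a", "b", "b"]
mask = detectable_probe_mask(intensity, background=5)
signals = restricted_probe_set_signal(intensity, probe_set_ids, mask)
```

In this toy case one probe per probe set is kept, so each probe-set signal is computed from the retained probe alone rather than being diluted by probes that never hybridize.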

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

DNAGPT: A Generalized Pre-trained Tool for Versatile DNA Sequence Analysis Tasks

Author: He Bing
Qin Chenchen
Yao Jianhua
Zhang Daoan
Zhang Jianguo
Zhang Weitong
Zhao Yu
Publication date: 30/08/2023

Pre-trained large language models demonstrate potential in extracting information from DNA sequences, yet adapting to a variety of tasks and data modalities remains a challenge. To address this, we propose DNAGPT, a generalized DNA pre-training model trained on over 200 billion base pairs from all mammals. By enhancing the classic GPT model with a binary classification task (DNA sequence order), a numerical regression task (guanine-cytosine content prediction), and a comprehensive token language, DNAGPT can handle versatile DNA analysis tasks while processing both sequence and numerical data. Our evaluation of genomic signal and region recognition, mRNA abundance regression, and artificial genome generation tasks demonstrates DNAGPT's superior performance compared to existing models designed for specific downstream tasks, benefiting from pre-training using the newly designed model structure.
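
The guanine-cytosine content named above as a regression target is simple to compute directly. The sketch below shows only that quantity, not anything from DNAGPT itself; the function name and the choice to skip ambiguous bases such as N are assumptions.

```python
def gc_content(sequence):
    """Return the fraction of G and C among the unambiguous bases (A, C, G, T).

    Ambiguous symbols such as N are skipped; this handling is an assumption.
    """
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T bases")
    return sum(base in "GC" for base in counted) / len(counted)

# Two of the four bases are G or C.
example = gc_content("ATGC")
```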

arXiv.org e-Print Archive

The neuropeptide transcriptome of a model echinoderm, the sea urchin Strongylocentrotus purpuratus

Author: Elphick MR
Rowe ML
Publication venue: Elsevier BV
Publication date: 01/12/2012

The work reported here was supported by a grant from the University of London Central Research Fun

Queen Mary Research Online

Local Binary Patterns as a Feature Descriptor in Alignment-free Visualisation of Metagenomic Data

Author: Kouchaki Samaneh
Robertson David L.
Tapinos Avraam
Tirunagari Santosh
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2017

Shotgun sequencing has facilitated the analysis of complex microbial communities. However, clustering and visualising these communities without prior taxonomic information is a major challenge. Feature descriptor methods can be utilised to extract these taxonomic relations from the data. Here, we present a novel approach consisting of local binary patterns (LBP) coupled with randomised singular value decomposition (RSVD) and Barnes-Hut t-stochastic neighbor embedding (BH-tSNE) to highlight the underlying taxonomic structure of the metagenomic data. The effectiveness of our approach is demonstrated using several simulated metagenomic datasets and one real metagenomic dataset.
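
As a rough illustration of the local binary pattern idea mentioned in this abstract, the sketch below computes a generic one-dimensional LBP code and its normalised histogram over a numeric signal. It is not the authors' descriptor: how reads are mapped to numeric signals, the `radius` parameter, and the function names are assumptions, and the RSVD and BH-tSNE steps are not shown.

```python
import numpy as np

def lbp_1d(signal, radius=2):
    """Compute a 1-D local binary pattern code for each interior position.

    Each of the 2*radius neighbours contributes one bit, set when the
    neighbour is greater than or equal to the centre value.
    """
    x = np.asarray(signal, dtype=float)
    n = len(x)
    if n < 2 * radius + 1:
        raise ValueError("signal is too short for the chosen radius")
    centre = x[radius:n - radius]
    offsets = [o for o in range(-radius, radius + 1) if o != 0]
    codes = np.zeros(n - 2 * radius, dtype=int)
    for bit, off in enumerate(offsets):
        neighbours = x[radius + off:n - radius + off]
        codes |= (neighbours >= centre).astype(int) << bit
    return codes

def lbp_histogram(signal, radius=2):
    """Return the normalised histogram of LBP codes, a fixed-length feature vector."""
    codes = lbp_1d(signal, radius)
    hist = np.bincount(codes, minlength=2 ** (2 * radius))
    return hist / hist.sum()

# Toy numeric signal; a real input would first need a sequence-to-number encoding.
codes = lbp_1d([1, 2, 3, 2, 1], radius=1)
features = lbp_histogram([1, 2, 3, 2, 1], radius=1)
```

The histogram has the same length for every input, which is what would make such a descriptor usable as input to a dimensionality-reduction step.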

Enlighten

Ks1, an epithelial cell-specific gene, responds to early signals of head formation in Hydra

Author: Bosch Thomas C. G.
David Charles N.
Salgado Luis M.
Weinziger Ruth
Publication date: 01/01/1994

As a molecular marker for head specification in Hydra, we have cloned an epithelial cell-specific gene which responds to early signals of head formation. The gene, designated ks1, encodes a 217-amino acid protein lacking significant sequence similarity to any known protein. KS1 contains an N-terminal signal sequence and is rich in charged residues which are clustered in several domains. ks1 is expressed in tentacle-specific epithelial cells (battery cells) as well as in a small fraction of ectodermal epithelial cells in the gastric region subjacent to the tentacles. Treatment with the protein kinase C activator 12-O-tetradecanoylphorbol-13-acetate (TPA) causes a rapid increase in the level of ks1 mRNA in head-specific epithelial cells and also induces ectopic ks1 expression in cells of the gastric region. Sequence elements in the 5′-flanking region of ks1 that are related to TPA-responsive elements may mediate the TPA inducibility of ks1 expression. The pattern of expression of ks1 suggests that a ligand-activated diacylglycerol second messenger system is involved in head-specific differentiation.

Open Access LMU


The Sorghum bicolor reference genome: improved assembly, gene annotations, a transcriptome atlas, and signatures of genome organization.

Author: Amirebrahimi Mojgan
Grimwood Jane
Jenkins Jerry
Kennedy Megan
Mattison Ashley
McCormick Ryan F
McKinley Brian
Morishige Daryl T
Mullet John E
Schmutz Jeremy
Shu Shengqiang
Sims David
Sreedasyam Avinash
Truong Sandra K
Weers Brock D
Publication venue: eScholarship, University of California
Publication date: 01/01/2018

Sorghum bicolor is a drought-tolerant C4 grass used for the production of grain, forage, sugar, and lignocellulosic biomass, and it serves as a genetic model for C4 grasses due to its relatively small genome (approximately 800 Mbp), diploid genetics, diverse germplasm, and colinearity with other C4 grass genomes. In this study, deep sequencing, genetic linkage analysis, and transcriptome data were used to produce and annotate a high-quality reference genome sequence. Reference genome sequence order was improved, 29.6 Mbp of additional sequence was incorporated, the number of genes annotated increased 24% to 34 211, average gene length and N50 increased, and error frequency was reduced 10-fold to 1 per 100 kbp. Subtelomeric repeats with characteristics of Tandem Repeats in Miniature (TRIM) elements were identified at the termini of most chromosomes. Nucleosome occupancy predictions identified nucleosomes positioned immediately downstream of transcription start sites and at different densities across chromosomes. Alignment of more than 50 resequenced genomes from diverse sorghum genotypes to the reference genome identified approximately 7.4 M single nucleotide polymorphisms (SNPs) and 1.9 M indels. Large-scale variant features in euchromatin were identified with periodicities of approximately 25 kbp. A transcriptome atlas of gene expression was constructed from 47 RNA-seq profiles of growing and developed tissues of the major plant organs (roots, leaves, stems, panicles, and seed) collected during the juvenile, vegetative and reproductive phases. Analysis of the transcriptome data indicated that tissue type and protein kinase expression had large influences on transcriptional profile clustering. The updated assembly, annotation, and transcriptome data represent a resource for C4 grass research and crop improvement.
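
For readers unfamiliar with the N50 statistic mentioned in this abstract, the sketch below computes it from a list of contig or scaffold lengths using the common definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly. The function name and toy lengths are illustrative only and are not taken from the sorghum assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    at which the running total first reaches half of the summed length.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length

# Toy assembly: total length 54, so the target is 27 (10 + 9 + 8).
example_n50 = n50([2, 3, 4, 5, 6, 7, 8, 9, 10])
```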

eScholarship - University of California