56 research outputs found

    DDBJ Activities: Contribution to the Research in Information Biology

    DDBJ (DNA Data Bank of Japan; http://www.ddbj.nig.ac.jp/) started its database activities in 1986. From the beginning, DDBJ has been a member of the INSDC (International Nucleotide Sequence Database Collaboration; http://www.insdc.org/), a tripartite collaboration with EMBL-Bank/EBI and GenBank/NCBI. The total number of bases in the primary nucleotide sequence data collected and distributed by INSDC exceeded 100 Gbases in August 2005. It then took only three years for that total to double (200 Gbases). The collaboration is now being expanded to Traces (DNA sequence chromatograms) and Short Reads (raw read data from 454, Solexa, SOLiD, etc.). DDBJ is also collecting and releasing gene expression data at CIBEX (Center for Information Biology gene EXpression database; http://cibex.nig.ac.jp/). Furthermore, DDBJ contributed to international annotation jamborees such as FANTOM (mouse), H-Inv (human), RAP (rice) and E. coli K12. DDBJ provides many services for research in information biology or bioinformatics. They include Web-API for Biology (WABI; http://www.xml.nig.ac.jp/) and All-round Retrieval of Sequence and Annotation (ARSA; http://arsa.ddbj.nig.ac.jp/). These activities are presented with the perspective of DDBJ in the coming years

    Characterization of TRPA channels in the starfish Patiria pectinifera: involvement of thermally activated TRPA1 in thermotaxis in marine planktonic larvae.

    The vast majority of marine invertebrates spend their larval period as pelagic plankton and are exposed to various environmental cues. Here we investigated the thermotaxis behaviors of the bipinnaria larvae of the starfish, Patiria pectinifera, in association with TRPA ion channels that serve as thermal receptors in various animal species. Using a newly developed thermotaxis assay system, we observed that P. pectinifera larvae displayed positive thermotaxis toward high temperatures, including toward temperatures high enough to cause death. In parallel, we identified two TRPA genes, termed PpTRPA1 and PpTRPA basal, from this species. We examined the phylogenetic position, spatial expression, and channel properties of each PpTRPA. Our results revealed the following: (1) The two genes diverged early in animal evolution; (2) PpTRPA1 and PpTRPA basal are expressed in the ciliary band and posterior digestive tract of the larval body, respectively; and (3) PpTRPA1 is activated by heat stimulation as well as by known TRPA1 agonists. Moreover, knockdown and rescue experiments demonstrated that PpTRPA1 is involved in positive thermotaxis in P. pectinifera larvae. This is the first report to reveal that TRPA1 channels regulate the behavioral response of a marine invertebrate to temperature changes during its planktonic larval period

    H-DBAS: Alternative splicing database of completely sequenced and manually annotated full-length cDNAs based on H-Invitational

    The Human-transcriptome DataBase for Alternative Splicing (H-DBAS) is a specialized database of alternatively spliced human transcripts. In this database, each of the alternative splicing (AS) variants corresponds to a completely sequenced and carefully annotated human full-length cDNA, one of those collected for the H-Invitational human-transcriptome annotation meeting. H-DBAS contains a total of 38 664 representative alternative splicing variants (RASVs) in 11 744 loci. The data are retrievable by various manually annotated features of AS, such as AS patterns, the resulting alterations in the encoded amino acids and affected protein motifs, GO terms, predicted subcellular localization signals and transmembrane domains. The database also records recently identified very complex patterns of AS, in which two distinct genes seemed to be bridged, nested or degenerated (multiple CDS): in all three cases, completely unrelated proteins are encoded by a single locus. By using AS Viewer, each AS event can be analyzed in the context of full-length cDNAs, enabling the user's empirical understanding of the relation between an AS event and the consequent alterations in the encoded amino acid sequences together with various kinds of affected protein motifs. H-DBAS is accessible at

    mtDNA diversity of the Zapotec in Mexico suggests a population decline long before the first contact with Europeans.

    The New World is the last continent colonized by anatomically modern humans, Homo sapiens. The first migrants entered the New World from Asia through Beringia. It is suggested that there were three streams of Asian gene flow, one major and two additional minor gene flows. The first major migrants took a Pacific coastal route and began spreading to the American continent before the opening of the ice-free corridor. We investigated the diversity of full-length mitochondrial DNA genomes of the Zapotec population, residing in the Mesoamerican region, and reconstructed their demographic history using Bayesian Skyline Plots. We estimated the initial date of gene flow into the New World by Zapotec ancestors at around 17 000–19 000 years ago, which is highly concordant with previous studies. We also show a population decline after the initial expansion. This decline started 4000 years ago, long before European contact with Native Americans. This indicates that other factors, including climate change, should be considered to explain the observed demographic pattern

    DDBJ launches a new archive database with analytical tools for next-generation sequence data

    The DNA Data Bank of Japan (DDBJ) (http://www.ddbj.nig.ac.jp) has collected and released 1 701 110 entries/1 116 138 614 bases between July 2008 and June 2009. A few highlighted data releases from DDBJ were the complete genome sequence of an endosymbiont within protist cells in the termite gut and Cap Analysis Gene Expression tags for human and mouse deposited from the Functional Annotation of the Mammalian cDNA consortium. In this period, we started a novel user announcement service using Really Simple Syndication (RSS) to deliver a list of data released from DDBJ on a daily basis. Comprehensive visualization of DDBJ release data was attempted by using a word cloud program. Moreover, a new archive for sequencing data from next-generation sequencers, the ‘DDBJ Read Archive’ (DRA), was launched. Concurrently, for read data registered in DRA, a semi-automatic annotation tool called the ‘DDBJ Read Annotation Pipeline’ was released as a preliminary step. The pipeline consists of two parts: basic analysis for reference genome mapping and de novo assembly and high-level analysis of structural and functional annotations. These new services will aid users’ research and provide easier access to DDBJ databases

    H-DBAS: human-transcriptome database for alternative splicing: update 2010

    H-DBAS (http://h-invitational.jp/h-dbas/) is a specialized database for human alternative splicing (AS) based on H-Invitational full-length cDNAs. In this update, for better annotations of AS events, we correlated RNA-Seq tag information to the AS exons and splice junctions. We generated a total of 148 376 598 RNA-Seq tags from RNAs extracted from cytoplasmic, nuclear and polysome fractions. Analysis of the RNA-Seq tags allowed us to identify 90 900 exons that are very likely to be used for protein synthesis. On the other hand, 254 AS junctions of human RefSeq transcripts are unique to nuclear RNA and may not have any translational consequences. We also present a new comparative genomics viewer so that users can empirically understand the evolutionary turnover of AS. With the unique experimental data closely connected with intensively curated cDNA information, H-DBAS provides a unique platform for the analysis of complex AS

    Curated genome annotation of Oryza sativa ssp. japonica and comparative genome analysis with Arabidopsis thaliana

    We present here the annotation of the complete genome of rice Oryza sativa L. ssp. japonica cultivar Nipponbare. All functional annotations for proteins and non-protein-coding RNA (npRNA) candidates were manually curated. Functions were identified or inferred in 19,969 (70%) of the proteins, and 131 possible npRNAs (including 58 antisense transcripts) were found. Almost 5000 annotated protein-coding genes were found to be disrupted in insertional mutant lines, which will accelerate future experimental validation of the annotations. The rice loci were determined by using cDNA sequences obtained from rice and other representative cereals. Our conservative estimate based on these loci and an extrapolation suggested that the gene number of rice is ~32,000, which is smaller than previous estimates. We conducted comparative analyses between rice and Arabidopsis thaliana and found that both genomes possessed several lineage-specific genes, which might account for the observed differences between these species, while they had similar sets of predicted functional domains among the protein sequences. A system to control translational efficiency seems to be conserved across large evolutionary distances. Moreover, the evolutionary process of protein-coding genes was examined. Our results suggest that natural selection may have played a role for duplicated genes in both species, so that duplication was suppressed or favored in a manner that depended on the function of a gene

    H-InvDB in 2009: extended database and data mining resources for human genes and transcripts

    We report the extended database and data mining resources newly released in the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/). H-InvDB is a comprehensive annotation resource of human genes and transcripts, and consists of two main views and six sub-databases. The latest release of H-InvDB (release 6.2) provides the annotation for 219 765 human transcripts in 43 159 human gene clusters based on human full-length cDNAs and mRNAs. H-InvDB now provides several new annotation features, such as mapping of microarray probes, new gene models, relation to known ncRNAs and information from the Glycogene database. H-InvDB also provides useful data mining resources: ‘Navigation search’, ‘H-InvDB Enrichment Analysis Tool (HEAT)’ and web service APIs. ‘Navigation search’ is an extended search system that enables complicated searches by combining 16 different search options. HEAT is a data mining tool for automatically identifying features specific to a given human gene set. HEAT searches for H-InvDB annotations that are significantly enriched in a user-defined gene set, as compared with the entire H-InvDB representative transcripts. H-InvDB now has web service APIs of SOAP and REST to allow the use of H-InvDB data in programs, providing the users extended data accessibility

    Large-scale identification and characterization of alternative splicing variants of human gene transcripts using 56 419 completely sequenced and manually annotated full-length cDNAs

    We report the first genome-wide identification and characterization of alternative splicing in human gene transcripts based on analysis of the full-length cDNAs. Applying both manual and computational analyses for 56 419 completely sequenced and precisely annotated full-length cDNAs selected for the H-Invitational human transcriptome annotation meetings, we identified 6877 alternative splicing genes with 18 297 different alternative splicing variants. A total of 37 670 exons were involved in these alternative splicing events. The encoded protein sequences were affected in 6005 of the 6877 genes. Notably, alternative splicing affected protein motifs in 3015 genes, subcellular localizations in 2982 genes and transmembrane domains in 1348 genes. We also identified interesting patterns of alternative splicing, in which two distinct genes seemed to be bridged, nested or having overlapping protein coding sequences (CDSs) of different reading frames (multiple CDS). In these cases, completely unrelated proteins are encoded by a single locus. Genome-wide annotations of alternative splicing, relying on full-length cDNAs, should lay firm groundwork for exploring in detail the diversification of protein function, which is mediated by the fast expanding universe of alternative splicing variants

    Multiple Loci within the Major Histocompatibility Complex Confer Risk of Psoriasis

    Psoriasis is a common inflammatory skin disease characterized by thickened scaly red plaques. Previously we have performed a genome-wide association study (GWAS) on psoriasis with 1,359 cases and 1,400 controls, which were genotyped for 447,249 SNPs. The most significant finding was for SNP rs12191877, which is in tight linkage disequilibrium with HLA-Cw*0602, the consensus risk allele for psoriasis. However, it is not known whether there are other psoriasis loci within the MHC in addition to HLA-C. In the present study, we searched for additional susceptibility loci within the human leukocyte antigen (HLA) region through in-depth analyses of the GWAS data; then, we followed up our findings in an independent Han Chinese sample of 1,139 psoriasis cases and 1,132 controls. Using the phased CEPH dataset as a reference, we imputed the HLA-Cw*0602 in all samples with high accuracy. The association of the imputed HLA-Cw*0602 dosage with disease was much stronger than that of the most significantly associated SNP, rs12191877. Adjusting for HLA-Cw*0602, there were two remaining association signals: one demonstrated by rs2073048 (p = 2×10⁻⁶, OR = 0.66), located within C6orf10, a potential downstream effector of TNF-alpha, and one indicated by rs13437088 (p = 9×10⁻⁶, OR = 1.3), located 30 kb centromeric of HLA-B and 16 kb telomeric of MICA. When HLA-Cw*0602, rs2073048, and rs13437088 were all included in a logistic regression model, each of them was significantly associated with disease (p = 3×10⁻⁴⁷, 6×10⁻⁸, and 3×10⁻⁷, respectively). Both putative loci were also significantly associated in the Han Chinese samples after controlling for the imputed HLA-Cw*0602. A detailed analysis of HLA-B in both populations demonstrated that HLA-B*57 was associated with an increased risk of psoriasis and HLA-B*40 a decreased risk, independently of HLA-Cw*0602 and the C6orf10 locus, suggesting the potential pathogenic involvement of HLA-B. These results demonstrate that there are at least two additional loci within the MHC conferring risk of psoriasis