115 research outputs found

    Revealing the missing expressed genes beyond the human reference genome by RNA-Seq

    Background: A complete and accurate human reference genome is important for functional genomics research. Gaps in the reference genome and individual-specific sequences can therefore significantly affect a wide range of studies. Results: We used two RNA-Seq datasets, from human brain tissues and from 10 mixed cell lines, to investigate the completeness of the human reference genome. First, we demonstrated that of the previously identified ~5 Mb of Asian and ~5 Mb of African novel sequences absent from the human reference genome of NCBI build 36, ~211 kb and ~201 kb, respectively, could be transcribed. Our results suggest that many of those transcribed regions are not specific to Asian and African individuals but are also present in Caucasians. Then, we found that 104 RefSeq genes that are unalignable to NCBI build 37 are expressed at more than 0.1 RPKM in brain and cell lines. Fifty-five of them are conserved across human, chimpanzee and macaque, suggesting that a significant number of functional human genes are still absent from the human reference genome. Moreover, we identified hundreds of novel transcript contigs that cannot be aligned to NCBI build 37, RefSeq genes or EST sequences. Some of those novel transcript contigs are also conserved among human, chimpanzee and macaque. By positioning those contigs onto the human genome, we identified several large deletions in the reference genome. Several conserved novel transcript contigs were further validated by RT-PCR. Conclusion: Our findings demonstrate that a significant number of genes are still absent from the incomplete human reference genome, highlighting the importance of further refining the human reference genome and curating those missing genes. Our study also shows the importance of de novo transcriptome assembly. The comparative approach between the reference genome and other related human genomes based on the transcriptome provides an alternative way to refine the human reference genome.

    Skipping of Chinese characters does not rely on word-based processing

    © 2017 The Psychonomic Society, Inc. Previous eye-movement studies have indicated that people tend to skip extremely high-frequency words in sentence reading, such as “the” in English and “的/de” in Chinese. Two alternative hypotheses have been proposed to explain how this frequent skipping happens in Chinese reading: one assumes that skipping happens when the preview has been fully identified at the word level (word-based skipping); the other assumes that skipping happens whenever the preview character is easy to identify, regardless of whether lexical processing has been completed (character-based skipping). Using the gaze-contingent display change paradigm, we examined the two hypotheses by substituting the preview of the third character of a four-character Chinese word with the high-frequency Chinese character “的/de”, which should disrupt the ongoing word-level processing. The character-based skipping hypothesis predicts that this manipulation will increase the skipping probability of the target character (i.e., the third character of the target word), because the character “的/de” has a much higher character frequency than the original character. The word-based skipping hypothesis instead predicts a reduction in the skipping probability of the target character, because the presence of the character “的/de” is lexically infelicitous at the word level. The results supported the character-based skipping hypothesis, indicating that in Chinese reading the decision to skip a character can be made before integrating it into a word.

    Synergistic Effect of SRY and Its Direct Target, WDR5, on Sox9 Expression

    SRY is a sex-determining gene that encodes a transcription factor, which triggers male development in most mammals. The molecular mechanism of SRY action in testis determination is, however, poorly understood. In this study, we demonstrate that WDR5, which encodes a WD-40 repeat protein, is a direct target of SRY. EMSA experiments and ChIP assays showed that SRY could bind to the WDR5 gene promoter directly. Overexpression of SRY in LNCaP cells significantly increased WDR5 expression, concurrent with histone H3K4 methylation on the WDR5 promoter. To specifically address whether SRY contributes to WDR5 regulation, we introduced a 4-hydroxy-tamoxifen-inducible SRY allele into LNCaP cells. Conditional SRY expression triggered enrichment of SRY on the WDR5 promoter, resulting in induction of WDR5 transcription. We found that WDR5 was self-regulating through a positive feedback loop. WDR5 and SRY interacted and were colocalized in cells. In addition, the interaction of WDR5 with SRY resulted in activation of Sox9 while repressing the expression of β-catenin. These results suggest that, in conjunction with SRY, WDR5 plays an important role in sex determination.

    A heterozygous moth genome provides insights into herbivory and detoxification

    How an insect evolves to become a successful herbivore is of profound biological and practical importance. Herbivores are often adapted to feed on a specific group of evolutionarily and biochemically related host plants [1], but the genetic and molecular bases for adaptation to plant defense compounds remain poorly understood [2]. We report the first whole-genome sequence of a basal lepidopteran species, Plutella xylostella, which contains 18,071 protein-coding and 1,412 unique genes with an expansion of gene families associated with perception and the detoxification of plant defense compounds. A recent expansion of retrotransposons near detoxification-related genes and a wider system used in the metabolism of plant defense compounds are shown to also be involved in the development of insecticide resistance. This work shows the genetic and molecular bases for the evolutionary success of this worldwide herbivore and offers wider insights into insect adaptation to plant feeding, as well as opening avenues for more sustainable pest management. Minsheng You … Simon W Baxter … et al.

    The oyster genome reveals stress adaptation and complexity of shell formation

    The Pacific oyster Crassostrea gigas belongs to one of the most species-rich but genomically poorly explored phyla, the Mollusca. Here we report the sequencing and assembly of the oyster genome using short reads and a fosmid-pooling strategy, along with transcriptomes of development and stress response and the proteome of the shell. The oyster genome is highly polymorphic and rich in repetitive sequences, with some transposable elements still actively shaping variation. Transcriptome studies reveal an extensive set of genes responding to environmental stress. The expansion of genes coding for heat shock protein 70 and inhibitors of apoptosis is probably central to the oyster's adaptation to sessile life in the highly stressful intertidal zone. Our analyses also show that shell formation in molluscs is more complex than currently understood and involves extensive participation of cells and their exosomes. The oyster genome sequence fills a void in our understanding of the Lophotrochozoa. © 2012 Macmillan Publishers Limited. All rights reserved

    Improving Estimations of Spatial Distribution of Soil Respiration Using the Bayesian Maximum Entropy Algorithm and Soil Temperature as Auxiliary Data

    This study was supported by the NSF China Programs (Grant Nos. 31300539 and 31570629) and the Public Welfare Technology Application Research Program of Zhejiang province (Grant No. 2015C31004). Soil respiration inherently shows strong spatial variability. It is difficult to obtain an accurate characterization of soil respiration with an insufficient number of monitoring points. However, it is expensive and cumbersome to deploy many sensors. To solve this problem, we proposed employing the Bayesian Maximum Entropy (BME) algorithm, using soil temperature as auxiliary information, to study the spatial distribution of soil respiration. The BME algorithm used the soft data (auxiliary information) effectively to improve the estimation accuracy of the spatiotemporal distribution of soil respiration. Based on the functional relationship between soil temperature and soil respiration, the BME algorithm satisfactorily integrated soil temperature data into the estimated spatial distribution. As a means of comparison, we also applied the Ordinary Kriging (OK) and Co-Kriging (Co-OK) methods. The results indicated that the root mean squared errors (RMSEs) and absolute values of bias for both Day 1 and Day 2 were the lowest for the BME method, thus demonstrating its higher estimation accuracy. Further, we compared the performance of the BME algorithm coupled with auxiliary information, namely soil temperature data, and the OK method without auxiliary information in the same study area for 9, 21, and 37 sampled points. The results showed that the RMSEs for the BME algorithm (0.972 and 1.193) were less than those for the OK method (1.146 and 1.539) when the number of sampled points was 9 and 37, respectively. This indicates that the former method, using auxiliary information, could reduce the number of sampling points required for studying the spatial distribution of soil respiration. Thus, the BME algorithm, coupled with soil temperature data, can not only improve the accuracy of soil respiration spatial interpolation but can also reduce the number of sampling points.