62 research outputs found

    Gene-Based Association Tests Using New Polygenic Risk Scores and Incorporating Gene Expression Data

    Recently, gene-based association studies have shown that integrating genome-wide association studies (GWAS) with expression quantitative trait locus (eQTL) data can boost statistical power and that the genetic liability of traits can be captured by polygenic risk scores (PRSs). In this paper, we propose a new gene-based statistical method that leverages gene-expression measurements and new PRSs to identify genes that are associated with phenotypes of interest. We used a generalized linear model to associate phenotypes with gene expression and PRSs and used a score-test statistic to test the association between phenotypes and genes. Our simulation studies show that the newly developed method has correct type I error rates and can boost statistical power compared with other methods that use either gene expression or PRS in association tests. A real data analysis based on UK Biobank data for asthma shows that the proposed method is applicable to GWAS.

    Integrating External Controls by Regression Calibration for Genome-Wide Association Study

    Genome-wide association studies (GWAS) have successfully revealed many disease-associated genetic variants. For a case-control study, adequate power of an association test can be achieved with a large sample size, although genotyping large samples is expensive. A cost-effective strategy to boost power is to integrate external control samples with publicly available genotyped data. However, the naive integration of external controls may inflate the type I error rates if the systematic differences (batch effect) between studies are ignored, such as differences in sequencing platforms, genotype-calling procedures, population stratification, and so forth. To account for the batch effect, we propose an approach that integrates External Controls into the Association Test by Regression Calibration (iECAT-RC) in case-control association studies. Extensive simulation studies show that iECAT-RC not only can control type I error rates but also can boost statistical power in all models. We also apply iECAT-RC to the UK Biobank data for M72 Fibroblastic disorders by considering genotype calling as the batch effect. Four SNPs associated with fibroblastic disorders have been detected by iECAT-RC and the other two comparison methods, iECAT-Score and Internal. However, our method has a higher probability of identifying these significant SNPs in the scenario of an unbalanced case-control association study.

    Children’s Non-symbolic and Symbolic Numerical Representations and Their Associations With Mathematical Ability

    Most empirical evidence supports the view that non-symbolic and symbolic representations are foundations for advanced mathematical ability. However, the detailed developmental trajectories of these two types of representations in childhood are not very clear, nor are the different effects of non-symbolic and symbolic representations on the development of mathematical ability. We assessed 253 4- to 8-year-old children’s non-symbolic and symbolic numerical representations, mapping skills, and mathematical ability, aiming to investigate the developmental trajectories of these skills and the associations between them. Our results showed that non-symbolic numerical representation emerged earlier than the symbolic one. Four-year-olds were capable of non-symbolic comparisons but not symbolic comparisons; five-year-olds performed better at non-symbolic comparisons than symbolic comparisons. This performance difference disappeared at age 6. Children at age 6 or older were able to map between symbolic and non-symbolic quantities. However, as children learned more about the symbolic representation system, their advantage in non-symbolic representation disappeared. Path analyses revealed a direct effect of children’s symbolic numerical skills on their math performance, and an indirect effect of non-symbolic numerical skills on math performance via symbolic skills. These results suggest that symbolic numerical skills are a predominant factor affecting math performance in early childhood. However, the influences of symbolic and non-symbolic numerical skills on mathematical performance both decline with age.

    Exploiting Multiple Embeddings for Chinese Named Entity Recognition

    Identifying the named entities mentioned in text would enrich many semantic applications at the downstream level. However, due to the predominant usage of colloquial language in microblogs, named entity recognition (NER) in Chinese microblogs experiences significant performance deterioration compared with NER on formal Chinese corpora. In this paper, we propose a simple yet effective neural framework, named ME-CNER, to derive character-level embeddings for NER in Chinese text. A character embedding is derived with rich semantic information harnessed at multiple granularities, ranging from the radical and character levels to the word level. The experimental results demonstrate that the proposed approach achieves a large performance improvement on the Weibo dataset and comparable performance on the MSRA news dataset, with lower computational cost than the existing state-of-the-art alternatives. Comment: accepted at CIKM 201

    Loss‐of‐Function Genetic Screening Identifies Aldolase A as an Essential Driver for Liver Cancer Cell Growth Under Hypoxia

    Background and aims: Hypoxia is a common feature of the tumor microenvironment (TME), which promotes tumor progression, metastasis, and therapeutic drug resistance through a myriad of cell activities in tumor and stroma cells. While targeting the hypoxic TME is emerging as a promising strategy for treating solid tumors, preclinical development of this approach is lacking in the study of hepatocellular carcinoma (HCC). Approach and results: From a genome-wide CRISPR/CRISPR-associated 9 gene knockout screening, we identified aldolase A (ALDOA), a key enzyme in glycolysis and gluconeogenesis, as an essential driver for HCC cell growth under hypoxia. Knockdown of ALDOA in HCC cells leads to lactate depletion and consequently inhibits tumor growth. Supplementation with lactate partly rescues the inhibitory effects mediated by ALDOA knockdown. Upon hypoxia, ALDOA is induced by hypoxia-inducible factor-1α and by fat mass and obesity-associated protein-mediated N6-methyladenosine modification through transcriptional and posttranscriptional regulation, respectively. Analysis of The Cancer Genome Atlas shows that elevated levels of ALDOA are significantly correlated with poor prognosis of patients with HCC. In a screen of Food and Drug Administration-approved drugs based on structured hierarchical virtual platforms, we identified the sulfamonomethoxine derivative compound 5 (cpd-5) as a potential inhibitor to target ALDOA, evidenced by the antitumor activity of cpd-5 in preclinical patient-derived xenograft models of HCC. Conclusions: Our work identifies ALDOA as an essential driver for HCC cell growth under hypoxia, and we demonstrate that inhibition of ALDOA in the hypoxic TME is a promising therapeutic strategy for treating HCC.

    RNA-binding protein RALY reprogrammes mitochondrial metabolism via mediating miRNA processing in colorectal cancer

    Objective: Dysregulated cellular metabolism is a distinct hallmark of human colorectal cancer (CRC). However, metabolic programme rewiring during tumour progression has yet to be fully understood. Design: We analysed altered gene signatures during colorectal tumour progression, and used a combination of molecular and metabolic assays to study the regulation of metabolism in CRC cell lines, human patient-derived xenograft mouse models and tumour organoid models. Results: We identified a novel RNA-binding protein, RALY (also known as hnRNPCL2), that is highly associated with colorectal tumour aggressiveness. RALY acts as a key regulatory component in the Drosha complex, and promotes the post-transcriptional processing of a specific subset of miRNAs (miR-483, miR-676 and miR-877). These miRNAs systematically downregulate the expression of the metabolism-associated genes (ATP5I, ATP5G1, ATP5G3 and CYC1) and thereby reprogramme mitochondrial metabolism in the cancer cell. Analysis of The Cancer Genome Atlas (TCGA) reveals that increased levels of RALY are associated with poor prognosis in patients with CRC expressing low levels of mitochondrion-associated genes. Mechanistically, induced processing of these miRNAs is facilitated by their N6-methyladenosine switch under reactive oxygen species (ROS) stress. Inhibition of the m6A methylation abolishes the RALY recognition of the terminal loop of the pri-miRNAs. Knockdown of RALY inhibits colorectal tumour growth and progression in vivo and in organoid models. Conclusions: Collectively, our results reveal a critical metabolism-centric role of RALY in tumour progression, which may lead to cancer therapeutics targeting RALY for treating CRC.

    STATISTICAL METHODS FOR CONTROLLING POPULATION STRATIFICATION AND GENE-BASED ASSOCIATION STUDIES

    This dissertation includes three papers, one per chapter. In chapter 1, we use extensive simulation studies and real data studies to evaluate the performance of linkage disequilibrium score regression (LDSC) for controlling population stratification. In chapter 2, we propose a gene-based statistical method that leverages gene expression (GE) measurements and polygenic risk scores (PRS) to identify genes that are associated with a phenotype of interest. In simulation studies, the proposed method has correct type I error rates and can boost power compared with other methods that use either gene expression or PRS in association tests. The real data analysis based on UK Biobank data for asthma shows that the proposed method is also applicable to GWAS. In chapter 3, we analytically derive the distribution of TOW test statistics and modify TOW to utilize GWAS summary statistics (TOW-S). Simulation studies show that TOW-S has correct type I error rates and can retain power across all scenarios.

    Control for population stratification in genetic association studies based on GWAS summary statistics

    Over the past years, genome-wide association studies (GWAS) have generated a wealth of new information. Summary data from many GWAS are now publicly available, promoting the development of many statistical methods for association studies based on GWAS summary statistics, which avoid the increasing challenges associated with sharing individual-level genotype and phenotype data. However, for population-based association studies such as GWAS, it has long been recognized that population stratification can seriously confound association results. In large GWAS, population stratification and cryptic relatedness are very likely to be present, which will result in inflated type I error in association testing. Although many methods have been developed to control for population stratification, only two of these approaches can be used without individual-level data: one is based on genomic control (GC) and the other is based on linkage disequilibrium score regression (LDSC). However, the performance of these two approaches is currently unknown. In this study, we use extensive simulation studies, including populations with subpopulations, spatially structured populations, and populations with cryptic relatedness, to compare the performance of these two approaches in controlling for population stratification using only GWAS summary statistics. Data sets from the Genetic Analysis Workshop 19 and UK Biobank are also used to evaluate these two approaches. We demonstrate that the intercept of LDSC can be used as a more accurate correction factor than GC. The results from this study will provide very useful information for researchers using GWAS summary statistics while trying to control for population stratification.
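The genomic control (GC) correction named in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: it assumes the inputs are 1-degree-of-freedom chi-square association statistics, and the function names (`gc_lambda`, `apply_correction`) and the example values are hypothetical. The same division step could take an LDSC intercept as the correction factor, but estimating that intercept is outside the scope of this sketch.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def gc_lambda(chi2_stats):
    """Genomic control inflation factor: observed median / expected median."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def apply_correction(chi2_stats, factor):
    """Divide each statistic by a correction factor.

    Factors below 1 are clamped to 1 so statistics are never inflated.
    """
    factor = max(factor, 1.0)
    return [s / factor for s in chi2_stats]


# Hypothetical test statistics, inflated on purpose for illustration.
stats = [0.2, 0.6, 0.9098728, 1.5, 4.0]
lam = gc_lambda(stats)
corrected = apply_correction(stats, lam)
```

After correction, the median of `corrected` matches the expected null median, which is the defining property of GC. A correction factor obtained elsewhere (for example, an LDSC intercept supplied by the user) can be passed to `apply_correction` in place of `lam`.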

    A study on the mechanical behaviour of mixed fiber-reinforced soil

    The mixed-fiber reinforcement technique has been widely used to reinforce concrete because of its excellent performance in enhancing strength, durability and stiffness. The improvement that fiber-reinforced soil (FRS) offers in mechanical behaviour (e.g., stiffness and ductility) is limited by the properties of a single fiber type. However, most studies focus on soil reinforced with a single fiber type, and soil mixed with two or more fiber types has not been covered. Mixed fiber-reinforced soil (MFRS) is defined as an improvement on FRS in which two different types of fibers are mixed into the soil to improve the shear strength and stiffness simultaneously. The tests were conducted using clay soil and Yongjiang sand, which were collected from a metro construction site and the Yongjiang River, respectively. The test results show that under the same test conditions (e.g., void ratio, confining pressure and fiber content), the MFRS always shows higher deviator stress than the FRS, because carbon fibers and polypropylene fibers give higher stiffness and tensile strength, respectively. In addition, the friction angle and cohesion of MFRS are affected by fiber content. It is concluded that MFRS with half the carbon fiber content has the same performance in the stress-strain relationship as FRS with 100% carbon fiber content. However, the effectiveness of the reinforcement in clay soil is insignificant, because the reinforcement depends on the characteristics of the clay soil (e.g., mean diameter, coefficient of uniformity). The use of MFRS could decrease the cost of purchasing special fibers such as pure carbon fiber and aramid fiber, which are expensive, and MFRS could be used in long-term construction projects that are required to last at least 100 years.