Gene-Based Association Tests Using New Polygenic Risk Scores and Incorporating Gene Expression Data
Recently, gene-based association studies have shown that integrating genome-wide association studies (GWAS) with expression quantitative trait locus (eQTL) data can boost statistical power and that the genetic liability of traits can be captured by polygenic risk scores (PRSs). In this paper, we propose a new gene-based statistical method that leverages gene-expression measurements and new PRSs to identify genes that are associated with phenotypes of interest. We used a generalized linear model to associate phenotypes with gene expression and PRSs and used a score-test statistic to test the association between phenotypes and genes. Our simulation studies show that the newly developed method has correct type I error rates and can boost statistical power compared with other methods that use either gene expression or PRS in association tests. A real data analysis based on UK Biobank data for asthma shows that the proposed method is applicable to GWAS
Integrating External Controls by Regression Calibration for Genome-Wide Association Study
Genome-wide association studies (GWAS) have successfully revealed many disease-associated genetic variants. For a case-control study, the adequate power of an association test can be achieved with a large sample size, although genotyping large samples is expensive. A cost-effective strategy to boost power is to integrate external control samples with publicly available genotyped data. However, the naive integration of external controls may inflate the type I error rates if it ignores the systematic differences (batch effect) between studies, such as the differences in sequencing platforms, genotype-calling procedures, population stratification, and so forth. To account for the batch effect, we propose an approach that integrates External Controls into the Association Test by Regression Calibration (iECAT-RC) in case-control association studies. Extensive simulation studies show that iECAT-RC not only can control type I error rates but also can boost statistical power in all models. We also apply iECAT-RC to the UK Biobank data for M72 Fibroblastic disorders by considering genotype calling as the batch effect. Four SNPs associated with fibroblastic disorders have been detected by iECAT-RC and the other two comparison methods, iECAT-Score and Internal. However, our method has a higher probability of identifying these significant SNPs in the scenario of an unbalanced case-control association study
Tuning of spin-orbit coupling in metal-free conjugated polymers by structural conformation
Manipulating spin-orbit coupling (SOC) is a key achievement for spin-orbitronic applications since SOC determines spin-diffusion lengths and spin-to-charge conversion efficiencies. While in most organic semiconductors SOC is inherently very weak due to being composed of primarily light elements, the SOC in conjugated polymer systems is also intimately tied to the polymer's structural conformation and thus may be manipulated. Here we report a modification of SOC in conjugated polymers by altering the torsion angle between conjugated units. Spin-pumping experiments are performed on three poly(3-alkylthiophene) polymer films with decreasing conjugation lengths and concomitantly increasing torsion angle. The more twisted polymer exhibits a shorter spin-diffusion length and a giant spin-mixing conductance (up to 10^21 m^-2), which is attributed to an increased SOC by structural conformation. This work offers a route for enhancing SOC and spin-injection efficiency in organic materials for spintronic applications
Children's Non-symbolic and Symbolic Numerical Representations and Their Associations With Mathematical Ability
Most empirical evidence supports the view that non-symbolic and symbolic representations are foundations for advanced mathematical ability. However, the detailed development trajectories of these two types of representations in childhood are not very clear, nor are the different effects of non-symbolic and symbolic representations on the development of mathematical ability. We assessed 253 4- to 8-year-old children's non-symbolic and symbolic numerical representations, mapping skills, and mathematical ability, aiming to investigate the developmental trajectories and associations between these skills. Our results showed non-symbolic numerical representation emerged earlier than the symbolic one. Four-year-olds were capable of non-symbolic comparisons but not symbolic comparisons; five-year-olds performed better at non-symbolic comparisons than symbolic comparisons. This performance difference disappeared at age 6. Children at age 6 or older were able to map between symbolic and non-symbolic quantities. However, as children learn more about the symbolic representation system, their advantage in non-symbolic representation disappeared. Path analyses revealed a direct effect of children's symbolic numerical skills on their math performance, and an indirect effect of non-symbolic numerical skills on math performance via symbolic skills. These results suggest that symbolic numerical skills are a predominant factor affecting math performance in early childhood. However, the influences of symbolic and non-symbolic numerical skills on mathematical performance both decline with age
Exploiting Multiple Embeddings for Chinese Named Entity Recognition
Identifying the named entities mentioned in text would enrich many semantic
applications at the downstream level. However, due to the predominant usage of
colloquial language in microblogs, named entity recognition (NER) in
Chinese microblogs experiences significant performance deterioration, compared
with performing NER in formal Chinese corpus. In this paper, we propose a
simple yet effective neural framework to derive the character-level embeddings
for NER in Chinese text, named ME-CNER. A character embedding is derived with
rich semantic information harnessed at multiple granularities, ranging from
radical, character to word levels. The experimental results demonstrate that
the proposed approach achieves a large performance improvement on Weibo dataset
and comparable performance on MSRA news dataset with lower computational cost
against the existing state-of-the-art alternatives. Comment: accepted at CIKM 201
Loss-of-Function Genetic Screening Identifies Aldolase A as an Essential Driver for Liver Cancer Cell Growth Under Hypoxia
Background and aims: Hypoxia is a common feature of the tumor microenvironment (TME), which promotes tumor progression, metastasis, and therapeutic drug resistance through a myriad of cell activities in tumor and stroma cells. While targeting hypoxic TME is emerging as a promising strategy for treating solid tumors, preclinical development of this approach is lacking in the study of HCC.
Approach and results: From a genome-wide CRISPR/CRISPR-associated 9 gene knockout screening, we identified aldolase A (ALDOA), a key enzyme in glycolysis and gluconeogenesis, as an essential driver for HCC cell growth under hypoxia. Knockdown of ALDOA in HCC cells leads to lactate depletion and consequently inhibits tumor growth. Supplementation with lactate partly rescues the inhibitory effects mediated by ALDOA knockdown. Upon hypoxia, ALDOA is induced by hypoxia-inducible factor-1α and fat mass and obesity-associated protein-mediated N6-methyladenosine modification through transcriptional and posttranscriptional regulation, respectively. Analysis of The Cancer Genome Atlas shows that elevated levels of ALDOA are significantly correlated with poor prognosis of patients with HCC. In a screen of Food and Drug Administration-approved drugs based on structured hierarchical virtual platforms, we identified the sulfamonomethoxine derivative compound 5 (cpd-5) as a potential inhibitor to target ALDOA, evidenced by the antitumor activity of cpd-5 in preclinical patient-derived xenograft models of HCC.
Conclusions: Our work identifies ALDOA as an essential driver for HCC cell growth under hypoxia, and we demonstrate that inhibition of ALDOA in the hypoxic TME is a promising therapeutic strategy for treating HCC
RNA-binding protein RALY reprogrammes mitochondrial metabolism via mediating miRNA processing in colorectal cancer
Objective: Dysregulated cellular metabolism is a distinct hallmark of human colorectal cancer (CRC). However, metabolic programme rewiring during tumour progression has yet to be fully understood.
Design: We analysed altered gene signatures during colorectal tumour progression, and used a complex of molecular and metabolic assays to study the regulation of metabolism in CRC cell lines, human patient-derived xenograft mouse models and tumour organoid models.
Results: We identified a novel RNA-binding protein, RALY (also known as hnRNPCL2), that is highly associated with colorectal tumour aggressiveness. RALY acts as a key regulatory component in the Drosha complex, and promotes the post-transcriptional processing of a specific subset of miRNAs (miR-483, miR-676 and miR-877). These miRNAs systematically downregulate the expression of the metabolism-associated genes (ATP5I, ATP5G1, ATP5G3 and CYC1) and thereby reprogramme mitochondrial metabolism in the cancer cell. Analysis of The Cancer Genome Atlas (TCGA) reveals that increased levels of RALY are associated with poor prognosis in the patients with CRC expressing low levels of mitochondrion-associated genes. Mechanistically, induced processing of these miRNAs is facilitated by their N6-methyladenosine switch under reactive oxygen species (ROS) stress. Inhibition of the m6A methylation abolishes the RALY recognition of the terminal loop of the pri-miRNAs. Knockdown of RALY inhibits colorectal tumour growth and progression in vivo and in organoid models.
Conclusions: Collectively, our results reveal a critical metabolism-centric role of RALY in tumour progression, which may lead to cancer therapeutics targeting RALY for treating CRC
Spatiotemporal Genomic Profiling of Intestinal Metaplasia Reveals Clonal Dynamics of Gastric Cancer Progression
Intestinal metaplasia (IM) is a pre-malignant condition of the gastric mucosa associated with increased gastric cancer (GC) risk. Analyzing 1,256 gastric samples (1,152 IMs) across 692 subjects from a prospective 10-year study, we identify 26 IM driver genes in diverse pathways including chromatin regulation (ARID1A) and intestinal homeostasis (SOX9). Single-cell and spatial profiles highlight changes in tissue ecology and IM lineage heterogeneity, including an intestinal stem-cell dominant cellular compartment linked to early malignancy. Expanded transcriptome profiling reveals expression-based molecular subtypes of IM associated with incomplete histology, antral/intestinal cell types, ARID1A mutations, inflammation, and microbial communities normally associated with the healthy oral tract. We demonstrate that combined clinical-genomic models outperform clinical-only models in predicting IMs likely to transform to GC. By highlighting strategies for accurately identifying IM patients at high GC risk and a role for microbial dysbiosis in IM progression, our results raise opportunities for GC precision prevention and interception
Vaccination against type 1 angiotensin receptor prevents streptozotocin-induced diabetic nephropathy
STATISTICAL METHODS FOR CONTROLLING POPULATION STRATIFICATION AND GENE-BASED ASSOCIATION STUDIES
This dissertation includes three papers, each presented in one chapter. In chapter 1, we use extensive simulation studies and real data studies to evaluate the performance of linkage disequilibrium score regression (LDSC) for controlling population stratification. In chapter 2, we propose a gene-based statistical method that leverages gene expression (GE) measurements and polygenic risk scores (PRS) to identify genes that are associated with a phenotype of interest. In simulation studies, the proposed method has correct type I error rates and can boost power compared with other methods that use either gene expression or PRS in association tests. The real data analysis based on UK Biobank data for asthma shows that the proposed method is also applicable to GWAS. In chapter 3, we analytically derive the distribution of TOW test statistics and modify TOW to utilize GWAS summary statistics (TOW-S). Simulation studies show that TOW-S has correct type I error rates and can retain power among all scenarios