276 research outputs found
On Stein's Identity and Near-Optimal Estimation in High-dimensional Index Models
We consider estimating the parametric components of semi-parametric multiple
index models in a high-dimensional and non-Gaussian setting. Such models form a
rich class of non-linear models with applications to signal processing, machine
learning and statistics. Our estimators leverage the score function based first
and second-order Stein's identities and do not require the covariates to
satisfy Gaussian or elliptical symmetry assumptions common in the literature.
Moreover, to handle score functions and responses that are heavy-tailed, our
estimators are constructed by carefully thresholding their empirical
counterparts. We show that our estimators achieve a near-optimal statistical rate
of convergence in several settings. We supplement our theoretical results with
simulation experiments that confirm the theory.
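For orientation, the first- and second-order Stein's identities that the abstract refers to can be written as below. This is a sketch in standard textbook form, not the paper's exact statement: the symbols $X$, $p$, $S$, $T$, $g$, $f$, $\beta$ and $\varepsilon$ are notation assumed here, and the regularity conditions (differentiability, decay of $g\,p$ at infinity) are left implicit.

```latex
% Assumed notation: X in R^d has a twice-differentiable density p;
% g : R^d -> R is a smooth test function with suitable integrability.
\begin{align*}
  S(x) &= -\nabla \log p(x), &
  \mathbb{E}\bigl[g(X)\,S(X)\bigr] &= \mathbb{E}\bigl[\nabla g(X)\bigr]
    && \text{(first order)} \\
  T(x) &= \frac{\nabla^2 p(x)}{p(x)}, &
  \mathbb{E}\bigl[g(X)\,T(X)\bigr] &= \mathbb{E}\bigl[\nabla^2 g(X)\bigr]
    && \text{(second order)}
\end{align*}
% Illustration for a single-index model Y = f(<beta, X>) + eps with
% E[eps | X] = 0: the first-order identity applied to g(x) = f(<beta, x>) gives
\begin{equation*}
  \mathbb{E}\bigl[Y\,S(X)\bigr]
    = \mathbb{E}\bigl[f'(\langle \beta, X\rangle)\bigr]\,\beta ,
\end{equation*}
% so E[Y S(X)] is proportional to beta whenever E[f'] is nonzero.
```

When $X$ is standard Gaussian, $S(x) = x$ and the first-order identity reduces to the classical Gaussian Stein's lemma. Working with the score function is what allows the Gaussian assumption to be dropped.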
Copy number variation at leptin receptor gene locus associated with metabolic traits and the risk of type 2 diabetes mellitus
Background
Recent efforts have been made to link complex human traits and disease susceptibility to DNA copy numbers. The leptin receptor (LEPR) has been implicated in obesity and diabetes. Mutations and genetic variations of the LEPR gene have been discovered in rodents and humans. However, the association of DNA copy number variations at the LEPR gene locus with human complex diseases has not been reported. In an attempt to study DNA copy number variations associated with metabolic traits and type 2 diabetes mellitus (T2DM), we targeted the LEPR gene locus in DNA copy number analyses.
Results
We identified DNA copy number variations at the LEPR gene locus among a Korean population using genome-wide SNP chip data, and then quantified copy numbers of the E2 DNA sequence in the first two exons overlapped between the LEPR and LEPROT genes by the quantitative multiplex PCR of short fluorescent fragment (QMPSF) method. Among the non-diabetic subjects (n = 1,067), lower E2 DNA copy numbers were associated with higher fasting glucose levels in men (p = 1.24 × 10⁻⁷) and women (p = 9.45 × 10⁻⁵), as well as higher total cholesterol levels in men (p = 9.96 × 10⁻⁷). In addition, the significant association between lower E2 DNA copy numbers and lower levels of postprandial 2-hr insulin was evident only in non-diabetic women, whereas some obesity-related phenotypes and total cholesterol level exhibited significant associations only in non-diabetic men. Logistic regression analysis indicated that lower E2 DNA copy numbers were associated with T2DM (odds ratio, 1.92; 95% CI, 1.26–2.96; p < 0.003) in our nested case-control study. Interestingly, the E2 DNA copy number exhibited a negative correlation with LEPR gene expression, but a positive correlation with LEPROT gene expression.
Conclusions
This work suggests that a structural variation at the LEPR gene locus is functionally associated with complex metabolic traits and the risk of T2DM.
HLAscan: genotyping of the HLA region using next-generation sequencing data
Background
Several recent studies showed that next-generation sequencing (NGS)-based human leukocyte antigen (HLA) typing is a feasible and promising technique for variant calling of highly polymorphic regions. To date, however, no method with sufficient read depth has completely solved the allele phasing issue. In this study, we developed a new method (HLAscan) for HLA genotyping using NGS data.
Results
HLAscan performs alignment of reads to HLA sequences from the international ImMunoGeneTics project/human leukocyte antigen (IMGT/HLA) database. The distribution of aligned reads was used to calculate a score function to determine correctly phased alleles by progressively removing false-positive alleles. Comparative HLA typing tests using public datasets from the 1000 Genomes Project and the International HapMap Project demonstrated that HLAscan could perform HLA typing more accurately than previously reported NGS-based methods such as HLAreporter and PHLAT. In addition, the results of HLA-A, -B, and -DRB1 typing by HLAscan using data generated by NextGen were identical to those obtained using a Sanger sequencing–based method. We also applied HLAscan to a family dataset with various coverage depths generated on the Illumina HiSeq X-TEN platform. HLAscan identified allele types of HLA-A, -B, -C, -DQB1, and -DRB1 with 100% accuracy for sequences at ≥ 90× depth, and the overall accuracy was 96.9%.
Conclusions
HLAscan, an alignment-based program that takes read distribution into account to determine true allele types, outperformed previously developed HLA typing tools. Therefore, HLAscan can be reliably applied to determine HLA types from whole-genome, exome, and target sequences.
Genomic profile analysis of diffuse-type gastric cancers
Background: Stomach cancer is the third deadliest among all cancers worldwide. Although incidence of the intestinal-type gastric cancer has decreased, the incidence of diffuse-type is still increasing and its progression is notoriously aggressive. There is insufficient information on genome variations of diffuse-type gastric cancer because its cells are usually mixed with normal cells, and this low cellularity has made it difficult to analyze the genome.
Results: We analyze whole genomes and corresponding exomes of diffuse-type gastric cancer, using matched tumor and normal samples from 14 diffuse-type and five intestinal-type gastric cancer patients. Somatic variations found in the diffuse-type gastric cancer are compared to those of the intestinal-type and to previously reported variants. We determine the average exonic somatic mutation rate of the two types. We find associated candidate driver genes, and identify seven novel somatic mutations in CDH1, which is a well-known gastric cancer-associated gene. Three-dimensional structure analysis of the mutated E-cadherin protein suggests that these new somatic mutations could cause significant functional perturbations of critical calcium-binding sites in the EC1-2 junction. Chromosomal instability analysis shows that the MDM2 gene is amplified. After thorough structural analysis, a novel fusion gene TSC2-RNF216 is identified, which may simultaneously disrupt tumor-suppressive pathways and activate tumorigenesis.
Conclusions: We report the genomic profile of diffuse-type gastric cancers including new somatic variations, a novel fusion gene, and amplification and deletion of certain chromosomal regions that contain oncogenes and tumor suppressors.
Resistance to TGFβ suppression and improved anti-tumor responses in CD8+ T cells lacking PTPN22
Transforming growth factor β (TGFβ) is important in maintaining self-tolerance and inhibits T cell reactivity. We show that CD8⁺ T cells that lack the tyrosine phosphatase Ptpn22, a major predisposing gene for autoimmune disease, are resistant to the suppressive effects of TGFβ. Resistance to TGFβ suppression, while disadvantageous in autoimmunity, helps Ptpn22⁻/⁻ T cells to be intrinsically superior at clearing established tumors that secrete TGFβ. Mechanistically, loss of Ptpn22 increases the capacity of T cells to produce IL-2, which overcomes TGFβ-mediated suppression. These data suggest that a viable strategy to improve anti-tumor adoptive cell therapy may be to engineer tumor-restricted T cells with mutations identified as risk factors for autoimmunity.
Implication of Genetic Variants Near TCF7L2, SLC30A8, HHEX, CDKAL1, CDKN2A/B, IGF2BP2, and FTO in Type 2 Diabetes and Obesity in 6,719 Asians
OBJECTIVE: Recent genome-wide association studies have identified six novel genes for type 2 diabetes and obesity and confirmed TCF7L2 as the major type 2 diabetes gene to date in Europeans. However, the implications of these genes in Asians are unclear.
Gene Flow between the Korean Peninsula and Its Neighboring Countries
SNP markers provide the primary data for population structure analysis. In this study, we employed whole-genome autosomal SNPs as a marker set (54,836 SNP markers) and tested their possible effects on genetic ancestry using 320 subjects covering 24 regional groups including Northern (n = 16) and Southern (n = 3) Asians, Amerindians (n = 1), and four HapMap populations (YRI, CEU, JPT, and CHB). Additionally, we evaluated the effectiveness and robustness of 50K autosomal SNPs with various clustering methods, along with their dependencies on recombination hotspots (RH), linkage disequilibrium (LD), missing calls and regional specific markers. The RH- and LD-free multi-dimensional scaling (MDS) method showed a broad picture of human migration from Africa to North-East Asia on our genome map, supporting results from previous haploid DNA studies. Of the Asian groups, the East Asian group showed greater differentiation than the Northern and Southern Asian groups with respect to Fst statistics. By extension, the analysis of monomorphic markers implied that nine out of ten historical regions in South Korea, and Tokyo in Japan, showed signs of genetic drift caused by the later settlement of East Asia (South Korea, Japan and China), while Gyeongju in South East Korea showed signs of the earliest settlement in East Asia. In the genome map, the gene flow to the Korean Peninsula from its neighboring countries indicated that some genetic signals from Northern populations such as the Siberians and Mongolians still remain in the South East and West regions, while few signals remain from the early Southern lineages.