
    Prediction of the Gene Expression in Normal Lung Tissue by the Gene Expression in Blood

    Background: Comparative analysis of gene expression in human tissues is important for understanding the molecular mechanisms underlying tissue-specific control of gene expression. It can also open an avenue for using gene expression in blood (the most easily accessible human tissue) to predict gene expression in other, less accessible tissues, which would facilitate the development of novel gene-expression-based models for assessing disease risk and progression. Until recently, direct comparative analysis across different tissues was not possible due to the scarcity of paired tissue samples from the same individuals. Methods: In this study we used paired whole blood/lung gene expression data from the Genotype-Tissue Expression (GTEx) project. We built a generalized linear regression model for each gene using gene expression in lung as the outcome and gene expression in blood, age and gender as predictors. Results: For ~18% of the genes, gene expression in blood was a significant predictor of gene expression in lung. We found that the number of single nucleotide polymorphisms (SNPs) influencing expression of a given gene in either blood or lung, that is, the number of expression quantitative trait loci (eQTLs), was positively associated with the efficacy of blood-based prediction of that gene’s expression in lung. This association was strongest for shared eQTLs: those influencing gene expression in both blood and lung. Conclusions: For a considerable number of human genes, expression levels in lung can be predicted from observable gene expression in blood. An abundance of shared eQTLs may explain the strong blood/lung correlations in gene expression.
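The per-gene model described in the Methods can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes the simplest case of a generalized linear model (Gaussian errors with an identity link, i.e. ordinary least squares), uses synthetic data for a single hypothetical gene, and omits the significance test. The function names `solve` and `fit_gene` and all simulated coefficients are assumptions made for the example.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_gene(lung, blood, age, sex):
    """Least-squares fit of lung ~ intercept + blood + age + sex via the normal equations."""
    rows = [[1.0, b, a, s] for b, a, s in zip(blood, age, sex)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, lung)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic paired samples for one hypothetical gene (all values are made up).
rng = random.Random(0)
n = 200
blood = [rng.gauss(5.0, 1.0) for _ in range(n)]
age = [rng.uniform(20, 70) for _ in range(n)]
sex = [float(rng.randint(0, 1)) for _ in range(n)]
lung = [
    1.0 + 0.8 * b + 0.01 * a + 0.2 * s + rng.gauss(0.0, 0.3)
    for b, a, s in zip(blood, age, sex)
]

coef = fit_gene(lung, blood, age, sex)
print("intercept, blood, age, sex coefficients:", coef)
```

In a genome-wide setting the same fit would be repeated for every gene, and the blood coefficient would be tested for significance with a correction for multiple testing.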

    Joint tests for quantitative trait loci in experimental crosses

    Selective genotyping is common because it can increase the expected correlation between QTL genotype and phenotype and thus increase the statistical power of linkage tests (i.e., regression-based tests). Linkage can also be tested by assessing whether the marginal genotypic distribution conforms to its expectation, a marginal-based test. We developed a class of joint tests that, by constraining intercepts in regression-based analyses, capitalize on the information available in both regression-based and marginal-based tests. We simulated data corresponding to the null hypothesis of no QTL effect and the alternative of some QTL effect at the locus for a backcross and an F2 intercross between inbred strains. Regression-based and marginal-based tests were compared to corresponding joint tests. We studied the effects of random sampling, selective sampling from a single tail of the phenotypic distribution, and selective sampling from both tails of the phenotypic distribution. Joint tests were nearly as powerful as all competing alternatives for random sampling and two-tailed selection under both backcross and F2 intercross situations. Joint tests were generally more powerful for one-tailed selection under both backcross and F2 intercross situations. However, joint tests cannot be recommended for one-tailed selective genotyping if segregation distortion is suspected.

    Renewal Strings for Cleaning Astronomical Databases

    Large astronomical databases obtained from sky surveys such as the SuperCOSMOS Sky Surveys (SSS) invariably suffer from a small number of spurious records coming from artefactual effects of the telescope, satellites and junk objects in orbit around Earth, and physical defects on the photographic plate or CCD. Though relatively small in number, these spurious records present a significant problem in many situations where they can become a large proportion of the records potentially of interest to a given astronomer. In this paper we focus on the four most common causes of unwanted records in the SSS: satellite or aeroplane tracks; scratches, fibres and other linear phenomena introduced to the plate; circular halos around bright stars due to internal reflections within the telescope; and diffraction spikes near bright stars. Accurate and robust techniques are needed for locating and flagging such spurious objects. We have developed renewal strings, a probabilistic technique combining the Hough transform, renewal processes and hidden Markov models, which have proven highly effective in this context. The methods are applied to the SSS data to develop a dataset of spurious object detections, along with confidence measures, which can allow this unwanted data to be removed from consideration. These methods are general and can be adapted to any future astronomical survey data. Comment: Appears in Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence (UAI2003).

    Incorporating multiple sets of eQTL weights into gene-by-environment interaction analysis identifies novel susceptibility loci for pancreatic cancer.

    It is of great scientific interest to identify interactions between genetic variants and environmental exposures that may modify the risk of complex diseases. However, larger sample sizes are usually required to detect gene-by-environment interaction (G × E) than required to detect genetic main association effects. To boost the statistical power and improve the understanding of the underlying molecular mechanisms, we incorporate functional genomics information, specifically, expression quantitative trait loci (eQTLs), into a data-adaptive G × E test, called aGEw. This test adaptively chooses the best eQTL weights from multiple tissues and provides an extra layer of weighting at the genetic variant level. Extensive simulations show that the aGEw test can control the Type 1 error rate, and the power is resilient to the inclusion of neutral variants and noninformative external weights. We applied the proposed aGEw test to the Pancreatic Cancer Case-Control Consortium (discovery cohort of 3,585 cases and 3,482 controls) and the PanScan II genome-wide association study data (replication cohort of 2,021 cases and 2,105 controls) with smoking as the exposure of interest. Two novel putative smoking-related pancreatic cancer susceptibility genes, TRIP10 and KDM3A, were identified. The aGEw test is implemented in an R package aGE. We thank the two anonymous reviewers for their constructive comments. This research was supported by the National Institutes of Health (NIH) grant R01CA169122; P.W. was supported by NIH grants R01HL116720 and R21HL126032. S.H.O. was supported by NIH grant P30CA008748. R.E.N. and the Queensland Pancreatic Cancer Study were funded by the Australian National Health and Medical Research Council. The authors thank Ms. Jessica Swann and the National Institute of Statistical Sciences writing workshop for editorial assistance and suggestions. The authors acknowledge the Texas Advanced Computing Center at The University of Texas at Austin for providing computing resources. The authors alone are responsible for the views expressed in this article and they do not necessarily represent the views, decisions or policies of the institutions with which they are affiliated. The authors declare that there is no conflict of interest.

    Skipper-in-CMOS: Non-Destructive Readout with Sub-Electron Noise Performance for Pixel Detectors

    The Skipper-in-CMOS image sensor integrates the non-destructive readout capability of Skipper Charge Coupled Devices (Skipper-CCDs) with the high conversion gain of a pinned photodiode in a CMOS imaging process, while taking advantage of in-pixel signal processing. This allows both single photon counting and high frame rate readout through highly parallel processing. The first results obtained from a 15 × 15 µm² pixel cell of a Skipper-in-CMOS sensor fabricated in Tower Semiconductor's commercial 180 nm CMOS Image Sensor process are presented. Measurements confirm the expected reduction of the readout noise with the number of samples down to deep sub-electron noise of 0.15 e- rms, demonstrating the charge transfer operation from the pinned photodiode and the single photon counting operation when the sensor is exposed to light. The article also discusses new testing strategies employed for its operation and characterization. Comment: 7 pages, 10 figures.
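The reduction of readout noise with the number of non-destructive samples can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of the actual sensor: it assumes each sample carries independent Gaussian noise, in which case averaging N samples lowers the noise by a factor of 1/sqrt(N). The single-sample noise of 3.0 e- is a hypothetical value chosen for illustration and is not taken from the abstract.

```python
import math
import random
import statistics


def noise_after_averaging(n_samples, single_read_noise, n_trials, rng):
    """Standard deviation of the mean of n_samples independent noisy reads of an empty pixel."""
    averages = []
    for _ in range(n_trials):
        reads = [rng.gauss(0.0, single_read_noise) for _ in range(n_samples)]
        averages.append(sum(reads) / n_samples)
    return statistics.pstdev(averages)


rng = random.Random(42)
SINGLE_READ_NOISE = 3.0  # hypothetical single-sample noise in electrons (assumption)

results = {}
for n in (1, 16, 100, 400):
    measured = noise_after_averaging(n, SINGLE_READ_NOISE, 2000, rng)
    expected = SINGLE_READ_NOISE / math.sqrt(n)
    results[n] = (measured, expected)
    print(f"N={n:4d}  simulated noise={measured:.3f} e-  1/sqrt(N) scaling={expected:.3f} e-")
```

Under these assumptions, reaching a given noise target requires a number of samples that grows with the square of the improvement factor, which is why non-destructive readout trades readout time for noise.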

    Transport in holographic superfluids

    We construct a slowly varying space-time dependent holographic superfluid and compute its transport coefficients. Our solution is presented as a series expansion in inverse powers of the charge of the order parameter. We find that the shear viscosity associated with the motion of the condensate vanishes. The diffusion coefficient of the superfluid is continuous across the phase transition while its third bulk viscosity is found to diverge at the critical temperature. As was previously shown, the ratio of the shear viscosity of the normal component to the entropy density is 1/(4π). As a consequence of our analysis we obtain an analytic expression for the backreacted metric near the phase transition for a particular type of holographic superfluid. Comment: 45 pages + appendices.

    Theoretical Formulation of Principal Components Analysis to Detect and Correct for Population Stratification

    The Eigenstrat method, based on principal components analysis (PCA), is commonly used both to quantify population relationships in population genetics and to correct for population stratification in genome-wide association studies. However, it can be difficult to make appropriate inference about population relationships from the principal component (PC) scatter plot. Here, to better understand the working mechanism of the Eigenstrat method, we consider its theoretical or “population” formulation. The eigen-equation for samples from an arbitrary number () of populations is reduced to that of a matrix of dimension , the elements of which are determined by the variance-covariance matrix for the random vector of the allele frequencies. Solving the reduced eigen-equation is numerically trivial and yields eigenvectors that are the axes of variation required for differentiating the populations. Using the reduced eigen-equation, we investigate the within-population fluctuations around the axes of variation on the PC scatter plot for simulated datasets. Specifically, we show that there exists an asymptotically stable pattern of the PC plot for large sample sizes. Our results provide theoretical guidance for interpreting the pattern of the PC plot in terms of population relationships. For applications in genetic association tests, we demonstrate that, as a method of correcting for population stratification, regressing out the theoretical PCs corresponding to the axes of variation is equivalent to simply removing the population mean of allele counts and works as well as or better than the Eigenstrat method.
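The last point, that removing the population mean of allele counts corrects for stratification, can be illustrated with a toy simulation. This is a hedged sketch and not the paper's analysis: it assumes two populations with known membership, a single SNP with no causal effect, and hypothetical allele frequencies and phenotype means chosen so that the populations differ in both. The helper `pearson` and all numeric values are assumptions made for the example.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


rng = random.Random(1)
allele_freq = {"A": 0.2, "B": 0.8}     # hypothetical allele frequencies per population
phenotype_mean = {"A": 0.0, "B": 1.0}  # hypothetical phenotype means; the SNP has no effect

pops, genotype, phenotype = [], [], []
for pop in ("A", "B"):
    for _ in range(1000):
        pops.append(pop)
        # Allele count (0, 1 or 2) from two independent draws at the population frequency.
        genotype.append(sum(rng.random() < allele_freq[pop] for _ in range(2)))
        phenotype.append(rng.gauss(phenotype_mean[pop], 1.0))

# Naive association: confounded by population membership.
raw_r = pearson(genotype, phenotype)

# Correction: subtract each population's mean allele count from its members.
pop_mean = {
    pop: sum(g for g, p in zip(genotype, pops) if p == pop) / pops.count(pop)
    for pop in ("A", "B")
}
centered = [g - pop_mean[p] for g, p in zip(genotype, pops)]
adjusted_r = pearson(centered, phenotype)

print(f"correlation before centering: {raw_r:.3f}")
print(f"correlation after centering:  {adjusted_r:.3f}")
```

In real data population labels are usually unknown, which is where PCA-based axes of variation come in; the sketch only shows why centering within populations removes the spurious signal when the labels are available.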