Prediction of the Gene Expression in Normal Lung Tissue by the Gene Expression in Blood
Background: Comparative analysis of gene expression in human tissues is important for understanding the molecular mechanisms underlying tissue-specific control of gene expression. It can also open an avenue for using gene expression in blood (the most easily accessible human tissue) to predict gene expression in other, less accessible tissues, which would facilitate the development of novel gene-expression-based models for assessing disease risk and progression. Until recently, direct comparative analysis across different tissues was not possible due to the scarcity of paired tissue samples from the same individuals. Methods: In this study we used paired whole blood/lung gene expression data from the Genotype-Tissue Expression (GTEx) project. We built a generalized linear regression model for each gene using gene expression in lung as the outcome and gene expression in blood, age and gender as predictors. Results: For ~18% of the genes, gene expression in blood was a significant predictor of gene expression in lung. We found that the number of single nucleotide polymorphisms (SNPs) influencing expression of a given gene in either blood or lung, also known as the number of expression quantitative trait loci (eQTLs), was positively associated with the efficacy of blood-based prediction of that gene’s expression in lung. This association was strongest for shared eQTLs: those influencing gene expression in both blood and lung. Conclusions: For a considerable number of human genes, expression levels in lung can be predicted using observable gene expression in blood. An abundance of shared eQTLs may explain the strong blood/lung correlations in gene expression.
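The per-gene model described in the Methods can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' pipeline: it assumes a Gaussian model with identity link (ordinary least squares) as the simplest member of the generalized linear family, and the function name `fit_gene`, the sample size and the effect sizes are all hypothetical.

```python
import numpy as np

def fit_gene(lung, blood, age, sex):
    """Least-squares fit of lung expression on blood expression, age and sex.

    Returns the coefficient vector (intercept, blood, age, sex) and the
    t statistic of the blood coefficient.
    """
    n = lung.shape[0]
    X = np.column_stack([np.ones(n), blood, age, sex])
    beta, _, _, _ = np.linalg.lstsq(X, lung, rcond=None)
    resid = lung - X @ beta
    dof = n - X.shape[1]
    sigma2 = resid @ resid / dof            # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)   # covariance of the coefficients
    t_blood = beta[1] / np.sqrt(cov[1, 1])
    return beta, t_blood

# Synthetic data for one hypothetical gene (all values are assumptions).
rng = np.random.default_rng(0)
n = 200
blood = rng.normal(size=n)
age = rng.uniform(20, 70, size=n)
sex = rng.integers(0, 2, size=n).astype(float)
lung = 0.8 * blood + 0.01 * age + rng.normal(scale=0.5, size=n)

beta, t_blood = fit_gene(lung, blood, age, sex)
```

In a genome-wide setting the same fit would be repeated for every gene, with the resulting test statistics corrected for multiple testing before calling blood expression a significant predictor.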
Joint tests for quantitative trait loci in experimental crosses
Selective genotyping is common because it can increase the expected correlation between QTL genotype and phenotype and thus increase the statistical power of linkage tests (i.e., regression-based tests). Linkage can also be tested by assessing whether the marginal genotypic distribution conforms to its expectation, a marginal-based test. We developed a class of joint tests that, by constraining intercepts in regression-based analyses, capitalize on the information available in both regression-based and marginal-based tests. We simulated data corresponding to the null hypothesis of no QTL effect and the alternative of some QTL effect at the locus for a backcross and an F2 intercross between inbred strains. Regression-based and marginal-based tests were compared to corresponding joint tests. We studied the effects of random sampling, selective sampling from a single tail of the phenotypic distribution, and selective sampling from both tails of the phenotypic distribution. Joint tests were nearly as powerful as all competing alternatives for random sampling and two-tailed selection under both backcross and F2 intercross situations. Joint tests were generally more powerful for one-tailed selection under both backcross and F2 intercross situations. However, joint tests cannot be recommended for one-tailed selective genotyping if segregation distortion is suspected.
Renewal Strings for Cleaning Astronomical Databases
Large astronomical databases obtained from sky surveys such as the
SuperCOSMOS Sky Surveys (SSS) invariably suffer from a small number of spurious
records coming from artefactual effects of the telescope, satellites and junk
objects in orbit around earth and physical defects on the photographic plate or
CCD. Though relatively small in number these spurious records present a
significant problem in many situations where they can become a large proportion
of the records potentially of interest to a given astronomer. In this paper we
focus on the four most common causes of unwanted records in the SSS: satellite
or aeroplane tracks, scratches, fibres and other linear phenomena introduced to
the plate, circular halos around bright stars due to internal reflections
within the telescope and diffraction spikes near to bright stars. Accurate and
robust techniques are needed for locating and flagging such spurious objects.
We have developed renewal strings, a probabilistic technique combining the
Hough transform, renewal processes and hidden Markov models which have proven
highly effective in this context. The methods are applied to the SSS data to
develop a dataset of spurious object detections, along with confidence
measures, which can allow this unwanted data to be removed from consideration.
These methods are general and can be adapted to any future astronomical survey
data. Comment: Appears in Proceedings of the Nineteenth Conference on Uncertainty in
Artificial Intelligence (UAI2003)
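Of the three ingredients named in the abstract, the Hough transform is the easiest to illustrate in isolation. The sketch below shows only that voting step on synthetic catalogue positions; it does not attempt the renewal-process or hidden Markov model parts of renewal strings, and the function name `hough_peak`, the bin sizes and the simulated track are assumptions made for the example.

```python
import numpy as np

def hough_peak(points, n_theta=180, rho_res=1.0):
    """Vote in (theta, rho) space and return the strongest line.

    A line is parameterised as x*cos(theta) + y*sin(theta) = rho.
    Returns (theta, rho, votes) for the accumulator cell with most votes.
    """
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    rho_max = np.hypot(points[:, 0], points[:, 1]).max()
    n_rho = int(np.ceil(2 * rho_max / rho_res)) + 1
    acc = np.zeros((n_theta, n_rho), dtype=int)
    # rho for every (point, theta) pair, shape (n_points, n_theta)
    rhos = points[:, [0]] * np.cos(thetas) + points[:, [1]] * np.sin(thetas)
    rho_idx = np.round((rhos + rho_max) / rho_res).astype(int)
    theta_idx = np.broadcast_to(np.arange(n_theta), rho_idx.shape)
    np.add.at(acc, (theta_idx, rho_idx), 1)   # one vote per (point, theta)
    t_best, r_best = np.unravel_index(acc.argmax(), acc.shape)
    return thetas[t_best], r_best * rho_res - rho_max, acc[t_best, r_best]

# Synthetic field: 200 genuine objects plus a 60-record track along x = 40.
rng = np.random.default_rng(0)
background = rng.uniform(0.0, 100.0, size=(200, 2))
track = np.column_stack([np.full(60, 40.0), rng.uniform(0.0, 100.0, size=60)])
points = np.vstack([background, track])

theta, rho, votes = hough_peak(points)
```

Records lying close to the recovered line are candidates for flagging; deciding which of them are truly spurious is where the probabilistic modelling described in the abstract would come in.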
Incorporating multiple sets of eQTL weights into gene-by-environment interaction analysis identifies novel susceptibility loci for pancreatic cancer.
It is of great scientific interest to identify interactions between genetic variants and environmental exposures that may modify the risk of complex diseases. However, larger sample sizes are usually required to detect gene-by-environment interaction (G × E) than required to detect genetic main association effects. To boost the statistical power and improve the understanding of the underlying molecular mechanisms, we incorporate functional genomics information, specifically, expression quantitative trait loci (eQTLs), into a data-adaptive G × E test, called aGEw. This test adaptively chooses the best eQTL weights from multiple tissues and provides an extra layer of weighting at the genetic variant level. Extensive simulations show that the aGEw test can control the Type 1 error rate, and the power is resilient to the inclusion of neutral variants and noninformative external weights. We applied the proposed aGEw test to the Pancreatic Cancer Case-Control Consortium (discovery cohort of 3,585 cases and 3,482 controls) and the PanScan II genome-wide association study data (replication cohort of 2,021 cases and 2,105 controls) with smoking as the exposure of interest. Two novel putative smoking-related pancreatic cancer susceptibility genes, TRIP10 and KDM3A, were identified. The aGEw test is implemented in an R package aGE.
We thank the two anonymous reviewers for their constructive comments. This research was supported
by the National Institutes of Health (NIH) grant R01CA169122; P.W. was supported by NIH
grants R01HL116720 and R21HL126032. S.H.O. was supported by NIH grant P30CA008748.
R.E.N. and the Queensland Pancreatic Cancer Study were funded by the Australian National
Health and Medical Research Council. The authors thank Ms. Jessica Swann and the National
Institute of Statistical Sciences writing workshop for editorial assistance and suggestions. The authors
acknowledge the Texas Advanced Computing Center at The University of Texas at Austin
for providing computing resources. The authors alone are responsible for the views expressed in
this article and they do not necessarily represent the views, decisions or policies of the institutions
with which they are affiliated. The authors declare that there is no conflict of interest.
Skipper-in-CMOS: Non-Destructive Readout with Sub-Electron Noise Performance for Pixel Detectors
The Skipper-in-CMOS image sensor integrates the non-destructive readout
capability of Skipper Charge Coupled Devices (Skipper-CCDs) with the high
conversion gain of a pinned photodiode in a CMOS imaging process, while taking
advantage of in-pixel signal processing. This allows both single photon
counting as well as high frame rate readout through highly parallel processing.
The first results obtained from a 15 x 15 um^2 pixel cell of a Skipper-in-CMOS
sensor fabricated in Tower Semiconductor's commercial 180 nm CMOS Image Sensor
process are presented. Measurements confirm the expected reduction of the
readout noise with the number of samples down to deep sub-electron noise of
0.15 e- rms, demonstrating the charge transfer operation from the pinned
photodiode and the single photon counting operation when the sensor is exposed
to light. The article also discusses new testing strategies employed for its
operation and characterization. Comment: 7 pages, 10 figures
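The reduction of readout noise with the number of samples can be illustrated with a small simulation. This sketch assumes that each non-destructive sample carries independent Gaussian noise, in which case averaging N samples lowers the noise by a factor of sqrt(N). The single-sample noise of 2 e- rms and the function name `averaged_read_noise` are hypothetical values chosen for illustration and are not taken from the paper.

```python
import numpy as np

def averaged_read_noise(sigma_single, n_samples, n_pixels=5000, seed=0):
    """Empirical noise (in e- rms) after averaging n_samples reads per pixel.

    Assumes independent Gaussian noise of width sigma_single on every read.
    """
    rng = np.random.default_rng(seed)
    reads = rng.normal(0.0, sigma_single, size=(n_pixels, n_samples))
    return reads.mean(axis=1).std(ddof=1)

sigma_single = 2.0  # hypothetical single-sample noise in e- rms
results = {}
for n in (1, 4, 16, 64, 256):
    measured = averaged_read_noise(sigma_single, n)
    expected = sigma_single / np.sqrt(n)
    results[n] = (measured, expected)
    print(f"N={n:4d}  measured={measured:.3f}  expected={expected:.3f}")
```

Under these assumptions, reaching 0.15 e- rms from a 2 e- rms single sample would take roughly (2 / 0.15)^2, or about 180, samples. Correlated or low-frequency noise would make the real scaling less favourable than this idealised model.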
Transport in holographic superfluids
We construct a slowly varying space-time dependent holographic superfluid and
compute its transport coefficients. Our solution is presented as a series
expansion in inverse powers of the charge of the order parameter. We find that
the shear viscosity associated with the motion of the condensate vanishes. The
diffusion coefficient of the superfluid is continuous across the phase
transition while its third bulk viscosity is found to diverge at the critical
temperature. As was previously shown, the ratio of the shear viscosity of the
normal component to the entropy density is 1/(4 pi). As a consequence of our
analysis we obtain an analytic expression for the backreacted metric near the
phase transition for a particular type of holographic superfluid. Comment: 45 pages + appendices
Theoretical Formulation of Principal Components Analysis to Detect and Correct for Population Stratification
The Eigenstrat method, based on principal components analysis (PCA), is commonly used both to quantify population relationships in population genetics and to correct for population stratification in genome-wide association studies. However, it can be difficult to make appropriate inference about population relationships from the principal component (PC) scatter plot. Here, to better understand the working mechanism of the Eigenstrat method, we consider its theoretical or “population” formulation. The eigen-equation for samples from an arbitrary number (K) of populations is reduced to that of a matrix of dimension K, the elements of which are determined by the variance-covariance matrix for the random vector of the allele frequencies. Solving the reduced eigen-equation is numerically trivial and yields eigenvectors that are the axes of variation required for differentiating the populations. Using the reduced eigen-equation, we investigate the within-population fluctuations around the axes of variation on the PC scatter plot for simulated datasets. Specifically, we show that there exists an asymptotically stable pattern of the PC plot for large sample size. Our results provide theoretical guidance for interpreting the pattern of the PC plot in terms of population relationships. For applications in genetic association tests, we demonstrate that, as a method of correcting for population stratification, regressing out the theoretical PCs corresponding to the axes of variation is equivalent to simply removing the population mean of allele counts and works as well as or better than the Eigenstrat method.
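The sample-level PCA that the abstract takes as its starting point can be sketched on simulated genotypes. This is an illustration of the standard empirical procedure (centre each SNP, scale by the square root of p(1 - p), take the leading singular vectors), not the reduced eigen-equation derived in the paper. The two-population setup, the allele-frequency model and the function name `top_pcs` are assumptions made for the example.

```python
import numpy as np

def top_pcs(genotypes, n_pcs=2):
    """PC scores from an (individuals x SNPs) matrix of 0/1/2 allele counts."""
    p_hat = genotypes.mean(axis=0) / 2.0
    keep = (p_hat > 0.0) & (p_hat < 1.0)     # drop monomorphic SNPs
    g = genotypes[:, keep].astype(float)
    p = p_hat[keep]
    x = (g - 2.0 * p) / np.sqrt(p * (1.0 - p))   # centre and scale each SNP
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    return u[:, :n_pcs] * s[:n_pcs]

# Two hypothetical populations whose allele frequencies differ slightly.
rng = np.random.default_rng(1)
n_per_pop, n_snps = 100, 2000
p1 = rng.uniform(0.1, 0.9, size=n_snps)
p2 = np.clip(p1 + rng.normal(0.0, 0.1, size=n_snps), 0.05, 0.95)
geno = np.vstack([
    rng.binomial(2, p1, size=(n_per_pop, n_snps)),
    rng.binomial(2, p2, size=(n_per_pop, n_snps)),
])
labels = np.repeat([0, 1], n_per_pop)

pcs = top_pcs(geno)
# With two equally sized groups, the sign of PC1 should split the populations.
split = pcs[:, 0] > 0
accuracy = max(np.mean(split == labels), np.mean(split != labels))
```

With two populations a single axis of variation is enough to separate the groups, which is consistent with the abstract's point that the axes of variation are the eigenvectors needed to differentiate the populations.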