Analyzing Multiple-Probe Microarray: Estimation and Application of Gene Expression Indexes
Gene expression index estimation is an essential step in analyzing multiple-probe microarray data. Various modeling methods have been proposed in this area. Among them, a popular method proposed in Li and Wong (2001) is based on a multiplicative model, which is similar to the additive model discussed in Irizarry et al. (2003a) at the logarithm scale. Along this line, Hu et al. (2006) proposed data transformation to improve expression index estimation based on an ad hoc entropy criterion and a naive grid search approach. In this work, we re-examine this problem using a new profile likelihood-based transformation estimation approach that is more statistically elegant and computationally efficient. We demonstrate the applicability of the proposed method using a benchmark Affymetrix U95A spiked-in experiment. Moreover, we introduce a new multivariate expression index and use the empirical study to show its promise in terms of improving model fitting and the power of detecting differential expression over the commonly used univariate expression index. As the other important content of the work, we discuss two commonly encountered practical issues in the application of gene expression indexes: normalization and the summary statistic used for detecting differential expression. Our empirical study shows somewhat different findings from the MAQC project (MAQC, 2006).
Outliers in dynamic factor models
Dynamic factor models have a wide range of applications in econometrics and
applied economics. The basic motivation resides in their capability of reducing
a large set of time series to only a few indicators (factors). If the number of
time series is large compared to the available number of observations then most
information may be conveyed to the factors. This way low dimension models may
be estimated for explaining and forecasting one or more time series of
interest. It is desirable that outlier-free time series be available for
estimation. In practice, outlying observations are likely to arise at unknown
dates due, for instance, to external unusual events or gross data entry errors.
Several methods for outlier detection in time series are available. Most
methods, however, apply to univariate time series while even methods designed
for handling the multivariate framework do not include dynamic factor models
explicitly. A method for discovering outlier occurrences in a dynamic factor
model is introduced that is based on linear transforms of the observed data.
Some strategies to separate outliers that add to the model and outliers within
the common component are discussed. Applications to simulated and real data
sets are presented to check the effectiveness of the proposed method.
Comment: Published at http://dx.doi.org/10.1214/07-EJS082 in the Electronic
Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Viewpoints: A high-performance high-dimensional exploratory data analysis tool
Scientific data sets continue to increase in both size and complexity. In the
past, dedicated graphics systems at supercomputing centers were required to
visualize large data sets, but as the price of commodity graphics hardware has
dropped and its capability has increased, it is now possible, in principle, to
view large complex data sets on a single workstation. To do this in practice,
an investigator will need software that is written to take advantage of the
relevant graphics hardware. The Viewpoints visualization package described
herein is an example of such software. Viewpoints is an interactive tool for
exploratory visual analysis of large, high-dimensional (multivariate) data. It
leverages the capabilities of modern graphics boards (GPUs) to run on a single
workstation or laptop. Viewpoints is minimalist: it attempts to do a small set
of useful things very well (or at least very quickly) in comparison with
similar packages today. Its basic feature set includes linked scatter plots
with brushing, dynamic histograms, normalization and outlier detection/removal.
Viewpoints was originally designed for astrophysicists, but it has since been
used in a variety of fields that range from astronomy, quantum chemistry, fluid
dynamics, machine learning, bioinformatics, and finance to information
technology server log mining. In this article, we describe the Viewpoints
package and show examples of its usage.
Comment: 18 pages, 3 figures, PASP in press, this version corresponds more
closely to that to be publishe
GLRT-based threshold detection-estimation performance improvement and application to uniform circular antenna arrays
©2006 IEEE. The problem of estimating the number of independent Gaussian sources and their parameters impinging upon an antenna array is addressed for scenarios that are problematic for standard techniques, namely, under "threshold conditions" (where subspace techniques such as MUSIC experience an abrupt and dramatic performance breakdown). We propose an antenna geometry-invariant method that adopts the generalized-likelihood-ratio test (GLRT) methodology, supported by a maximum-likelihood-ratio lower-bound analysis that allows erroneous solutions ("outliers") to be found and rectified. Detection-estimation performance in both uniform circular and linear antenna arrays is shown to be significantly improved compared with conventional techniques but limited by the performance-breakdown phenomenon that is intrinsic to all such maximum-likelihood (ML) techniques.
Yuri I. Abramovich, Nicholas K. Spencer, and Alexei Y. Gorokho
Spectral Mapping Reconstruction of Extended Sources
Three-dimensional spectroscopy of extended sources is typically performed
with dedicated integral field spectrographs. We describe a method of
reconstructing full spectral cubes, with two spatial and one spectral
dimension, from rastered spectral mapping observations employing a single slit
in a traditional slit spectrograph. When the background and image
characteristics are stable, as is often achieved in space, the use of
traditional long slits for integral field spectroscopy can substantially reduce
instrument complexity over dedicated integral field designs, without loss of
mapping efficiency -- particularly compelling when a long slit mode for single
unresolved source followup is separately required. We detail a custom
flux-conserving cube reconstruction algorithm, discuss issues of extended
source flux calibration, and describe CUBISM, a tool which implements these
methods for spectral maps obtained with the Spitzer Space Telescope's Infrared
Spectrograph.
Comment: 11 pages, 8 figures, accepted by PAS
Latest results of the Tunka Radio Extension (ISVHECRI2016)
The Tunka Radio Extension (Tunka-Rex) is an antenna array consisting of 63
antennas at the location of the TAIGA facility (Tunka Advanced Instrument for
cosmic ray physics and Gamma Astronomy) in Eastern Siberia, near Lake Baikal.
Tunka-Rex is triggered by the air-Cherenkov array Tunka-133 during clear and
moonless winter nights and by the scintillator array Tunka-Grande during the
remaining time. Tunka-Rex measures the radio emission from the same air-showers
as Tunka-133 and Tunka-Grande, but with a higher threshold of about 100 PeV.
During the first stages of its operation, Tunka-Rex has proven that sparse
radio arrays can measure air-showers with an energy resolution of better than
15% and the depth of the shower maximum with a resolution of better than 40
g/cm². To improve and interpret our measurements as well as
to study systematic uncertainties due to interaction models, we perform radio
simulations with CORSIKA and CoREAS. In this overview we present the setup of
Tunka-Rex, discuss the achieved results and the prospects of mass-composition
studies with radio arrays.
Comment: proceedings of ISVHECRI2016 conferenc
Multivariate classification of gene expression microarray data
Gene expression data obtained from microarray analysis are often used to classify cells. In this thesis, a probabilistic version of the Discriminant Partial Least Squares method (p-DPLS) is used to classify samples from their gene expression. p-DPLS is based on Bayes' rule for the posterior probability. Such classifiers are forced to always classify. To overcome this limitation, a reject option has been implemented. This option allows samples with a high risk of misclassification (i.e., ambiguous samples and outliers) to be rejected. The reject option combines criteria based on the x-residuals, the leverage, and the predicted values. In addition, a variable selection method is developed to choose the most relevant genes, since most of the genes analyzed with a microarray are irrelevant to the particular classification purpose and can confuse the classifier. Finally, DPLS is extended to multi-class classification by combining PLS with linear discriminant analysis.