168 research outputs found

    Unsupervised feature selection for noisy data

    Feature selection techniques are widely applied in data analysis tasks to reduce dimensionality. According to the type of learning, feature selection algorithms are categorized as supervised or unsupervised. In unsupervised learning scenarios, selecting features is a much harder problem because there are no class labels to guide the search for relevant features. The difficulty is amplified when the data are corrupted by noise. Almost all traditional unsupervised feature selection methods are not robust against noise in the samples. These approaches have no explicit mechanism for separating and isolating the noise, so they cannot produce an optimal feature subset. In this article, we propose an unsupervised approach for feature selection on noisy data, called Robust Independent Feature Selection (RIFS). Specifically, we choose the feature subset that contains most of the underlying information, using the same criteria as independent component analysis (ICA). Simultaneously, the noise is separated as an independent component. The isolation of representative noise samples is achieved using oblique factor rotation, whereas noise identification is performed using factor pattern loadings. Extensive experimental results over diverse real-life data sets have shown the efficiency and advantage of the proposed algorithm. We thankfully acknowledge the support of the Comisión Interministerial de Ciencia y Tecnología (CICYT) under contract No. TIN2015-65316-P, which has partially funded this work. Peer reviewed. Postprint (author's final draft).

    The projection score - an evaluation criterion for variable subset selection in PCA visualization

    Background: In many scientific domains, it is becoming increasingly common to collect high-dimensional data sets, often with an exploratory aim, to generate new and relevant hypotheses. The exploratory perspective often makes statistically guided visualization methods, such as Principal Component Analysis (PCA), the methods of choice. However, the clarity of the obtained visualizations, and thereby the potential to use them to formulate relevant hypotheses, may be confounded by the presence of the many non-informative variables. For microarray data, more easily interpretable visualizations are often obtained by filtering the variable set, for example by removing the variables with the smallest variances or by only including the variables most highly related to a specific response. The resulting visualization may depend heavily on the inclusion criterion, that is, effectively the number of retained variables. To our knowledge, there exists no objective method for determining the optimal inclusion criterion in the context of visualization. Results: We present the projection score, which is a straightforward, intuitively appealing measure of the informativeness of a variable subset with respect to PCA visualization. This measure can be universally applied to find suitable inclusion criteria for any type of variable filtering. We apply the presented measure to find optimal variable subsets for different filtering methods in both microarray data sets and synthetic data sets. We note also that the projection score can be applied in general contexts, to compare the informativeness of any variable subsets with respect to visualization by PCA. Conclusions: We conclude that the projection score provides an easily interpretable and universally applicable measure of the informativeness of a variable subset with respect to visualization by PCA, that can be used to systematically find the most interpretable PCA visualization in practical exploratory analysis.
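
    The variance-filtering step mentioned in this abstract can be illustrated with a minimal sketch. It is not the projection score itself, whose definition the abstract does not give. It only shows how the fraction of variance captured by the first two principal components changes with the number of retained variables. The function names, the synthetic data, and the choice of two components are assumptions made for illustration.

```python
import numpy as np

def variance_filter(X, k):
    """Keep the k columns (variables) of X with the largest sample variance."""
    variances = X.var(axis=0, ddof=1)
    keep = np.sort(np.argsort(variances)[::-1][:k])  # indices of retained columns
    return X[:, keep], keep

def pca_explained_variance(X, n_components=2):
    """Fraction of total variance captured by the first n_components PCs."""
    Xc = X - X.mean(axis=0)                    # center each variable
    s = np.linalg.svd(Xc, compute_uv=False)    # singular values, descending
    var = s ** 2                               # proportional to PC variances
    return var[:n_components].sum() / var.sum()

# Hypothetical synthetic data: 5 variables driven by one latent factor,
# plus 95 pure-noise variables with smaller variance.
rng = np.random.default_rng(0)
n_samples = 40
latent = rng.normal(size=(n_samples, 1))
informative = 3.0 * latent + rng.normal(scale=0.5, size=(n_samples, 5))
noise = rng.normal(scale=1.0, size=(n_samples, 95))
X = np.hstack([informative, noise])

# Compare several inclusion criteria (number of retained variables).
for k in (5, 20, 100):
    X_k, kept = variance_filter(X, k)
    print(k, round(pca_explained_variance(X_k), 3))
```

    The printed fractions depend on the inclusion criterion, which is the sensitivity the abstract describes. Raw explained variance is not comparable across subsets of different size, which is presumably part of what an objective measure has to correct for.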

    An affinity matured minibody for PET imaging of prostate stem cell antigen (PSCA)-expressing tumors

    Purpose: Prostate stem cell antigen (PSCA), a cell surface glycoprotein expressed in normal human prostate and bladder, is over-expressed in the majority of localized prostate cancer and most bone metastases. We have previously shown that the hu1G8 minibody, a humanized anti-PSCA antibody fragment (single-chain Fv-CH3 dimer, 80 kDa), can localize specifically and image PSCA-expressing xenografts at 21 h post-injection. However, the humanization and antibody fragment reformatting decreased its apparent affinity. Here, we sought to evaluate PET imaging contrast with affinity matured minibodies. Methods: Yeast scFv display, involving four rounds of selection, was used to generate the three affinity matured antibody fragments (A2, A11, and C5) that were reformatted into minibodies. These three affinity matured anti-PSCA minibodies were characterized in vitro, and following radiolabeling with ¹²⁴I were evaluated in vivo for microPET imaging of PSCA-expressing tumors. Results: The A2, A11, and C5 minibody variants all demonstrated improved affinity compared to the parental (P) minibody and were ranked as follows: A2 > A11 > C5 > P. The ¹²⁴I-labeled A11 minibody demonstrated higher immunoreactivity than the parental minibody and also achieved the best microPET imaging contrast in two xenograft models, LAPC-9 (prostate cancer) and Capan-1 (pancreatic cancer), when evaluated in vivo. Conclusion: Of the affinity variant minibodies tested, the A11 minibody that ranked second in affinity was selected as the best immunoPET tracer to image PSCA-expressing xenografts. This candidate is currently under development for evaluation in a pilot clinical imaging study.

    Ocean and land forcing of the record-breaking Dust Bowl heat waves across central United States

    The severe drought of the 1930s Dust Bowl decade coincided with record-breaking summer heatwaves that contributed to the socioeconomic and ecological disaster over North America's Great Plains. It remains unresolved to what extent these exceptional heatwaves, hotter than in historically forced coupled climate model simulations, were forced by sea surface temperatures (SSTs) and exacerbated through human-induced deterioration of land cover. Here we show, using an atmosphere-only model, that anomalously warm North Atlantic SSTs enhance heatwave activity through an association with drier spring conditions resulting from weaker moisture transport. Model devegetation simulations, which represent the widespread exposure of bare soil in the 1930s, suggest human activity fueled stronger and more frequent heatwaves through greater evaporative drying in the warmer months. This study highlights the potential for the amplification of naturally occurring extreme events like droughts by vegetation feedbacks to create more extreme heatwaves in a warmer world.

    Deformation analysis of a metropolis from C- to X-band PSI: proof-of-concept with Cosmo-Skymed over Rome, Italy

    Stability of monuments and subsidence of residential quarters in Rome (Italy) are depicted based on geospatial analysis of more than 310,000 Persistent Scatterers (PS) obtained from Stanford Method for Persistent Scatterers (StaMPS) processing of 32 COSMO-SkyMed 3 m resolution HH StripMap ascending mode scenes acquired between 21 March 2011 and 10 June 2013. COSMO-SkyMed PS densities and associated displacement velocities are compared with almost 20 years of historical C-band ERS-1/2, ENVISAT and RADARSAT-1/2 imagery. Accounting for differences in image processing algorithms and satellite acquisition geometries, we assess the feasibility of ground motion monitoring in big cities and metropolitan areas by coupling newly acquired and legacy SAR in full time series. Limitations and operational benefits of the transition from medium resolution C-band to high resolution X-band PS data are discussed, alongside the potential impact on the management of expanding urban environments.