    Determining Principal Component Cardinality through the Principle of Minimum Description Length

    PCA (Principal Component Analysis) and its variants are ubiquitous techniques for matrix dimension reduction and reduced-dimension latent-factor extraction. One significant challenge in using PCA is the choice of the number of principal components. The information-theoretic MDL (Minimum Description Length) principle gives objective compression-based criteria for model selection, but it is difficult to analytically apply its modern definition - NML (Normalized Maximum Likelihood) - to the problem of PCA. This work shows a general reduction of NML problems to lower-dimension problems. Applying this reduction, it bounds the NML of PCA in terms of the NML of linear regression, which are known. Comment: LOD 201

    Representing complex data using localized principal components with application to astronomical data

    Often the relation between the variables constituting a multivariate data space might be characterized by one or more of the terms: ``nonlinear'', ``branched'', ``disconnected'', ``bended'', ``curved'', ``heterogeneous'', or, more generally, ``complex''. In these cases, simple principal component analysis (PCA) as a tool for dimension reduction can fail badly. Of the many alternative approaches proposed so far, local approximations of PCA are among the most promising. This paper gives a short review of localized versions of PCA, focusing on local principal curves and local partitioning algorithms. Furthermore, we discuss projections other than the local principal components. When performing local dimension reduction for regression or classification problems, it is important to focus not only on the manifold structure of the covariates, but also on the response variable(s). Local principal components only achieve the former, whereas localized regression approaches concentrate on the latter. Local projection directions derived from the partial least squares (PLS) algorithm offer an interesting trade-off between these two objectives. We apply these methods to several real data sets. In particular, we consider simulated astrophysical data from the future Galactic survey mission Gaia. Comment: 25 pages. In "Principal Manifolds for Data Visualization and Dimension Reduction", A. Gorban, B. Kegl, D. Wunsch, and A. Zinovyev (eds), Lecture Notes in Computational Science and Engineering, Springer, 2007, pp. 180--204, http://www.springer.com/dal/home/generic/search/results?SGWID=1-40109-22-173750210-

    Platinum-group element geochemistry of the Paraná flood basalts – modelling metallogenesis in rifting continental plume environments

    This is the author accepted manuscript. The final version is available on open access from Elsevier via the DOI in this record. The 135 Ma Paraná-Etendeka Large Igneous Province (PELIP) is one of the largest areas of continental flood basalt (CFB) volcanism in the world and is widely agreed to be a product of intracontinental melts related to thermal anomalies from the Tristan mantle plume. The province rifted during the break-up of Gondwana, as the plume transitioned into an oceanic geodynamic environment. This study reports analyses of plume-derived basalts from the Brazilian side of the PELIP (the Serra Geral Group) to investigate major, trace and platinum-group element (PGE) abundances in an evolving plume-rift metallogenic setting, with the aim of contextualising metallogenic controls alongside existing magmatic interpretations of the region. The chalcophile geochemistry of these basalts defines three distinct metallogenic groupings that fit with three modern multi-element magma classifications for Serra Geral lavas. In this scheme, Type 4 lavas have a distinctive PGE-poor signature, Type 1 (Central-Northern) lavas are enriched in Pd, Au and Cu, and Type 1 (Southern) lavas are enriched in Ru and Rh. Our trace element melt modelling indicates that the compositional variations result from changes in the melting regime between the garnet and spinel stability fields, in response to the thinning and ‘unlidding’ of the rifting continent above. This process imposes progressively shallower melting depths and higher degrees of partial melting. Accordingly, Type 4 magmas formed from small degree melts, reducing the likelihood of sulfide exhaustion/chalcophile acquisition at source. Type 1 (Central-Northern) magmas incorporated components of the sub-continental lithospheric mantle (SCLM)-derived in higher-degree partial melts; the SCLM was heterogeneously enriched via metasomatism prior to plume melting, and this produced enrichment in volatile metals (Pd, Cu, and Au) in these magmas. In contrast, the Ru-Rh enrichment in Type 1 (Southern) lavas is attributed to increased spinel-group mineral and sulphide incorporation from the mantle into higher degree partial melts close to the continental rift zone. Our models confirm the importance of contributions from SCLM melts in precious metal mineral systems within CFB provinces, and reinforce the role of heterogeneous metasomatic enrichment underneath cratons in boosting intracontinental prospectivity with respect to ore deposits. University of Exete

    A spatial approach for the epidemiology of antibiotic use and resistance in community-based studies: the emergence of urban clusters of Escherichia coli quinolone resistance in Sao Paulo, Brasil

    Copyright © Kiffer et al; licensee BioMed Central Ltd. 2011. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Background: Population antimicrobial use may influence resistance emergence. Resistance is an ecological phenomenon due to potential transmissibility. We investigated spatial and temporal patterns of ciprofloxacin (CIP) population consumption related to E. coli resistance emergence and dissemination in a major Brazilian city. A total of 4,372 urinary tract infection E. coli cases, with 723 CIP resistant, were identified in 2002 from two outpatient centres. Cases were address geocoded in a digital map. Raw CIP consumption data was transformed into usage density in DDDs by CIP selling points influence zones determination. A stochastic model coupled with a Geographical Information System was applied for relating resistance and usage density and for detecting city areas of high/low resistance risk. Results: E. coli CIP resistant cluster emergence was detected and significantly related to usage density at a level of 5 to 9 CIP DDDs. There were clustered hot-spots and a significant global spatial variation in the residual resistance risk after allowing for usage density. Conclusions: There were clustered hot-spots and a significant global spatial variation in the residual resistance risk after allowing for usage density. The usage density of 5-9 CIP DDDs per 1,000 inhabitants within the same influence zone was the resistance triggering level. This level led to E. coli resistance clustering, proving that individual resistance emergence and dissemination was affected by antimicrobial population consumption

    The Endogenous Th17 Response in NO2-Promoted Allergic Airway Disease Is Dispensable for Airway Hyperresponsiveness and Distinct from Th17 Adoptive Transfer

    Severe, glucocorticoid-resistant asthma comprises 5-7% of patients with asthma. IL-17 is a biomarker of severe asthma, and the adoptive transfer of Th17 cells in mice is sufficient to induce glucocorticoid-resistant allergic airway disease. Nitrogen dioxide (NO2) is an environmental toxin that correlates with asthma severity, exacerbation, and risk of adverse outcomes. Mice that are allergically sensitized to the antigen ovalbumin by exposure to NO2 exhibit a mixed Th2/Th17 adaptive immune response and eosinophil and neutrophil recruitment to the airway following antigen challenge, a phenotype reminiscent of severe clinical asthma. Because IL-1 receptor (IL-1R) signaling is critical in the generation of the Th17 response in vivo, we hypothesized that the IL-1R/Th17 axis contributes to pulmonary inflammation and airway hyperresponsiveness (AHR) in NO2-promoted allergic airway disease and manifests in glucocorticoid-resistant cytokine production. IL-17A neutralization at the time of antigen challenge or genetic deficiency in IL-1R resulted in decreased neutrophil recruitment to the airway following antigen challenge but did not protect against the development of AHR. Instead, IL-1R-/- mice developed exacerbated AHR compared to WT mice. Lung cells from NO2-allergically inflamed mice that were treated in vitro with dexamethasone (Dex) during antigen restimulation exhibited reduced Th17 cytokine production, whereas Th17 cytokine production by lung cells from recipient mice of in vitro Th17-polarized OTII T-cells was resistant to Dex. These results demonstrate that the IL-1R/Th17 axis does not contribute to AHR development in NO2-promoted allergic airway disease, that Th17 adoptive transfer does not necessarily reflect an endogenously-generated Th17 response, and that functions of Th17 responses are contingent on the experimental conditions in which they are generated. © 2013 Martin et al

    Genetic Determinants of Circulating Sphingolipid Concentrations in European Populations

    Sphingolipids have essential roles as structural components of cell membranes and in cell signalling, and disruption of their metabolism causes several diseases, with diverse neurological, psychiatric, and metabolic consequences. Increasingly, variants within a few of the genes that encode enzymes involved in sphingolipid metabolism are being associated with complex disease phenotypes. Direct experimental evidence supports a role of specific sphingolipid species in several common complex chronic disease processes including atherosclerotic plaque formation, myocardial infarction (MI), cardiomyopathy, pancreatic beta-cell failure, insulin resistance, and type 2 diabetes mellitus. Therefore, sphingolipids represent novel and important intermediate phenotypes for genetic analysis, yet little is known about the major genetic variants that influence their circulating levels in the general population. We performed a genome-wide association study (GWAS) between 318,237 single-nucleotide polymorphisms (SNPs) and levels of circulating sphingomyelin (SM), dihydrosphingomyelin (Dih-SM), ceramide (Cer), and glucosylceramide (GluCer) single lipid species (33 traits); and 43 matched metabolite ratios measured in 4,400 subjects from five diverse European populations. Associated variants (32) in five genomic regions were identified with genome-wide significant corrected p-values ranging down to 9.08 × 10^-66. The strongest associations were observed in or near 7 genes functionally involved in ceramide biosynthesis and trafficking: SPTLC3, LASS4, SGPP1, ATP10D, and FADS1-3. Variants in 3 loci (ATP10D, FADS3, and SPTLC3) associate with MI in a series of three German MI studies. An additional 70 variants across 23 candidate genes involved in sphingolipid-metabolizing pathways also demonstrate association (p = 10^-4 or less). Circulating concentrations of several key components in sphingolipid metabolism are thus under strong genetic control, and variants in these loci can be tested for a role in the development of common cardiovascular, metabolic, neurological, and psychiatric diseases

    A Visual Data Mining Tool that Facilitates Reconstruction of Transcription Regulatory Networks

    Background: Although the use of microarray technology has seen exponential growth, analysis of microarray data remains a challenge to many investigators. One difficulty lies in the interpretation of a list of differentially expressed genes, or in how to plan new experiments given that knowledge. Clustering methods can be used to identify groups of genes with similar expression patterns, and genes with unknown function can be provisionally annotated based on the concept of “guilt by association”, where function is tentatively inferred from the known functions of genes with similar expression patterns. These methods frequently suffer from two limitations: (1) visualization usually only gives access to group membership, rather than specific information about nearest neighbors, and (2) the resolution or quality of the relationships is not easily inferred. Methodology/Principal Findings: We have addressed these issues by improving the precision of similarity detection over that of a single experiment and by creating a tool to visualize tractable association networks: we (1) performed meta-analysis computation of correlation coefficients for all gene pairs in a heterogeneous data set collected from 2,145 publicly available microarray samples in mouse, (2) filtered the resulting distribution of over 130 million correlation coefficients to build new, more tractable distributions from the strongest correlations, and (3) designed and implemented a new Web-based tool (StarNet

    Early detection of breast cancer based on gene-expression patterns in peripheral blood cells

    INTRODUCTION: Existing methods to detect breast cancer in asymptomatic patients have limitations, and there is a need to develop more accurate and convenient methods. In this study, we investigated whether early detection of breast cancer is possible by analyzing gene-expression patterns in peripheral blood cells. METHODS: Using macroarrays and nearest-shrunken-centroid method, we analyzed the expression pattern of 1,368 genes in peripheral blood cells of 24 women with breast cancer and 32 women with no signs of this disease. The results were validated using a standard leave-one-out cross-validation approach. RESULTS: We identified a set of 37 genes that correctly predicted the diagnostic class in at least 82% of the samples. The majority of these genes had a decreased expression in samples from breast cancer patients, and predominantly encoded proteins implicated in ribosome production and translation control. In contrast, the expression of some defense-related genes was increased in samples from breast cancer patients. CONCLUSION: The results show that a blood-based gene-expression test can be developed to detect breast cancer early in asymptomatic patients. Additional studies with a large sample size, from women both with and without the disease, are warranted to confirm or refute this finding