267 research outputs found

    Representing complex data using localized principal components with application to astronomical data

    Often the relation between the variables constituting a multivariate data space might be characterized by one or more of the terms "nonlinear", "branched", "disconnected", "bended", "curved", "heterogeneous", or, more generally, "complex". In these cases, simple principal component analysis (PCA) as a tool for dimension reduction can fail badly. Of the many alternative approaches proposed so far, local approximations of PCA are among the most promising. This paper gives a short review of localized versions of PCA, focusing on local principal curves and local partitioning algorithms. Furthermore, we discuss projections other than the local principal components. When performing local dimension reduction for regression or classification problems, it is important to focus not only on the manifold structure of the covariates but also on the response variable(s). Local principal components only achieve the former, whereas localized regression approaches concentrate on the latter. Local projection directions derived from the partial least squares (PLS) algorithm offer an interesting trade-off between these two objectives. We apply these methods to several real data sets. In particular, we consider simulated astrophysical data from the future Galactic survey mission Gaia. Comment: 25 pages. In "Principal Manifolds for Data Visualization and Dimension Reduction", A. Gorban, B. Kegl, D. Wunsch, and A. Zinovyev (eds), Lecture Notes in Computational Science and Engineering, Springer, 2007, pp. 180–204, http://www.springer.com/dal/home/generic/search/results?SGWID=1-40109-22-173750210-
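
The idea of localized PCA by partitioning can be illustrated with a minimal sketch. It assumes the partition labels are already given (here they are assigned by hand on a V-shaped toy data set); the function names `local_pca` and `reconstruct` are hypothetical and are not taken from the paper.

```python
import numpy as np

def local_pca(X, labels, n_components=1):
    """Fit a separate PCA inside each partition of the data."""
    models = {}
    for k in np.unique(labels):
        Xk = X[labels == k]
        mean = Xk.mean(axis=0)
        # Rows of Vt are the principal directions of the centred partition.
        _, _, Vt = np.linalg.svd(Xk - mean, full_matrices=False)
        models[int(k)] = (mean, Vt[:n_components])
    return models

def reconstruct(X, labels, models):
    """Project each point onto the local components of its own partition."""
    X_hat = np.empty_like(X, dtype=float)
    for k, (mean, W) in models.items():
        mask = labels == k
        X_hat[mask] = (X[mask] - mean) @ W.T @ W + mean
    return X_hat

# Toy "bended" data: a V shape that a single straight line cannot follow.
t = np.linspace(-1.0, 1.0, 201)
X = np.column_stack([t, np.abs(t)])

# Assumed partition: one group per branch of the V.
local_labels = (t > 0).astype(int)
global_labels = np.zeros(len(t), dtype=int)

local_err = np.mean(np.sum(
    (X - reconstruct(X, local_labels, local_pca(X, local_labels))) ** 2, axis=1))
global_err = np.mean(np.sum(
    (X - reconstruct(X, global_labels, local_pca(X, global_labels))) ** 2, axis=1))
```

With one component per branch the local reconstruction error is essentially zero, while a single global component leaves a clearly nonzero error. In practice the partition would come from a clustering step rather than from known labels.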

    MaxMin Linear Initialization for Fuzzy C-Means

    Clustering is an extensive research area in data science. The aim of clustering is to discover groups and to identify interesting patterns in datasets. Crisp (hard) clustering assumes that each data point belongs to one and only one cluster. This is inadequate when some data points may belong to several clusters, as is the case in text categorization, so more flexible clustering is needed. Fuzzy clustering methods, where each data point can belong to several clusters, are an interesting alternative. Yet seeding iterative fuzzy algorithms to achieve high-quality clustering is an issue. In this paper, we propose a new linear and efficient initialization algorithm, MaxMin Linear, to deal with this problem. We then validate our theoretical results through extensive experiments on a variety of numerical real-world and artificial datasets. We also test several validity indices, including a new validity index that we propose, Transformed Standardized Fuzzy Difference (TSFD).
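
To make the role of seeding concrete, here is a minimal sketch of the standard fuzzy c-means iteration, which starts from a set of initial centers. It is not the MaxMin Linear algorithm from the paper; the initial centers, the fuzzifier `m = 2` and the toy data are assumptions chosen for illustration.

```python
import numpy as np

def fcm_memberships(X, centers, m=2.0, eps=1e-12):
    """Standard FCM membership update: u_ik is proportional to d_ik^(-2/(m-1))."""
    # d[i, k] = Euclidean distance from point i to center k
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    d = np.maximum(d, eps)  # avoid division by zero when a point sits on a center
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def fcm_centers(X, U, m=2.0):
    """Centers are means of the points weighted by memberships raised to m."""
    W = U ** m
    return (W.T @ X) / W.sum(axis=0)[:, None]

def fuzzy_c_means(X, centers, m=2.0, n_iter=50):
    """Alternate membership and center updates from the given seeds."""
    for _ in range(n_iter):
        U = fcm_memberships(X, centers, m)
        centers = fcm_centers(X, U, m)
    return centers, fcm_memberships(X, centers, m)

# Two tight groups plus one point halfway between them.
X = np.array([[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [5.2, 5.0], [2.6, 2.5]])
seeds = np.array([[0.0, 0.0], [5.2, 5.0]])  # assumed initial centers
centers, U = fuzzy_c_means(X, seeds)
```

Each row of `U` sums to one. Points inside a group get a membership close to one for that group, while the halfway point is shared roughly equally between the two clusters, which is the behaviour crisp clustering cannot express.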

    Software platform virtualization in chemistry research and university teaching

    Background: Modern chemistry laboratories operate with a wide range of software applications under different operating systems, such as Windows, Linux or Mac OS X. Instead of installing software on different computers, it is possible to install those applications on a single computer using virtual machine software. Software platform virtualization allows a single host operating system to run multiple other operating systems on the same computer. We apply and discuss the use of virtual machines in chemistry research and teaching laboratories.
    Results: Virtual machines are commonly used for cheminformatics software development and testing. By benchmarking multiple chemistry software packages, we have confirmed that the computational speed penalty for using virtual machines is low, around 5% to 10%. Software virtualization in a teaching environment allows faster deployment and easy use of commercial and open source software in hands-on computer teaching labs.
    Conclusion: Software virtualization in chemistry, mass spectrometry and cheminformatics is needed for software testing and for the development of software for different operating systems. To obtain maximum performance, the virtualization software should be multi-core enabled and allow the use of multiprocessor configurations in the virtual machine environment. Server consolidation, by running multiple tasks and operating systems on a single physical machine, can lead to lower maintenance and hardware costs, especially in small research labs. The use of virtual machines can prevent software virus infections and security breaches when used as a sandbox system for internet access and software testing. Complex software setups can be created with virtual machines and are easily deployed later to multiple computers for hands-on teaching classes. We discuss the popularity of bioinformatics compared to cheminformatics as well as the missing cheminformatics education at universities worldwide.

    Innate immunity glycoprotein gp-340 variants may modulate human susceptibility to dental caries

    Background: Bacterial adhesion is an important determinant of colonization and infection, including dental caries. The salivary scavenger receptor cysteine-rich glycoprotein gp-340, which mediates adhesion of Streptococcus mutans (implicated in caries), harbours three major size variants, designated gp-340 I to III, each specific to an individual saliva. Here we have examined the association of the gp-340 I to III polymorphisms with caries experience and adhesion of S. mutans.
    Methods: A case-referent study was performed in 12-year-old Swedish children with high (n = 19) or low (n = 19) caries experiences. We measured the gp-340 I to III saliva phenotypes and correlated those with multiple outcome measures for caries experience and saliva adhesion of S. mutans using the partial least squares (PLS) multivariate projection technique. In addition, we used traditional statistics and 2-year caries increment to verify the established PLS associations, and bacterial adhesion to purified gp-340 I to III proteins to support possible mechanisms.
    Results: All except one subject were typed as gp-340 I to III (10, 23 and 4, respectively). The gp-340 I phenotype correlated positively with caries experience (VIP = 1.37) and saliva adhesion of S. mutans Ingbritt (VIP = 1.47). The gp-340 II and III phenotypes tended to behave in the opposite way. Moreover, the gp-340 I phenotype tended to show an increased 2-year caries increment compared to phenotypes II/III. Purified gp-340 I protein mediated markedly higher adhesion of S. mutans strains Ingbritt and NG8 and Lactococcus lactis expressing AgI/II adhesins (SpaP or PAc) compared to gp-340 II and III proteins. In addition, the gp-340 I protein appeared overrepresented in subjects positive for Db, an allelic acidic PRP variant associated with caries, and subjects positive for both gp-340 I and Db tended to experience more caries than those negative for both proteins.
    Conclusion: Gp-340 I behaves as a caries susceptibility protein.

    Systemic Maternal Inflammation and Neonatal Hyperoxia Induces Remodeling and Left Ventricular Dysfunction in Mice

    The impact of the neonatal environment on the development of adult cardiovascular disease is poorly understood. Systemic maternal inflammation is linked to growth retardation, preterm birth, and maturation deficits in the developing fetus. Preterm or small-for-gestational-age infants often require medical interventions such as oxygen therapy. The long-term pathological consequences of medical interventions on an immature physiology remain unknown. In the present study, we hypothesized that systemic maternal inflammation and neonatal hyperoxia exposure compromise cardiac structure, resulting in LV dysfunction during adulthood. Pregnant C3H/HeN mice were injected on embryonic day 16 (E16) with LPS (80 µg/kg; i.p.) or saline. Offspring were placed in room air (RA) or 85% O2 for 14 days and subsequently maintained in RA. Echocardiography, cardiomyocyte contractility, and molecular analyses were performed. Echocardiography revealed persistently lower left ventricular fractional shortening with greater left ventricular end systolic diameter at 8 weeks in LPS/O2 than in saline/RA mice. Isolated cardiomyocytes from LPS/O2 mice had slower rates of contraction and relaxation, and a slower return to baseline length, than cardiomyocytes isolated from saline/RA controls. The α-/β-MHC ratio was increased and Connexin-43 levels were decreased in LPS/O2 mice at 8 weeks. Nox4 was reduced between days 3 and 14, and capillary density was lower at 8 weeks of life in LPS/O2 mice. These results demonstrate that systemic maternal inflammation combined with neonatal hyperoxia exposure induces alterations in cardiac structure and function leading to cardiac failure in adulthood, and they support the importance of the intrauterine and neonatal milieu for adult health.

    Systems-Level Modeling of Cancer-Fibroblast Interaction

    Cancer cells interact with surrounding stromal fibroblasts during tumorigenesis, but the complex molecular rules that govern these interactions remain poorly understood, which hinders the development of therapeutic strategies to target cancer stroma. We have taken a mathematical approach to begin defining these rules by performing the first large-scale quantitative analysis of fibroblast effects on cancer cell proliferation across more than four hundred heterotypic cell line pairings. Systems-level modeling of this complex dataset using singular value decomposition revealed that normal tissue fibroblasts variably express at least two functionally distinct activities, one of which reflects transcriptional programs associated with activated mesenchymal cells, that act either coordinately or at cross-purposes to modulate cancer cell proliferation. These findings suggest that quantitative approaches may prove useful for identifying organizational principles that govern complex heterotypic cell-cell interactions in cancer and other contexts.
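
How a singular value decomposition can expose a small number of underlying activities in a pairings matrix can be sketched as follows. The matrix here is synthetic and exactly rank two by construction; the sizes, the random values and all variable names are assumptions for illustration and do not reproduce the study's data.

```python
import numpy as np

rng = np.random.default_rng(0)
n_fibroblasts, n_cancer_lines = 20, 25

# Hypothetical model: each fibroblast line expresses two activities, and each
# cancer line responds to each activity with its own sensitivity.
activities = rng.normal(size=(n_fibroblasts, 2))
sensitivities = rng.normal(size=(2, n_cancer_lines))
effects = activities @ sensitivities  # rows: fibroblasts, columns: cancer lines

# SVD of the pairings matrix; singular values come back in descending order.
U, s, Vt = np.linalg.svd(effects, full_matrices=False)

# Fraction of the total squared effect captured by each component.
explained = s ** 2 / np.sum(s ** 2)

# Rank-2 approximation built from the two leading components.
rank2 = (U[:, :2] * s[:2]) @ Vt[:2]
```

Because the synthetic matrix was generated from two activities, the first two components capture essentially all of the signal and `rank2` reproduces `effects`. With measured data the remaining singular values would reflect noise and any further structure.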

    Observation of Quantum Interference in Molecular Charge Transport

    As the dimensions of a conductor approach the nano-scale, quantum effects will begin to dominate its behavior. This entails the exciting possibility of controlling the conductance of a device by direct manipulation of the electron wave function. Such control has been most clearly demonstrated in mesoscopic semiconductor structures at low temperatures. Indeed, the Aharonov-Bohm effect, conductance quantization and universal conductance fluctuations are direct manifestations of the electron wave nature. However, an extension of this concept to more practical temperatures has not been achieved so far. As molecules are nano-scale objects with typical energy level spacings (~eV) much larger than the thermal energy at 300 K (~25 meV), they are natural candidates to enable such a breakthrough. Fascinating phenomena, including giant magnetoresistance, Kondo effects and conductance switching, have previously been demonstrated at the molecular level. Here, we report direct evidence for destructive quantum interference in charge transport through two-terminal molecular junctions at room temperature. Furthermore, we show that the degree of interference can be controlled by simple chemical modifications of the molecule. Not only does this provide the experimental demonstration of a new phenomenon in quantum charge transport, it also opens the road for a new type of molecular device based on chemical or electrostatic control of quantum interference.
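
The comparison between level spacing and thermal energy quoted above can be checked with a short calculation. The 1 eV level spacing is an assumed representative value for the "~eV" scale mentioned in the abstract, not a number from the paper.

```python
# Boltzmann constant in eV/K (CODATA value).
K_B_EV_PER_K = 8.617333262e-5

temperature_k = 300.0
thermal_energy_mev = K_B_EV_PER_K * temperature_k * 1000.0  # about 25.85 meV

# Assumed representative molecular level spacing of 1 eV.
level_spacing_ev = 1.0
ratio = level_spacing_ev / (K_B_EV_PER_K * temperature_k)
```

The ratio comes out well above 30, which is why thermal smearing at room temperature is small compared with molecular level spacings.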

    Improved ability of biological and previous caries multimarkers to predict caries disease as revealed by multivariate PLS modelling

    Background: Dental caries is a chronic disease with plaque bacteria, diet and saliva modifying disease activity. Here we have used the PLS method to evaluate a multiplicity of such biological variables (n = 88) for their ability to predict caries in a cross-sectional (baseline caries) and prospective (2-year caries development) setting.
    Methods: Multivariate PLS modelling was used to associate the many biological variables with caries recorded in thirty 14-year-old children by measuring the numbers of incipient and manifest caries lesions at all surfaces.
    Results: The variables formed a wide but shallow gliding scale: one fifth were caries promoting or protecting and four fifths were non-influential. The influential markers behaved in the order of plaque bacteria > diet > saliva, with previously known plaque bacteria/diet markers and a set of new protective diet markers. A differential variable patterning appeared for new versus progressing lesions. The influential biological multimarkers (n = 18) predicted baseline caries better (ROC area 0.96) than five markers (0.92) and a single lactobacilli marker (0.7), with sensitivity/specificity of 1.87, 1.78 and 1.13, respectively, at 1/3 of the subjects diagnosed sick. Moreover, biological multimarkers (n = 18) explained the 2-year caries increment slightly better than reported before but predicted it poorly (ROC area 0.76). By contrast, multimarkers based on previous caries predicted the increment well, either alone (ROC area 0.88) or together with biological multimarkers (0.94), with a sensitivity/specificity of 1.74 at 1/3 of the subjects diagnosed sick.
    Conclusion: Multimarkers behave better than single-to-five markers, but future multimarker strategies will require systematic searches for improved saliva and plaque bacteria markers.
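
The ROC areas reported above can be read as the probability that a randomly chosen diseased subject receives a higher predicted score than a randomly chosen healthy one. A minimal sketch of that computation follows; the scores and labels are made-up values, and the function name `roc_auc` is hypothetical.

```python
def roc_auc(scores, labels):
    """ROC area as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative subject")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Made-up predicted scores for three diseased (1) and three healthy (0) subjects.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)  # 8 of the 9 pairs are ranked correctly
```

An area of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for comparing values such as 0.96 and 0.7.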

    Predicting site index from climate and soil variables for cork oak (Quercus suber L.) stands in Portugal

    Site productivity, assessed through site index, was modelled using partial least squares regression as a function of soil and climatic variables. Two alternative models were developed: a full model, considering all available explanatory variables, and a reduced model, considering only variables that can be obtained without digging a soil pit. The reduced model was used for mapping the site index distribution in Portugal, on the basis of existing digital cartography available for the whole country. The developed models indicate the importance of water availability and soil water holding capacity for the distribution of site index values. Site index was related to climate, namely evaporation and frost, and to soil characteristics such as lithology, soil texture, soil depth, thickness of the A horizon and soil classification. The variability of the estimated values within the map (9.5–16.8 m with an average value of 13.4 m) reflects the impact of soil characteristics on the site productivity estimation. These variables should be taken into consideration during the establishment of new plantations of cork oak and the management of existing plantations. Results confirm the potential distribution of cork oak in coastal regions. They also suggest the existence of a considerable area, located both north and south of the Tagus river, where site index values of medium (]13;15]) to high (]15;17]) productivity classes may be expected. The species is therefore expected to achieve good productivity along the northern coastal areas of Portugal, where it is not presently a common species but where, according to historical records, it occurred until the middle of the sixteenth century. The present research focused on tree growth. Cork growth and cork quality distribution need to be further researched through the establishment of long-term experimental sites along the distribution area of cork oak, namely in the central and northern coastal areas of the country.
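
Partial least squares regression is useful when explanatory variables are strongly correlated, as soil and climate variables often are. The sketch below fits a one-component PLS1 model on synthetic data with two perfectly collinear predictors; the data, the single-component choice and the function name `pls1_one_component` are assumptions for illustration and are not the models developed in the study.

```python
import numpy as np

def pls1_one_component(X, y):
    """One-component PLS1: project X on the direction of maximal covariance with y."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Weight vector: direction in predictor space most covariant with the response.
    w = Xc.T @ yc
    w /= np.linalg.norm(w)
    t = Xc @ w                      # latent scores
    q = (t @ yc) / (t @ t)          # regression of the response on the scores
    coef = w * q                    # coefficients in the original predictor space
    intercept = y_mean - x_mean @ coef
    return coef, intercept

# Synthetic data: the second predictor is exactly twice the first, so ordinary
# least squares has no unique solution, but the PLS direction is well defined.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
X = np.column_stack([x, 2.0 * x])
y = 3.0 * x + 1.0

coef, intercept = pls1_one_component(X, y)
y_hat = X @ coef + intercept
```

Here the fitted coefficients share the effect across the two collinear predictors and the predictions match the response exactly. Real applications typically use several components and choose their number by cross-validation.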