244 research outputs found

    Data compression and regression based on local principal curves.

    Frequently the predictor space of a multivariate regression problem of the type y = m(x_1, ..., x_p) + ε is intrinsically one-dimensional, or at least of far lower dimension than p. Usual modeling attempts such as the additive model y = m_1(x_1) + ... + m_p(x_p) + ε, which try to reduce the complexity of the regression problem by making additional structural assumptions, are then inefficient as they ignore the inherent structure of the predictor space and involve complicated model and variable selection stages. In a fundamentally different approach, one may consider first approximating the predictor space by a (usually nonlinear) curve passing through it, and then regressing the response only against the one-dimensional projections onto this curve. This entails the reduction from a p- to a one-dimensional regression problem. As a tool for the compression of the predictor space we apply local principal curves. Taking things on from the results presented in Einbeck et al. (Classification – The Ubiquitous Challenge. Springer, Heidelberg, 2005, pp. 256–263), we show how local principal curves can be parametrized and how the projections are obtained. The regression step can then be carried out using any nonparametric smoother. We illustrate the technique using data from the physical sciences.

    Characterization of clastic sedimentary environments by clustering algorithm and several statistical approaches — case study, Sava Depression in Northern Croatia

    This study demonstrates a method to identify and characterize some facies of turbiditic depositional environments. The study area is a hydrocarbon field in the Sava Depression (Northern Croatia). Its Upper Miocene reservoirs have been proved to represent a lacustrine turbidite system. In the workflow, first an unsupervised neural network was applied as clustering method for two sandstone reservoirs. The elements of the input vectors were the basic petrophysical parameters. In the second step autocorrelation surfaces were used to reveal the hidden anisotropy of the grid. This anisotropy is supposed to identify the main continuity directions in the geometrical analyses of sandstone bodies. Finally, in the description of clusters several parametric and nonparametric statistics were used to characterize the identified facies. Obtained results correspond to the previously published interpretation of those reservoir facies.

    Glycan shifting on hepatitis C virus (HCV) E2 glycoprotein is a mechanism for escape from broadly neutralizing antibodies

    Hepatitis C virus (HCV) infection is a major cause of liver disease and hepatocellular carcinoma. Glycan shielding has been proposed to be a mechanism by which HCV masks broadly neutralizing epitopes on its viral glycoproteins. However, the role of altered glycosylation in HCV resistance to broadly neutralizing antibodies is not fully understood. Here, we have generated potent HCV neutralizing antibodies hu5B3.v3 and MRCT10.v362 that, similar to the previously described AP33 and HCV1, bind to a highly conserved linear epitope on E2. We utilize a combination of in vitro resistance selections using the cell culture infectious HCV and structural analyses to identify mechanisms of HCV resistance to hu5B3.v3 and MRCT10.v362. Ultra deep sequencing from in vitro HCV resistance selection studies identified resistance mutations at asparagine N417 (N417S, N417T and N417G) as early as 5 days post treatment. Comparison of the glycosylation status of soluble versions of the E2 glycoprotein containing the respective resistance mutations revealed a glycosylation shift from N417 to N415 in the N417S and N417T E2 proteins. The N417G E2 variant was glycosylated neither at residue 415 nor at residue 417 and remained sensitive to MRCT10.v362. Structural analyses of the E2 epitope bound to hu5B3.v3 Fab and MRCT10.v362 Fab using X-ray crystallography confirmed that residue N415 is buried within the antibody–peptide interface. Thus, in addition to previously described mutations at N415 that abrogate the β-hairpin structure of this E2 linear epitope, we identify a second escape mechanism, termed glycan shifting, that decreases the efficacy of broadly neutralizing HCV antibodies.

    Altered spring phenology of North American freshwater turtles and the importance of representative populations

    Globally, populations of diverse taxa have altered phenology in response to climate change. However, most research has focused on a single population of a given taxon, which may be unrepresentative for comparative analyses, and few long-term studies of phenology in ectothermic amniotes have been published. We test for climate-altered phenology using long-term studies (10–36 years) of nesting behavior in 14 populations representing six genera of freshwater turtles (Chelydra, Chrysemys, Kinosternon, Malaclemys, Sternotherus, and Trachemys). Nesting season initiation occurs earlier in more recent years, with 11 of the populations advancing phenology. The onset of nesting for nearly all populations correlated well with temperatures during the month preceding nesting. Still, certain populations of some species have not advanced phenology as might be expected from global patterns of climate change. This collection of findings suggests a proximate link between local climate and reproduction that is potentially caused by variation in spring emergence from hibernation, ability to process food, and thermoregulatory opportunities prior to nesting. However, even though all species had populations with at least some evidence of phenological advancement, geographic variation in phenology within and among turtle species underscores the critical importance of representative data for accurate comprehensive assessments of the biotic impacts of climate change.

    A feature selection method for classification within functional genomics experiments based on the proportional overlapping score

    Background: Microarray technology, as well as other functional genomics experiments, allows simultaneous measurements of thousands of genes within each sample. Both the prediction accuracy and interpretability of a classifier could be enhanced by performing the classification based only on selected discriminative genes. We propose a statistical method for selecting genes based on overlapping analysis of expression data across classes. This method results in a novel measure, called proportional overlapping score (POS), of a feature's relevance to a classification task. Results: We apply POS, along with four widely used gene selection methods, to several benchmark gene expression datasets. The experimental results of classification error rates computed using the Random Forest, k Nearest Neighbor and Support Vector Machine classifiers show that POS achieves a better performance. Conclusions: A novel gene selection method, POS, is proposed. POS analyzes the expression overlap across classes, taking into account the proportions of overlapping samples. It robustly defines a mask for each gene that allows it to minimize the effect of expression outliers. The constructed masks, along with a novel gene score, are exploited to produce the selected subset of genes.
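
    The general idea of scoring a gene by how much its expression values overlap across classes can be sketched as follows. This is a deliberately simplified overlap proportion for two classes, not the published POS definition: it uses raw class ranges rather than the outlier-robust masks the abstract describes, and the function names and toy data are hypothetical.

```python
import numpy as np

def overlap_proportion(x, labels):
    """Fraction of samples inside the interval shared by both class ranges.
    Simplified stand-in: raw min/max ranges, no outlier-robust mask."""
    a, b = x[labels == 0], x[labels == 1]
    lo, hi = max(a.min(), b.min()), min(a.max(), b.max())
    if lo > hi:                       # class ranges do not overlap at all
        return 0.0
    return float(np.mean((x >= lo) & (x <= hi)))

def select_features(X, labels, k):
    """Rank features by ascending overlap and keep the k least overlapping."""
    scores = np.array(
        [overlap_proportion(X[:, j], labels) for j in range(X.shape[1])]
    )
    return np.argsort(scores, kind="stable")[:k], scores

# Toy data: 6 samples, 2 features; feature 0 separates the classes cleanly.
X = np.array([[1.0, 5.0], [2.0, 6.0], [3.0, 1.0],
              [8.0, 5.5], [9.0, 2.0], [10.0, 7.0]])
labels = np.array([0, 0, 0, 1, 1, 1])

selected, scores = select_features(X, labels, k=1)
print(selected, scores)
```

    A lower score means fewer samples fall where the classes overlap, so the feature is more discriminative under this simplified criterion.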