25 research outputs found

    Joint modelling of multiple network views

    Latent space models (LSM) for network data were introduced by Hoff et al. (2002) under the basic assumption that each node of the network has an unknown position in a D-dimensional Euclidean latent space: generally the smaller the distance between two nodes in the latent space, the greater their probability of being connected. In this paper we propose a variational inference approach to estimate the intractable posterior of the LSM. In many cases, different network views on the same set of nodes are available. It can therefore be useful to build a model able to jointly summarise the information given by all the network views. For this purpose, we introduce the latent space joint model (LSJM) that merges the information given by multiple network views assuming that the probability of a node being connected with other nodes in each network view is explained by a unique latent variable. This model is demonstrated on the analysis of two datasets: an excerpt of 50 girls from 'Teenage Friends and Lifestyle Study' data at three time points and the Saccharomyces cerevisiae genetic and physical protein-protein interactions
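
    The distance-to-probability relationship described in this abstract can be sketched as follows. This is only an illustration: the logistic link, the intercept `alpha`, the use of squared Euclidean distance, and the function name `link_probability` are assumptions made for the example, not details confirmed by the abstract.

```python
import math

def link_probability(z_i, z_j, alpha=1.0):
    """Edge probability under an assumed logistic latent distance model.

    z_i, z_j : latent positions of two nodes (same dimension D).
    alpha    : hypothetical intercept controlling overall network density.
    """
    # Squared Euclidean distance between the two latent positions.
    sq_dist = sum((a - b) ** 2 for a, b in zip(z_i, z_j))
    # Logistic link: the probability shrinks as the distance grows.
    return 1.0 / (1.0 + math.exp(-(alpha - sq_dist)))

# Toy latent positions in a 2-dimensional space (made-up values).
positions = {"a": (0.0, 0.0), "b": (0.5, 0.0), "c": (3.0, 0.0)}
p_near = link_probability(positions["a"], positions["b"])
p_far = link_probability(positions["a"], positions["c"])
```

    Here `p_near` exceeds `p_far` because nodes "a" and "b" sit closer together in the latent space than "a" and "c".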

    GWmodel: an R package for exploring spatial heterogeneity using geographically weighted models

    Spatial statistics is a growing discipline providing important analytical techniques in a wide range of disciplines in the natural and social sciences. In the R package GWmodel we present techniques from a particular branch of spatial statistics, termed geographically weighted (GW) models. GW models suit situations when data are not described well by some global model, but where there are spatial regions where a suitably localized calibration provides a better description. The approach uses a moving window weighting technique, where localized models are found at target locations. Outputs are mapped to provide a useful exploratory tool into the nature of the data spatial heterogeneity. Currently, GWmodel includes functions for: GW summary statistics, GW principal components analysis, GW regression, and GW discriminant analysis; some of which are provided in basic and robust forms
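
    The moving-window weighting idea can be illustrated with a geographically weighted mean. The sketch below is written in plain Python and does not use the GWmodel package or its API; the Gaussian kernel, the fixed bandwidth, the function name `gw_mean`, and the toy data are all assumptions chosen for the example.

```python
import math

def gw_mean(target, locations, values, bandwidth):
    """Geographically weighted mean at a target location.

    Observations are weighted by an assumed Gaussian kernel of their
    distance to the target, so nearby observations dominate the estimate.
    """
    weights = [
        math.exp(-0.5 * (math.dist(target, loc) / bandwidth) ** 2)
        for loc in locations
    ]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Made-up data: two spatial clusters with different typical values.
locations = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
values = [1.0, 1.2, 5.0, 5.4]

west = gw_mean((0.5, 0.0), locations, values, bandwidth=2.0)
east = gw_mean((10.5, 0.0), locations, values, bandwidth=2.0)
global_mean = sum(values) / len(values)
```

    The two local estimates differ from each other and from `global_mean`, which is the kind of spatial heterogeneity a single global summary would hide.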

    GWmodel

    In GWmodel, we introduce techniques from a particular branch of spatial statistics, termed geographically-weighted (GW) models. GW models suit situations when data are not described well by some global model, but where there are spatial regions where a suitably localised calibration provides a better description. GWmodel includes functions to calibrate: GW summary statistics, GW principal components analysis, GW discriminant analysis and various forms of GW regression; some of which are provided in basic and robust (outlier resistant) forms

    Mixture of latent trait analyzers for model-based clustering of categorical data

    Model-based clustering methods for continuous data are well established and commonly used in a wide range of applications. However, model-based clustering methods for categorical data are less standard. Latent class analysis is a commonly used method for model-based clustering of binary data and/or categorical data, but due to an assumed local independence structure there may not be a correspondence between the estimated latent classes and groups in the population of interest. The mixture of latent trait analyzers model extends latent class analysis by assuming a model for the categorical response variables that depends on both a categorical latent class and a continuous latent trait variable; the discrete latent class accommodates group structure and the continuous latent trait accommodates dependence within these groups. Fitting the mixture of latent trait analyzers model is potentially difficult because the likelihood function involves an integral that cannot be evaluated analytically. We develop a variational approach for fitting the mixture of latent trait models and this provides an efficient model fitting strategy. The mixture of latent trait analyzers model is demonstrated on the analysis of data from the National Long Term Care Survey (NLTCS) and voting in the U.S. Congress. The model is shown to yield intuitive clustering results and it gives a much better fit than either latent class analysis or latent trait analysis alone

    Geographically weighted elastic net logistic regression

    This paper develops a localized approach to elastic net logistic regression, extending previous research describing a localized elastic net as an extension to a localized ridge regression or a localized lasso. All such models have the objective to capture data relationships that vary across space. Geographically weighted elastic net logistic regression is first evaluated through a simulation experiment and shown to provide a robust approach for local model selection and alleviating local collinearity, before application to two case studies: county-level voting patterns in the 2016 USA presidential election, examining the spatial structure of socio-economic factors associated with voting for Trump, and a species presence–absence data set linked to explanatory environmental and climatic factors at gridded locations covering mainland USA. The approach is compared with other logistic regressions. It improves prediction for the election case study only which exhibits much greater spatial heterogeneity in the binary response than the species case study. Model comparisons show that standard geographically weighted logistic regression over-estimated relationship non-stationarity because it fails to adequately deal with collinearity and model selection. Results are discussed in the context of predictor variable collinearity and selection and the heterogeneities that were observed. Ongoing work is investigating locally derived elastic net parameters

    RNA-Seq analysis of bovine oocyte transcriptome reveals that differences between heifers and repeat breeders are limited to a few key transcripts

    Maternal transcripts are accumulated during oocyte growth and drive early embryonic development; therefore, their characterisation is a relevant factor for predicting fertility. DNA microarrays have been the method of choice for transcriptional profiling, but this method has some limitations when applied to domestic species because it relies upon existing knowledge about genome sequence and offers a limited quantitative evaluation. These limits are overcome by next-generation sequencing technology. The aim of the work was to define a reference standard for bovine fertility determining the list and the level of transcripts stored in fully grown oocytes collected from heifers (H) and to compare this pattern with that of adult repeat breeders (RB). Oocytes were collected by ovum pick-up (OPU) from 5 Italian Dappled Red heifers of 11 to 15 months of age that became pregnant at the following oestrus and from 4 adult cows of the same breed with an age of 4 to 7 years, classified as repeat breeders after they failed to become pregnant for a minimum of 3 consecutive AI. In both groups, oocytes were aspirated from follicles of 4 to 6 mm in diameter. Each oocyte was carefully denuded and immediately snap frozen in liquid nitrogen. Oocytes from each animal were pooled together (range 4 to 11) and analysed as a single sample. Total RNA extraction was performed by RNeasy Micro Kit (Qiagen, Valencia, CA, USA). Amplified cDNA, from both mRNA and non-polyadenylated transcripts, was prepared starting from total RNA using the Ovation RNA-Seq System V2 (NuGEN Inc., San Carlos, CA, USA). Purified cDNA was ligated directly into an Illumina sequencing library using TruSeq DNA Sample Prep kit (Illumina Inc., San Diego, CA, USA). Sequencing was performed on Illumina HiSeq 2000 in the 50-bp long single-read set-up, at a 4-plex multiplexing level, producing 30 to 40 million reads per sample. Data were annotated using the cDNA ENSEMBL UMD 3.1.67 database.
On average, the number of transcripts present in each sample was 15438 ± 766 in H and 15624 ± 768 in RB oocytes. Nineteen thousand one hundred sixty-one transcripts were detected at least in one sample, and 12174 were detected in all samples. The comparison between H and RB showed that 598 transcripts out of 19161 (3.12%) and 437 out of 12174 (3.59%) are expressed at a significantly different level (P<0.05) in the 2 groups. Taking into consideration only the transcripts detected in all the samples, with an expression rate of at least 10-fold different and a P<0.05 we identified 39 genes. Seventeen transcripts were more abundant in RB oocytes, whereas 22 were downregulated. This is the first analysis of the oocyte transcriptome performed with deep sequencing technology. The method enabled us to compile a full list of transcripts that are found in highly competent oocytes. A direct comparison with low-quality oocytes indicated that quantitative differences of transcripts level are limited to a small subpopulation of key transcripts