
    Multi-view predictive partitioning in high dimensions

    Many modern data mining applications analyse datasets in which each observation is described by paired high-dimensional vectorial representations, or "views". Typical examples arise in web mining and genomics. In this article we present an algorithm for clustering data with multiple views, Multi-View Predictive Partitioning (MVPP), which relies on a novel criterion of predictive similarity between data points. We assume that, within each cluster, the dependence between the multivariate views can be modelled by a two-block partial least squares (TB-PLS) regression model, which performs dimensionality reduction and is particularly suitable for high-dimensional settings. The proposed MVPP algorithm partitions the data such that the within-cluster predictive ability between views is maximised. The objective function depends on a measure of the predictive influence of points under the TB-PLS model, derived as an extension of the PRESS statistic commonly used in ordinary least squares regression. Using simulated data, we compare the performance of MVPP to that of competing multi-view clustering methods which rely upon geometric structures of points but ignore the predictive relationship between the two views. State-of-the-art results are obtained on benchmark web mining datasets.
    Comment: 31 pages, 12 figures
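
    The PRESS statistic mentioned in the abstract can be illustrated in its ordinary least squares form. The sketch below is not MVPP and not the TB-PLS extension; it assumes simple linear regression with one predictor and an intercept, and all function names are hypothetical. It shows the standard shortcut in which each leave-one-out residual equals the ordinary residual divided by one minus the point's leverage, and compares that with explicitly refitting the model without each point.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def press_shortcut(x, y):
    """PRESS from a single fit, using leverages (simple regression only)."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    a, b = fit_line(x, y)
    total = 0.0
    for xi, yi in zip(x, y):
        # Leverage of point i for a model with an intercept and one predictor.
        leverage = 1.0 / n + (xi - x_bar) ** 2 / sxx
        residual = yi - (a + b * xi)
        # Leave-one-out residual, obtained without refitting.
        total += (residual / (1.0 - leverage)) ** 2
    return total


def press_refit(x, y):
    """PRESS by brute force: refit without each point, then predict it."""
    total = 0.0
    for i in range(len(x)):
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        total += (y[i] - (a + b * x[i])) ** 2
    return total


if __name__ == "__main__":
    # Made-up illustrative data, not taken from the paper.
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    ys = [0.1, 0.9, 2.3, 2.8, 4.2]
    print(press_shortcut(xs, ys), press_refit(xs, ys))
```

    A point with a large leave-one-out residual is one that the remaining points predict poorly, which is the intuition behind a measure of predictive influence.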

    Technology as tool to overcome barriers of using fitness facilities: A health behavioural perspective

    Underlying health conditions are highlighted throughout the literature as preventing several populations from engaging in physical activity. Few, if any, attempts have been made to address these populations directly in fitness facilities or indirectly using information technology (IT). This research explored current barriers and practices regarding IT and technological support in a fitness facility environment, using health behaviour theories (HBT) to explain member experiences. The sample comprised 66 participants selected from 5 fitness facilities in Manchester, UK, of whom 60.6% were male and 39.4% female, aged 18 to 59. The instrument used was a survey. Health motives were reported by 71.2% of the participants, while ‘injury’ (reported by 70.2%), ‘lack of knowledge about exercise and health’ (42.4%), and ‘illness’ (28.1%) were the main barriers to using the facilities. The main support mechanism provided by facility management was staff support (59%), with online and technological support accounting for only 38.6% of facility support. Just over half of the participants (50.2%) used personal IT within the facilities. The study revealed a need for additional IT support from fitness facilities in the form of applications and digital platforms. The findings are discussed with HBT as the theoretical underpinning, and suggestions are made for future research on IT advancements as support mechanisms.

    Sparse multi-view matrix factorisation: a multivariate approach to multiple tissue comparisons

    Gene expression levels in a population vary extensively across tissues. Such heterogeneity is caused by genetic variability and environmental factors, and is expected to be linked to disease development. The abundance of experimental data now enables the identification of features of gene expression profiles that are shared across tissues, and of those that are tissue-specific. While most current research characterises differential expression by comparing mean expression profiles across tissues, it is also believed that a significant difference in a gene's expression variance across tissues may be associated with molecular mechanisms that are important for tissue development and function. We propose a sparse multi-view matrix factorisation (sMVMF) algorithm to jointly analyse gene expression measurements in multiple tissues, where each tissue provides a different "view" of the underlying organism. The proposed methodology can be interpreted as an extension of principal component analysis in that it decomposes the total sample variance in each tissue into the sum of two components: one capturing the variance that is shared across tissues, and one isolating the tissue-specific variances. sMVMF has been used to jointly model mRNA expression profiles in three tissues (adipose, skin and LCL) which are available for a large and well-phenotyped twins cohort, TwinsUK. Using sMVMF, we are able to prioritise genes based on whether their variation patterns are specific to each tissue. Furthermore, using the available DNA methylation profiles, we provide supporting evidence that adipose-specific gene expression patterns may be driven by epigenetic effects.
    Comment: in Bioinformatics 201