38 research outputs found

    Simcluster: clustering enumeration gene expression data on the simplex space

    Transcript enumeration methods such as SAGE, MPSS, and sequencing-by-synthesis EST "digital northern" are important high-throughput techniques for digital gene expression measurement. Like other counting or voting processes, these measurements constitute compositional data, which exhibit properties particular to the simplex space, where the sum of the components is constrained. These properties are not present in regular Euclidean spaces, on which hybridization-based microarray data are often modeled. Therefore, pattern recognition methods commonly used for microarray data analysis may be non-informative for data generated by transcript enumeration techniques, since they ignore certain fundamental properties of this space.

Here we present a software tool, Simcluster, designed to perform clustering analysis for data on the simplex space. We present Simcluster as a stand-alone command-line C package and as a user-friendly on-line tool. Both versions are available at: http://xerad.systemsbiology.net/simcluster.

Simcluster is designed in accordance with a well-established mathematical framework for compositional data analysis, which provides principled procedures for dealing with the simplex space, and is thus applicable in a number of contexts, including enumeration-based gene expression data.
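
As a rough illustration of why simplex-aware distances differ from Euclidean ones, the sketch below computes the Aitchison distance through the centered log-ratio (clr) transform, a standard tool in compositional data analysis. It is not taken from Simcluster itself: the function names are hypothetical, and every component is assumed to be strictly positive (zero counts would first need a pseudo-count or similar treatment).

```python
import math


def clr(composition):
    """Centered log-ratio transform of a strictly positive composition."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    # Subtracting the mean log removes any common scaling factor.
    return [value - mean_log for value in logs]


def aitchison_distance(x, y):
    """Euclidean distance between the clr-transformed compositions."""
    cx, cy = clr(x), clr(y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(cx, cy)))


if __name__ == "__main__":
    # Hypothetical tag counts for three genes in two libraries.
    print(aitchison_distance([10, 20, 70], [30, 30, 40]))
```

Rescaling a composition, for example from `[1, 2, 3]` to `[2, 4, 6]`, leaves the distance at zero (up to rounding), which reflects the fact that only relative abundances carry information on the simplex.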

    Spatial network morphology and social integration of the elderly: The socio-spatial ‘embeddedness’ of community-based elderly care facilities

    Moving from the outskirts of cities into urban neighbourhoods, so-called community-based elderly care facilities are regarded as a shift from a traditional medical model of care to a social model of care, with the aim of fostering social interactions between facility inhabitants and local residents. This strategy of achieving social integration through spatial integration involves spaces at multiple scales, including not only the interior environment of facilities but also the exterior urban fabric surrounding them. However, most existing research focuses on the building interior of facilities. Local authorities tacitly assume that allocating facilities within an urban community amounts to spatial integration, and hardly address the spatial complexity of urban communities from a morphological perspective, which results in contradictory findings with respect to the social outcomes of implementing such policies. Urban morphology can be a structural factor affording or eliminating opportunities for social interaction among inhabitants, which is particularly applicable to the ageing population, for whom social connections are largely realised via physical environments. Taking over 140 care facilities in the Chinese city of Nanjing as cases, this study develops a spatial network model to quantitatively identify the morphological patterns of the urban communities in which facilities are located, thus considering the urban environment as an opportunity structure. It also disentangles the extent to which facilities are connected to or isolated from the surrounding urban fabric at various scales. Results show that being located within communities does not necessarily imply spatial embeddedness. Spatial network morphology may constrain the social connection opportunities of facility inhabitants at global or local scales. Findings indicate that urban communities should not be regarded as spatially homogeneous entities when allocating care facilities. Differentiated morphological factors should be considered to optimise opportunities for social connection via spatial embeddedness.

    Augmented Session Similarity Based Framework for Measuring Web User Concern from Web Server Logs

    In this paper, an augmented session similarity based framework is proposed to measure web user concern from web server logs. The proposed framework considers the best usage similarity between two web sessions, based on the relevance of the accessed pages and the URL-based syntactic structure of the website within the session. The framework is implemented using K-medoids clustering algorithms with independent and combined similarity measures. Cluster quality is evaluated by measuring average intra-cluster and inter-cluster distances. The experimental results show that the combined augmented session dissimilarity metric outperforms the independent augmented session dissimilarity measures in terms of cluster validity measures.
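
The evaluation step can be sketched as follows. The abstract does not give the exact definitions used, so this minimal example assumes the simplest reading: the mean pairwise dissimilarity over same-cluster pairs versus different-cluster pairs. The function name and the input layout (a full dissimilarity matrix plus one label per session) are hypothetical.

```python
from itertools import combinations


def average_intra_inter(dist, labels):
    """Return (mean intra-cluster, mean inter-cluster) pairwise distance.

    dist   -- square matrix, dist[i][j] is the dissimilarity of items i and j
    labels -- cluster label of each item, in the same order as dist
    """
    intra, inter = [], []
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            intra.append(dist[i][j])
        else:
            inter.append(dist[i][j])

    def mean(values):
        # NaN signals that no pair of that kind exists.
        return sum(values) / len(values) if values else float("nan")

    return mean(intra), mean(inter)


if __name__ == "__main__":
    # Toy sessions placed on a line; dissimilarity is absolute difference.
    points = [0, 1, 10, 11]
    dist = [[abs(a - b) for b in points] for a in points]
    print(average_intra_inter(dist, [0, 0, 1, 1]))
```

Under this reading, a good partition shows a low intra-cluster average and a high inter-cluster average.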

    A low dimensional model for bike sharing demand forecasting

    Big, transport-related datasets are nowadays publicly available, which makes data-driven mobility analysis possible. Trips with their origins, destinations and travel times are collected in publicly available big databases, which allows for a deeper and richer understanding of mobility patterns. This paper proposes a low-dimensional approach that combines these data sources with weather data in order to forecast the daily demand for Bike Sharing Systems (BSS). The core of this approach lies in the proposed clustering technique, which reduces the dimension of the problem and, unlike other machine learning techniques, requires limited assumptions on the model or its parameters. The proposed clustering technique synthesizes mobility data quantitatively (number of trips) and spatially (mean trip origin and destination). This allows the identification of recurrent mobility patterns that, when combined with weather data, provide accurate predictions of the demand. The method is tested with real-world data from New York City. We synthesize more than four million trips into vectors of movement, which are then combined with weather data to forecast the daily demand at a city level. Results show that, even with a one-parameter model, the proposed approach provides accurate predictions. Comment: 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other work.
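
One way to picture the "vectors of movement" is to collapse a day's trips into a trip count plus the mean origin and mean destination. The sketch below does only that; it is an assumption about the summarisation step, not the paper's clustering technique, and the function name and the trip layout are hypothetical.

```python
def daily_movement_vector(trips):
    """Summarise one day's trips.

    trips -- list of ((origin_lat, origin_lon), (dest_lat, dest_lon)) pairs
    Returns the number of trips and the mean origin and destination points.
    """
    n = len(trips)
    if n == 0:
        raise ValueError("at least one trip is required")
    mean_origin = (
        sum(origin[0] for origin, _ in trips) / n,
        sum(origin[1] for origin, _ in trips) / n,
    )
    mean_destination = (
        sum(dest[0] for _, dest in trips) / n,
        sum(dest[1] for _, dest in trips) / n,
    )
    return {
        "n_trips": n,
        "mean_origin": mean_origin,
        "mean_destination": mean_destination,
    }


if __name__ == "__main__":
    # Two made-up trips in arbitrary planar coordinates.
    print(daily_movement_vector([((0, 0), (2, 2)), ((2, 4), (4, 6))]))
```

Each day then becomes a short numeric vector that could be joined with that day's weather observations before fitting a forecasting model.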

    Improving the family orientation process in Cuban Special Schools trough Nearest Prototype classification

    Cuban Schools for children with Affective – Behavioral Maladies (SABM) aim to accomplish a major change in children's behavior, so as to insert them effectively into society. One of the key elements of this objective is to give adequate orientation to the children's families, because the family is one of the most important educational contexts in which children develop their personality. The family orientation process in SABM involves clustering and classification of mixed-type data with non-symmetric similarity functions. To improve this process, this paper introduces some novel characteristics in clustering and prototype selection. The proposed approach uses hierarchical clustering based on compact sets, making it suitable for dealing with non-symmetric similarity functions, as well as with mixed and incomplete data. The proposal obtains very good results on the SABM data and on repository databases.

    Quality indices for (practical) clustering evaluation

    Web of Science accession number: WOS:000271584000004. Clustering quality or validation indices allow the evaluation of the quality of clustering in order to support the selection of a specific partition or clustering structure in its natural unsupervised environment, where the real solution is unknown or not available. In this paper, we investigate the use of quality indices, mostly based on the concepts of cluster compactness and separation, for the evaluation of clustering results (partitions in particular). This work intends to offer a general perspective regarding the appropriate use of quality indices for the purpose of clustering evaluation. After presenting some commonly used indices, as well as indices recently proposed in the literature, key issues regarding the practical use of quality indices are addressed. A general methodological approach is presented which considers the identification of appropriate index thresholds. This general approach is compared with the simple use of quality indices for evaluating a clustering solution.
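
The abstract does not name the indices it studies. As one well-known example of an index built on compactness and separation, the sketch below computes the Dunn index: the smallest distance between items in different clusters divided by the largest distance between items in the same cluster. The function name and input layout are hypothetical.

```python
from itertools import combinations


def dunn_index(dist, labels):
    """Dunn index: minimum separation divided by maximum cluster diameter.

    dist   -- square matrix of pairwise distances
    labels -- cluster label of each item, in the same order as dist
    """
    max_diameter = 0.0
    min_separation = float("inf")
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            max_diameter = max(max_diameter, dist[i][j])
        else:
            min_separation = min(min_separation, dist[i][j])
    if max_diameter == 0.0:
        # Every cluster is a single point (or duplicates): index is unbounded.
        return float("inf")
    return min_separation / max_diameter


if __name__ == "__main__":
    points = [0, 1, 10, 11]
    dist = [[abs(a - b) for b in points] for a in points]
    print(dunn_index(dist, [0, 0, 1, 1]))
```

Higher values indicate compact, well-separated clusters; comparing the value across candidate partitions is one simple way to use such an index for selection.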

    Urban fabric and social participation in community-based elderly care facilities

    Apart from providing essential care for elderly inhabitants, a well-recognised purpose of community-based elderly care facilities is to promote social integration by encouraging visitors from the neighbourhood to continuously participate in activities and use services offered by the facility. The location of care facilities and their local environment have been widely argued to constitute a critical factor in older people's continuous participation, which induces the formation and maintenance of personal networks between the different user groups, as well as a sense of attachment. However, the existing literature on care facility location and older people's participation predominantly uses qualitative methods, often applied to a single case. This causes ambiguity and controversy when comparing findings from different cases and also makes the generalisation of study findings problematic. This paper introduces a spatial network model, based on Space Syntax theory, to explicitly describe the spatial relations between care facilities and the urban fabric. With a large dataset of social participation records from 91 community-based elderly care facilities in the Chinese city of Nanjing, the study investigates how differentiated locational properties influence patterns of older people's social participation. Findings indicate that local-scale spatial properties could influence occurrence patterns of social participation in care facilities, and that the mechanism by which local-scale spatial properties exert influence varies across differentiated global-scale spatial contexts.