847 research outputs found

    Complete genome sequence of an astrovirus identified in a domestic rabbit (Oryctolagus cuniculus) with gastroenteritis

    A colony of domestic rabbits in Tennessee, USA, experienced a high-mortality (~90%) outbreak of enterocolitis. The clinical characteristics were one to six days of lethargy, bloating, and diarrhea, followed by death. Heavy intestinal coccidial load was a consistent finding, as was mucoid enteropathy with cecal impaction. Preliminary analysis by electron microscopy revealed the presence of virus-like particles in the stool of one of the affected rabbits. Analysis using the Virochip, a viral detection microarray, suggested the presence of an astrovirus, and follow-up PCR and sequence determination revealed a previously uncharacterized member of that family. Metagenomic sequencing enabled the recovery of the complete viral genome, which contains the characteristic attributes of astrovirus genomes. Attempts to propagate the virus in tissue culture have yet to succeed. Although astroviruses cause gastroenteric disease in other mammals, the pathogenicity of this virus and its relationship to this outbreak remain to be determined. This study therefore defines a viral species and a potential rabbit pathogen.

    eBay users form stable groups of common interest

    Market segmentation of an online auction site is studied by analyzing the users' bidding behavior. The distribution of user activity is investigated and a network of bidders connected by common interest in individual articles is constructed. The network's cluster structure corresponds to the main user groups according to common interest, exhibiting hierarchy and overlap. A key feature of the analysis is its independence of any similarity measure between the articles offered on eBay, as such a measure would only introduce bias in the analysis. Results are compared to null models based on random networks, and clusters are validated and interpreted using the taxonomic classifications of eBay categories. We find clear-cut and coherent interest profiles for the bidders in each cluster. The interest profiles of bidder groups are compared to the classification of articles actually bought by these users during the time span 6-9 months after the initial grouping. The interest profiles discovered remain stable, indicating typical interest profiles in society. Our results show how network theory can be applied successfully to problems of market segmentation and sociological milieu studies with sparse, high-dimensional data.
    Comment: Major revision of the manuscript. Methodological improvements and inclusion of analysis of temporal development of user interests. 19 pages, 12 figures, 5 tables.

    Euclidean Distances, soft and spectral Clustering on Weighted Graphs

    We define a class of Euclidean distances on weighted graphs that enables thermodynamic soft graph clustering. The class can be constructed from the "raw coordinates" encountered in spectral clustering, and can be extended by means of higher-dimensional embeddings (Schoenberg transformations). Geographical flow data, properly conditioned, illustrate the procedure as well as visualization aspects.
    Comment: accepted for presentation (and further publication) at the ECML PKDD 2010 conference.

    Applied immuno-epidemiological research: an approach for integrating existing knowledge into the statistical analysis of multiple immune markers.

    BACKGROUND: Immunologists often measure several correlated immunological markers, such as concentrations of different cytokines produced by different immune cells and/or measured under different conditions, to draw insights from complex immunological mechanisms. Although there have been recent methodological efforts to improve the statistical analysis of immunological data, a framework is still needed for the simultaneous analysis of multiple, often correlated, immune markers. This framework would allow the immunologists' hypotheses about the underlying biological mechanisms to be integrated. RESULTS: We present an analytical approach for statistical analysis of correlated immune markers, such as those commonly collected in modern immuno-epidemiological studies. We demonstrate i) how to deal with interdependencies among multiple measurements of the same immune marker, ii) how to analyse association patterns among different markers, iii) how to aggregate different measures and/or markers to immunological summary scores, iv) how to model the inter-relationships among these scores, and v) how to use these scores in epidemiological association analyses. We illustrate the application of our approach to multiple cytokine measurements from 818 children enrolled in a large immuno-epidemiological study (SCAALA Salvador), which aimed to quantify the major immunological mechanisms underlying atopic diseases or asthma. We demonstrate how to aggregate systematically the information captured in multiple cytokine measurements to immunological summary scores aimed at reflecting the presumed underlying immunological mechanisms (Th1/Th2 balance and immune regulatory network). We show how these aggregated immune scores can be used as predictors in regression models with outcomes of immunological studies (e.g. specific IgE) and compare the results to those obtained by a traditional multivariate regression approach. 
    CONCLUSION: The proposed analytical approach may be especially useful to quantify complex immune responses in immuno-epidemiological studies, where investigators examine the relationship among epidemiological patterns, immune response, and disease outcomes.

    R-Gada: a fast and flexible pipeline for copy number analysis in association studies

    BACKGROUND: Genome-wide association studies (GWAS) using Copy Number Variation (CNV) are becoming a central focus of genetic research. CNVs have successfully provided target genome regions for some disease conditions where simple genetic variation (i.e., SNPs) has previously failed to provide a clear association. RESULTS: Here we present a new R package that integrates: (i) data import from most common formats of Affymetrix, Illumina and aCGH arrays; (ii) a fast and accurate segmentation algorithm to call CNVs based on Genome Alteration Detection Analysis (GADA); and (iii) functions for displaying and exporting the Copy Number calls, identification of recurrent CNVs, multivariate analysis of population structure, and tools for performing association studies. Using a large dataset containing 270 HapMap individuals (Affymetrix Human SNP Array 6.0 Sample Dataset), we demonstrate a flexible pipeline implemented with the package. It requires less than one minute per sample (3 million probe arrays) on a single-core computer, and provides flexible parallelization for very large datasets. Case-control data were generated from the HapMap dataset to demonstrate a GWAS analysis. CONCLUSIONS: The package provides the tools for creating a complete integrated pipeline from data normalization to statistical association. It can efficiently handle a massive volume of data consisting of millions of genetic markers and hundreds or thousands of samples with very accurate results.

    Fuzzy min-max neural networks for categorical data: application to missing data imputation

    The fuzzy min–max neural network classifier is a supervised learning method that combines neural networks with fuzzy systems. All input variables in the network are required to correspond to continuously valued variables, and this can be a significant constraint in many real-world situations where there are not only quantitative but also categorical data. The usual way of dealing with this type of variable is to replace the categorical values with numerical ones and treat them as if they were continuously valued. But this method implicitly defines a possibly unsuitable metric for the categories. A number of different procedures have been proposed to tackle the problem. In this article, we present a new method. The procedure extends the fuzzy min–max neural network input to categorical variables by introducing new fuzzy sets, a new operation, and a new architecture. This provides for greater flexibility and wider application. The proposed method is then applied to missing data imputation in voting intention polls. The micro data of this type of poll (the set of the respondents' individual answers to the questions) are especially suited for evaluating the method since they include a large number of numerical and categorical attributes.