
    Simultaneous clustering of gene expression data with clinical chemistry and pathological evaluations reveals phenotypic prototypes

    BACKGROUND: Commonly employed clustering methods for analysis of gene expression data do not directly incorporate phenotypic data about the samples. Furthermore, clustering of samples with known phenotypes is typically performed in an informal fashion. The inability of clustering algorithms to incorporate biological data in the grouping process can limit proper interpretation of the data and its underlying biology. RESULTS: We present a more formal approach, the modk-prototypes algorithm, for clustering biological samples by simultaneously considering microarray gene expression data and classes of known phenotypic variables such as clinical chemistry evaluations and histopathologic observations. The strategy involves constructing an objective function that measures the dissimilarity of the samples, using the sum of the squared Euclidean distances for numeric microarray and clinical chemistry data and simple matching for categorical histopathology values. Separate weighting terms are used for microarray, clinical chemistry and histopathology measurements to control the influence of each data domain on the clustering of the samples. The dynamic validity index for numeric data was modified with a category utility measure for determining the number of clusters in the data sets. A cluster's prototype, formed from the mean of the values for numeric features and the mode of the categorical values of all the samples in the group, is representative of the phenotype of the cluster members. The approach is shown to work well with a simulated mixed data set and two real data examples containing numeric and categorical data types: one from a heart disease study and another from a study of acetaminophen (an analgesic) exposure in rat liver, which causes centrilobular necrosis.
CONCLUSION: The modk-prototypes algorithm partitioned the simulated data into clusters with samples in their respective class group, and the heart disease samples into two groups (sick and buff, denoting samples having pain type representative of angina and non-angina, respectively) with an accuracy of 79%. This is on par with, or better than, the assignment accuracy of the heart disease samples by several well-known and successful clustering algorithms. Following modk-prototypes clustering of the acetaminophen-exposed samples, informative genes from the cluster prototypes were identified that are descriptive of, and phenotypically anchored to, levels of necrosis of the centrilobular region of the rat liver. The biological processes cell growth and/or maintenance, amine metabolism, and stress response were shown to distinguish between no and moderate levels of acetaminophen-induced centrilobular necrosis. The use of well-known and traditional measurements directly in the clustering provides some guarantee that the resulting clusters will be meaningfully interpretable.
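
The dissimilarity and prototype described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the dictionary keys (`expr`, `chem`, `histo`), the default weights of 1.0, and the toy samples are all assumptions made for illustration. It only shows a weighted sum of squared Euclidean distances for the numeric domains plus simple matching for the categorical domain, and a prototype built from means and modes.

```python
from collections import Counter


def mixed_dissimilarity(sample, prototype, w_expr=1.0, w_chem=1.0, w_histo=1.0):
    """Weighted dissimilarity between a sample and a cluster prototype.

    Both arguments are dicts with hypothetical keys:
      'expr'  - numeric gene expression values
      'chem'  - numeric clinical chemistry values
      'histo' - categorical histopathology observations
    """
    # Squared Euclidean distance for the two numeric domains.
    d_expr = sum((a - b) ** 2 for a, b in zip(sample["expr"], prototype["expr"]))
    d_chem = sum((a - b) ** 2 for a, b in zip(sample["chem"], prototype["chem"]))
    # Simple matching: count the categorical features that differ.
    d_histo = sum(a != b for a, b in zip(sample["histo"], prototype["histo"]))
    return w_expr * d_expr + w_chem * d_chem + w_histo * d_histo


def cluster_prototype(samples):
    """Prototype of a cluster: mean of numeric features, mode of categorical ones."""
    n = len(samples)

    def column_means(key):
        return [sum(col) / n for col in zip(*(s[key] for s in samples))]

    def column_modes(key):
        # On ties, Counter.most_common keeps the value seen first.
        return [Counter(col).most_common(1)[0][0]
                for col in zip(*(s[key] for s in samples))]

    return {
        "expr": column_means("expr"),
        "chem": column_means("chem"),
        "histo": column_modes("histo"),
    }


# Toy samples (invented values, not data from the study).
s1 = {"expr": [1.0, 2.0], "chem": [10.0], "histo": ["none", "mild"]}
s2 = {"expr": [3.0, 2.0], "chem": [14.0], "histo": ["none", "moderate"]}
s3 = {"expr": [2.0, 5.0], "chem": [12.0], "histo": ["none", "mild"]}

proto = cluster_prototype([s1, s2, s3])
d1 = mixed_dissimilarity(s1, proto)
d2 = mixed_dissimilarity(s2, proto)
```

Lowering one weight (for example `w_chem=0.5`) reduces the influence of that data domain on the assignment, which is the role the abstract gives to the separate weighting terms.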

    Double impact: what sibling data can tell us about the long-term negative effects of parental divorce

    Most prior research on the adverse consequences of parental divorce has analyzed only one child per family. As a result, it is not known whether the same divorce affects siblings differently. We address this issue by analyzing paired sibling data from the 1994 General Social Survey (GSS) and 1994 Survey of American Families (SAF). Both seemingly unrelated regressions and random effects models are used to study the effect of family background on offspring's educational attainment and marital stability. Parental divorce adversely affects the educational attainment and the probability of divorce of both children within a sibship; in other words, siblings tend to experience the same divorce the same way. However, family structure of origin accounts for only a trivial portion of the shared variance in offspring's educational attainment and marital stability, so parental divorce is only one of many factors determining how offspring fare. These findings were unchanged when controlling for a number of differences both between and within sibships. Also, the negative effects of parental divorce largely do not vary according to respondent characteristics.

    Spatial Effects of the Social Marketing of Insecticide-Treated Nets on Malaria Morbidity.

    Randomized controlled trials have shown that insecticide-treated nets (ITNs) have an impact on both malaria morbidity and mortality. Uniformly high coverage of ITNs characterized these trials, and this resulted in some protection of nearby non-users of ITNs. We have now assessed the coverage, distribution pattern and resultant spatial effects in one village in Tanzania where ITNs were distributed in a social marketing programme. The prevalence of parasitaemia, mild anaemia (Hb <11 g/dl) and moderate/severe anaemia (Hb <8 g/dl) in children under five was assessed cross-sectionally. Data on ownership of ITNs were collected and inhabitants' houses were mapped. One year after the start of the social marketing programme, 52% of the children were using a net which had been treated at least once. The ITNs were rather homogeneously distributed throughout the village at an average density of about 118 ITNs per thousand population. There was no evidence of a pattern in the distribution of parasitaemia and anaemia cases, but children living in areas of moderately high ITN coverage were about half as likely to have moderate/severe anaemia (OR 0.5, 95% CI: 0.2, 0.9) and had lower prevalence of splenomegaly, irrespective of their net use. No protective effects of coverage were found for prevalence of mild anaemia or for parasitaemia. The use of untreated nets had neither coverage nor short-distance effects. More efforts should be made to ensure high coverage in ITN programmes to achieve maximum benefit.

    Fractal geometry of spin-glass models

    Stability and diversity are two key properties that living entities share with spin glasses, where they are manifested through the breaking of the phase space into many valleys or local minima connected by saddle points. The topology of the phase space can be conveniently condensed into a tree structure, akin to the biological phylogenetic trees, whose tips are the local minima and internal nodes are the lowest-energy saddles connecting those minima. For the infinite-range Ising spin glass with p-spin interactions, we show that the average size-frequency distribution of saddles obeys a power law that scales as w^{-D}, where w = w(s) is the number of minima that can be connected through saddle s, and D is the fractal dimension of the phase space.
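
As an illustration of what a power law of the form w^{-D} implies, the following sketch recovers the exponent D from a size-frequency table by least squares on log-log axes. It is a generic, naive estimator, not the procedure used in the paper; the function name and the synthetic input are assumptions, and the frequencies are generated from an exact power law rather than taken from any spin-glass simulation.

```python
import math


def fit_power_law_exponent(sizes, freqs):
    """Estimate D in freq ~ size**(-D) by linear regression in log-log space."""
    xs = [math.log(w) for w in sizes]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Ordinary least-squares slope of log(freq) against log(size).
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    # A power law freq ~ size**(-D) has slope -D on log-log axes.
    return -slope


# Synthetic check: frequencies that follow w**(-1.5) exactly.
sizes = [1, 2, 4, 8, 16]
freqs = [w ** -1.5 for w in sizes]
estimated_d = fit_power_law_exponent(sizes, freqs)
```

On noisy empirical distributions a log-log regression can be biased, so this is best read as a demonstration of the scaling relation rather than a recommended estimator.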

    Non-compartment model to compartment model pharmacokinetics transformation meta-analysis – a multivariate nonlinear mixed model

    Background: In model-based drug development, the first step is usually to establish a model from the published literature. The pharmacokinetics model is the central piece of model-based drug development. This paper proposes an approach to transform published non-compartment model pharmacokinetics (PK) parameters into compartment model PK parameters. This meta-analysis was performed with a multivariate nonlinear mixed model. A conditional first-order linearization approach was developed for statistical estimation and inference. Results: Using MDZ as an example, we showed that this approach successfully transformed 6 non-compartment model PK parameters from 10 publications into 5 compartment model PK parameters. In simulation studies, we showed that this multivariate nonlinear mixed model had little relative bias (<1%) in estimating compartment model PK parameters if all non-compartment PK parameters were reported in every study. If some non-compartment PK parameters were missing from some publications, the relative bias of the compartment model PK parameters was still small (<3%). The 95% coverage probabilities of these PK parameter estimates were above 85%. Conclusions: This meta-analysis approach for transforming non-compartment model PK parameters into compartment model PK parameters possesses valid statistical inference. It can be routinely used for model-based drug development.

    A stitch in time: Efficient computation of genomic DNA melting bubbles

    Background: It is of biological interest to make genome-wide predictions of the locations of DNA melting bubbles using statistical mechanics models. Computationally, this poses the challenge that a generic search through all combinations of bubble starts and ends is quadratic. Results: An efficient algorithm is described, which shows that the time complexity of the task is O(N log N) rather than quadratic. The algorithm exploits the fact that bubble lengths may be limited, but without a prior assumption of a maximal bubble length. No approximations, such as windowing, have been introduced to reduce the time complexity. More than just finding the bubbles, the algorithm produces a stitch profile, which is a probabilistic graphical model of bubbles and helical regions. The algorithm applies a probability peak finding method based on a hierarchical analysis of the energy barriers in the Poland-Scheraga model. Conclusions: Exact and fast computation of genomic stitch profiles is thus feasible. Sequences of several megabases have been computed, limited only by computer memory. Possible applications are the genome-wide comparisons of bubbles with promoters, TSS, viral integration sites, and other melting-related regions. Comment: 16 pages, 10 figures.