97 research outputs found

    Review of the Maximum Likelihood Functions for Right Censored Data. A New Elementary Derivation.

    Censoring is a well-known and recurrent feature of lifetime data analysis: it occurs when exact lifetimes can be collected for only a portion of the surveyed individuals. If lifetimes are known only to exceed some given values, the censoring is called right censoring. In this paper we propose a systematization and a new derivation of the likelihood function for right-censored sampling schemes; calculations are reported and assumptions are carefully stated. The sampling schemes considered (Type I, Type II and Random Censoring) give rise to the same ML function. Only elementary probability theory, namely the definitions of order statistics and of the conditional probability distribution function, is required in the proofs. Lastly, we give an intuitive interpretation of Type I Censoring as a special case of Random Censoring, so that a global theory holds.
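    For orientation, the common likelihood that the abstract refers to is usually written in the standard textbook form below. The notation is an assumption made here, not taken from the paper: $t_i$ is the observed time for individual $i$, $\delta_i$ equals 1 for an observed lifetime and 0 for a right-censored one, $f$ is the lifetime density, $S$ the survival function, and $\theta$ the parameter vector. Proportionality is up to a factor that does not depend on $\theta$, under the usual assumption of independent, non-informative censoring.

```latex
\[
  L(\theta) \;\propto\; \prod_{i=1}^{n} f(t_i;\theta)^{\delta_i}\, S(t_i;\theta)^{1-\delta_i},
  \qquad S(t;\theta) = \int_t^{\infty} f(u;\theta)\,du .
\]
```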

    Crude Cumulative Incidence in the form of a Horvitz-Thompson like and Kaplan-Meier like Estimator

    The link between the nonparametric estimator of the crude cumulative incidence of a competing risk and the Kaplan-Meier estimator is exploited. The equivalence of the nonparametric crude cumulative incidence to an inverse-probability-of-censoring weighted average of the sub-distribution function is proved. The link between the estimation of crude cumulative incidence curves and Gray's family of nonparametric tests is considered. The crude cumulative incidence is proved to be a Kaplan-Meier-like estimator based on the sub-distribution hazard, i.e. the quantity on which Gray's family of tests is based. A standard probabilistic formalism is adopted to make the note accessible to applied statisticians.
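    The equivalence stated in this abstract can be checked numerically on toy data. The following is a minimal sketch, not the authors' code: the function names, the coding of causes (0 = censored, 1 and 2 = competing events) and the sample data are all hypothetical, and distinct observation times (no ties) are assumed to keep the bookkeeping simple.

```python
def km_left_limits(times, is_event):
    """Kaplan-Meier estimate just before each observed time (no ties assumed).

    Returns a dict mapping observation index -> left limit of the KM curve.
    """
    n = len(times)
    order = sorted(range(n), key=lambda i: times[i])
    surv = 1.0
    left = {}
    for rank, i in enumerate(order):
        left[i] = surv              # value of the curve just before times[i]
        at_risk = n - rank          # subjects still under observation
        if is_event[i]:
            surv *= 1.0 - 1.0 / at_risk
    return left


def cif_product_form(times, causes, cause, t):
    """Crude cumulative incidence: sum of S(t_j-) * d_j / n_j over event times <= t.

    S is the Kaplan-Meier estimate of overall survival (any cause counts as an event).
    """
    n = len(times)
    s_left = km_left_limits(times, [c != 0 for c in causes])
    order = sorted(range(n), key=lambda i: times[i])
    total = 0.0
    for rank, i in enumerate(order):
        if times[i] <= t and causes[i] == cause:
            total += s_left[i] / (n - rank)
    return total


def cif_ipcw(times, causes, cause, t):
    """Same quantity as an inverse-probability-of-censoring weighted average.

    G is the Kaplan-Meier estimate of the censoring distribution
    (censoring is treated as the 'event').
    """
    n = len(times)
    g_left = km_left_limits(times, [c == 0 for c in causes])
    weighted = sum(
        1.0 / g_left[i]
        for i in range(n)
        if times[i] <= t and causes[i] == cause
    )
    return weighted / n


# Hypothetical toy sample: 0 = censored, 1 and 2 = competing causes.
toy_times = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
toy_causes = [1, 0, 2, 1, 0, 1]

for horizon in (1.0, 4.0, 6.0):
    print(horizon,
          cif_product_form(toy_times, toy_causes, 1, horizon),
          cif_ipcw(toy_times, toy_causes, 1, horizon))
```

    The two columns printed for each horizon should coincide up to floating-point error, because without ties the product of the two Kaplan-Meier left limits equals the fraction of the sample still at risk.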

    Cell Polarity, Epithelial-Mesenchymal Transition, and Cell-Fate Decision Gene Expression in Ductal Carcinoma In Situ

    Loss of epithelial cell identity and acquisition of mesenchymal features are early events in the neoplastic transformation of mammary cells. We investigated the expression pattern of a selected panel of genes associated with cell polarity and the apical junction complex, or involved in TGF-β-mediated epithelial-mesenchymal transition and cell-fate decision, in a series of DCIS and corresponding patient-matched normal tissue. Additionally, we compared the DCIS gene profile with that of atypical ductal hyperplasia (ADH) from the same patient. Statistical analysis identified a “core” of genes differentially expressed in both precursors with respect to the corresponding normal tissue. These genes are mainly associated with a terminally differentiated luminal estrogen-dependent phenotype, in agreement with the model according to which ER-positive invasive breast cancer derives from ER-positive progenitor cells, and with an autocrine production of estrogens through androgen conversion. Although preliminary, the present findings provide transcriptomic confirmation that, at least for the panel of genes considered in the present study, ADH and DCIS are part of a tumorigenic multistep process. They also strongly suggest the need to regulate, possibly with aromatase inhibitors, the intratumoral and/or circulating concentration of biologically active androgens in DCIS patients, in order to hamper abnormal estrogen production in a timely manner and block estrogen-induced cell proliferation.

    Validation of Gene Expression Profiles in Genomic Data through Complementary Use of Cluster Analysis and PCA-Related Biplots

    High-throughput genomic assays are used in molecular biology to explore patterns of joint expression of thousands of genes. These methodologies have developed considerably in the last decade, creating a concurrent need for appropriate methods to analyze the massive data generated. Identifying sets of genes and samples characterized by similar expression values, and validating these results, are two critical issues in these investigations because of their clinical implications. From a statistical perspective, unsupervised class discovery methods such as Cluster Analysis are generally adopted. However, Cluster Analysis is mostly applied through hierarchical techniques, without considering other methods. This is partly due to software availability and to the ease of representing results through a heatmap, which shows the clustering of genes and samples simultaneously on the same graphical device. One drawback of this strategy is that cluster stability is often neglected, leading to over-interpretation of results. Moreover, validation of results using external datasets is still a subject of discussion, since it is well known that batch effects may condition gene expression results even after normalization. In this paper we compare several clustering algorithms (hierarchical, k-means, model-based, Affinity Propagation) and stability indices to discover common patterns of expression and to assess clustering reliability, and we propose a rank-based passive projection of Principal Components for validation purposes. Results are presented from a study involving 23 tumor cell lines and 76 genes related to a specific biological pathway, derived from a publicly available dataset.

    Cancer profiles by affinity propagation

    The affinity propagation algorithm is applied to a problem of breast cancer subtyping using traditional biologic markers. The algorithm provides a procedure to determine the number of profiles to be considered. A well-known breast cancer case series was used to compare the results of affinity propagation with those obtained with standard algorithms and indexes for the optimal choice of the number of clusters. Results from affinity propagation are consistent with the results already obtained, with the advantage of providing an indication of the number of clusters.

    Cancer profiles by Affinity Propagation

    The Affinity Propagation algorithm is applied to various problems of breast and cutaneous tumour subtyping using traditional biologic markers. The algorithm provides a procedure to determine the number of profiles to be considered. Well-known breast cancer and cutaneous melanoma case series were used to compare the results of Affinity Propagation with those obtained with standard algorithms and indexes for the optimal choice of the number of clusters. Results from Affinity Propagation are consistent with the results already obtained, with the advantage of providing an indication of the number of clusters.

    Clustering breast cancer data by consensus of different validity indices

    Clustering algorithms will, in general, either partition a given data set into a pre-specified number of clusters or produce a hierarchy of clusters. In this paper we analyse several different clustering techniques and apply them to a particular breast cancer data set. When the best number of groups is not known a priori, we use a range of different validity indices to test the quality of clustering results and to determine the best number of clusters. While for the K-means method the indices do not fully agree on the best number of clusters, for the PAM algorithm all the indices indicate 4 as the best number of clusters.
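    As an illustration of how a validity index can rank candidate partitions, the sketch below computes the mean silhouette width, one widely used index. It is an assumption that this index resembles those used in the paper, which the abstract does not name; the function name, the one-dimensional toy points and the two candidate labellings are hypothetical.

```python
def mean_silhouette(points, labels):
    """Mean silhouette width of 1-D points under a given labelling.

    Assumes at least two clusters; singleton clusters contribute 0 by convention.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # silhouette of a singleton is defined as 0
        # a: mean distance to the other members of the same cluster
        a = sum(abs(points[i] - points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(abs(points[i] - points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        total += (b - a) / max(a, b)
    return total / len(points)


# Hypothetical toy data with three visibly separated groups.
toy_points = [1.0, 1.2, 0.9, 5.0, 5.1, 4.8, 9.0, 9.2, 9.1]
two_clusters = [0, 0, 0, 1, 1, 1, 1, 1, 1]
three_clusters = [0, 0, 0, 1, 1, 1, 2, 2, 2]

print("k=2:", mean_silhouette(toy_points, two_clusters))
print("k=3:", mean_silhouette(toy_points, three_clusters))
```

    On this toy data the three-cluster labelling scores higher than the two-cluster one, so the index would point to three groups. A consensus approach of the kind the abstract describes would repeat the comparison with several indices and see whether they agree.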

    Risk of Venous Thromboembolism in Patients Nursed at Home or in Long-Term Care Residential Facilities

    Background. This study investigated the prevalence of deep venous thrombosis (DVT), and the impact of its risk factors, in patients with chronic diseases who were bedridden or had greatly limited mobility and were cared for at home or in long-term residential facilities. Methods. We enrolled 221 chronically ill patients, all over 18 years old, markedly or totally immobile, at home or in long-term care facilities. They were screened at the bedside by simplified compression ultrasound. Results. The prevalence of asymptomatic proximal DVT was 18% (95% CI 13–24%); there were no cases of symptomatic DVT or pulmonary embolism. The best model with at most four risk factors included: previous VTE, time of onset of reduced mobility, long-term residential care as opposed to home care and causes of reduced mobility. The risk of DVT for patients with reduced mobility due to cognitive impairment was about half that of patients with cognitive impairment/dementia. Conclusions. This is a first estimate of the prevalence of DVT among bedridden or low-mobility patients. Some of the risk factors that came to light, such as home care as opposed to long-term residential care and cognitive deficit as causes of reduced mobility, are not among those usually observed in acutely ill patients.
