102 research outputs found

    Three–way compositional data: a multi–stage trilinear decomposition algorithm

    The CANDECOMP/PARAFAC model is an extension of bilinear PCA and has been designed to model three-way data by preserving their multidimensional configuration. The Alternating Least Squares (ALS) procedure is the preferred estimating algorithm for this model because it guarantees stable results. It can, however, be slow at converging and sensitive to collinearity and over-factoring. Dealing with these issues is even more pressing when data are compositional and thus collinear by definition. In this talk the proposed solution is based on a multistage approach: parameters are first optimized with procedures that cope better with collinearity and over-factoring, namely ATLD and SWATLD, and the results are then refined with ALS.

    Analysis of Sentinel Node Biopsy and Clinicopathologic Features as Prognostic Factors in Patients With Atypical Melanocytic Tumors.

    BACKGROUND: Atypical melanocytic tumors (AMTs) include a wide spectrum of melanocytic neoplasms that represent a challenge for clinicians due to the lack of a definitive diagnosis and the related uncertainty about their management. This study analyzed clinicopathologic features and sentinel node status as potential prognostic factors in patients with AMTs. PATIENTS AND METHODS: Clinicopathologic and follow-up data of 238 children, adolescents, and adults with histologically proved AMTs consecutively treated at 12 European centers from 2000 through 2010 were retrieved from prospectively maintained databases. The binary association between all investigated covariates was studied by evaluating the Spearman correlation coefficients, and the association between progression-free survival and all investigated covariates was evaluated using univariable Cox models. The overall survival and progression-free survival curves were established using the Kaplan-Meier method. RESULTS: Median follow-up was 126 months (interquartile range, 104-157 months). All patients received an initial diagnostic biopsy followed by wide (1 cm) excision. Sentinel node biopsy was performed in 139 patients (58.4%), 37 (26.6%) of whom had sentinel node positivity. There were 4 local recurrences, 43 regional relapses, and 8 distant metastases as first events. Six patients (2.5%) died of disease progression. Five patients who were sentinel node-negative and 3 patients who were sentinel node-positive developed distant metastases. Ten-year overall and progression-free survival rates were 97% (95% CI, 94.9%-99.2%) and 82.2% (95% CI, 77.3%-87.3%), respectively. Age, mitotic rate/mm2, mitoses at the base of the lesion, lymphovascular invasion, and 9p21 loss were factors affecting prognosis in the whole series and the sentinel node biopsy subgroup. 
    CONCLUSIONS: Age >20 years, mitotic rate >4/mm2, mitoses at the base of the lesion, lymphovascular invasion, and 9p21 loss proved to be worse prognostic factors in patients with AMTs. Sentinel node status was not a clear prognostic predictor.
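
The Kaplan-Meier method named in the abstract above estimates a survival curve as a running product of (1 - events / patients at risk) over the observed event times. The following is only a minimal sketch under stated assumptions: the function name `kaplan_meier` is illustrative, and the follow-up times and event flags are invented toy values, not data from the study.

```python
import numpy as np

def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up time of each subject
    events : 1 if the event was observed at that time, 0 if censored
    Returns the distinct event times and the survival estimate at each.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    event_times = np.unique(times[events])  # sorted distinct event times
    survival = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(times >= t)          # subjects still under observation
        d = np.sum((times == t) & events)     # events occurring exactly at t
        s *= 1.0 - d / at_risk                # product-limit update
        survival.append(s)
    return event_times, np.array(survival)

# Toy data (hypothetical): six subjects, three observed events, three censored.
toy_times = [5, 8, 8, 12, 15, 20]
toy_events = [1, 0, 1, 1, 0, 0]
km_times, km_surv = kaplan_meier(toy_times, toy_events)
print(km_times, km_surv)
```

Censored subjects contribute to the at-risk count until their last follow-up but never trigger a drop in the curve, which is why the estimate only changes at observed event times.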

    An ATLD–ALS method for the trilinear decomposition of large third-order tensors

    CP decomposition of large third-order tensors can be computationally challenging. Parameters are typically estimated by means of the ALS procedure because it yields least-squares solutions and provides consistent outcomes. Nevertheless, ALS presents two major flaws which are particularly problematic for large-scale problems: slow convergence and sensitivity to degeneracy conditions such as over-factoring, collinearity, bad initialization and local minima. More efficient algorithms have been proposed in the literature. They are, however, much less dependable than ALS in delivering stable results because the increased speed often comes at the expense of accuracy. In particular, the ATLD procedure is one of the fastest alternatives, but it is rarely employed because of the unreliable nature of its convergence. As a solution, multi-optimization is proposed: ATLD and ALS steps are concatenated in an integrated procedure with the purpose of increasing efficiency without a significant loss in precision. This methodology has been implemented and tested under realistic conditions on simulated data sets.

    A procedure for the three-way analysis of compositions

    The Tucker3 model is one of the most widely used tools for factorial analysis of three-way data arrays. When orthogonal factors are extracted this model can be seen as a three-way PCA (principal component analysis). The Tucker3 model is characterized by extreme flexibility as it allows for the use of a different number of factors in each mode and it yields non-unique results. When this model is applied to vectors of non-negative values with a sum constraint all problems connected with the statistical analysis of compositions must be taken into consideration. Like other standard statistical techniques, this model cannot be directly applied. The aim of this paper is to present the theory behind the Tucker3 model on compositional data and to describe the TUCKALS3 algorithm

    A procedure for the three-mode analysis of compositions

    The Tucker3 model is one of the most widely used tools for factorial analysis of three-way data arrays. When orthogonal factors are extracted this model can be seen as a three-way PCA (principal component analysis). The Tucker3 model is characterized by extreme flexibility as it allows for the use of a different number of factors in each mode and it yields non-unique results. This adaptability makes the Tucker3 model extremely effective for decomposition and compression of data in many applications and fields. When this model is applied to vectors of non-negative values with a sum constraint all problems connected with the statistical analysis of compositions must be taken into consideration. Like other standard statistical techniques, this model cannot be directly applied. The aim of this paper is to present the theory behind the correct application of the Tucker3 model on compositional data and to describe the TUCKALS3 algorithm

    Detecting public social spending patterns in Italy using a three-way relative variation approach

    Studies on public social spending often fail to address the issues connected with budgetary constraints. Budget lines require public entities to partition resources among sectors of spending on the basis of preferred combinations and trade-offs. Standard exploratory tools cannot unveil this preference structure because they are hindered by the differences in budget scales and by the bounded nature of sector variability, i.e. an increase in one sector means a missed increase or a decrease in other sectors. In this work Italian public social spending is modeled with an alternative log-ratio methodology which makes it possible to study relative variation patterns among sectors. Since the data are collected across time, a three-way approach is recommended so that the variability of each mode is kept separate.

    Improving PARAFAC-ALS performance by initialization

    The CANDECOMP/PARAFAC (CP) model (Carroll and Chang, 1970; Harshman, 1970) is a trilinear decomposition which provides a low rank approximation of a three-way array in a manner that preserves the multi-mode structure of the data. This is achieved by estimating three sets of parameters, one for each dimension of the array, namely observation units, variables and occasions. The CP model, however, due to an elevated number of degrees of freedom, can be quite challenging to estimate. The most commonly used algorithm to fit this model to the data is PARAFAC-ALS. Comparative studies (Tomasi and Bro, 2006) have shown that this procedure is, in general, more reliable and accurate than other algorithms proposed in the literature. Nonetheless, it presents some non-trivial issues: it can be slow at converging and may run into over-factoring and bad initialization degeneracies. With respect to these setbacks, some of the alternative estimating procedures are able to perform better than ALS, specifically the Alternating Trilinear Decomposition (ATLD) and Self-weighted Alternating Trilinear Decomposition (SWATLD) proposed by Wu et al. (1998) and Chen et al. (2000) respectively. These algorithms are faster and less likely to be affected by over-factoring and bad initial values. They present, however, difficulties connected to their non-least squares objective functions and for this reason they are seldom used in practice. In this work it is suggested that a successful way to improve on ALS performance with respect to the presented drawbacks is to initialize it with either ATLD or SWATLD steps, obtaining two integrated ALS procedures. The effectiveness of this methodology is demonstrated by comparing the results of standard ALS with the ones of the proposed integrated ALS variants in an extensive simulation design.
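
To make the alternating least squares idea in the abstract above concrete, here is a minimal NumPy sketch of a plain CP-ALS loop for a three-way array. It is an illustration under stated assumptions, not the authors' implementation: the names `khatri_rao`, `reconstruct` and `cp_als` are hypothetical, the starting values are random (the ATLD/SWATLD initialization proposed in the abstract is not reproduced), and there is no convergence check beyond a fixed iteration count.

```python
import numpy as np

def khatri_rao(P, Q):
    """Column-wise Kronecker product; rows are indexed by (p, q) with q fastest."""
    n_p, r = P.shape
    n_q, _ = Q.shape
    return (P[:, None, :] * Q[None, :, :]).reshape(n_p * n_q, r)

def reconstruct(A, B, C):
    """Trilinear model: X[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r]."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)

def cp_als(X, rank, n_iter=200, seed=0):
    """Plain ALS: update each loading matrix by least squares, others fixed."""
    I, J, K = X.shape
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((I, rank))  # random start (an assumption)
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))
    # Unfold the array along each mode so every update is a matrix problem.
    X1 = X.reshape(I, J * K)
    X2 = np.moveaxis(X, 1, 0).reshape(J, I * K)
    X3 = np.moveaxis(X, 2, 0).reshape(K, I * J)
    for _ in range(n_iter):
        A = X1 @ np.linalg.pinv(khatri_rao(B, C).T)
        B = X2 @ np.linalg.pinv(khatri_rao(A, C).T)
        C = X3 @ np.linalg.pinv(khatri_rao(A, B).T)
    return A, B, C

# Demo on a noise-free rank-2 array built from random loadings.
rng = np.random.default_rng(1)
true_factors = [rng.standard_normal((n, 2)) for n in (6, 5, 4)]
X = reconstruct(*true_factors)
A, B, C = cp_als(X, rank=2)
rel_error = np.linalg.norm(X - reconstruct(A, B, C)) / np.linalg.norm(X)
print(f"relative reconstruction error: {rel_error:.2e}")
```

Because each update is an exact least-squares solution for one loading matrix, the fit cannot get worse from one step to the next; how quickly it improves depends on the starting values, which is the motivation for the integrated initialization described in the abstract.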

    Statistical tools for student evaluation of academic educational quality

    Measuring academic educational quality presents three major difficulties, typical of all customer satisfaction and service quality studies: the use of subjective scales; the ordinal nature of the data; and the multifold structure of satisfaction. In order to solve these problems, principal component analysis (PCA) of compositional data is proposed in this work. The core idea behind this methodology is to analyze by PCA the relative information within the data rather than focusing on absolute scores. This approach is discussed in comparison with a widely used Item Response Theory method (the Partial Credit Model) in order to assess its merits, e.g. always identifying a coherent preference structure. Both procedures were, thus, carried out on a real dataset collected with the 2013/14 ANVUR questionnaire by L'Università di Napoli-L'Orientale.

    Partial Least Squares for Compositional Canonical Correlation

    Compositional data are quantitative descriptions of the parts of some whole, conveying relative information. The relationship between two sets of compositional descriptors can be explored by use of Canonical Correlation analysis with a procedure based on Partial Least Squares (PLS). This method offers a way to deal with matrix singularity in an efficient fashion and presents the further advantage of being easy to interpret. In order to fully explore the potential of PLS for analyzing the relationships between two sets of compositions, the performances of the NIPALS, SIMPLS and Kernel algorithms are compared on simulated data
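
The matrix singularity mentioned in the abstract above can be illustrated with the centered log-ratio (clr) transform, a standard way of expressing the relative information in a composition. The sketch below is an assumption-laden illustration, not the procedure of the paper: the function names `closure` and `clr` and the simulated four-part data are hypothetical. It shows that clr coordinates sum to zero in every row, so their covariance matrix is rank-deficient.

```python
import numpy as np

def closure(x):
    """Rescale each row so that its parts sum to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

def clr(x):
    """Centered log-ratio: log of each part minus the row mean of the logs."""
    logx = np.log(closure(x))
    return logx - logx.mean(axis=-1, keepdims=True)

# Simulated positive data with 4 parts (hypothetical values).
rng = np.random.default_rng(0)
raw = rng.uniform(1.0, 10.0, size=(50, 4))

Z = clr(raw)
cov = np.cov(Z, rowvar=False)
cov_rank = np.linalg.matrix_rank(cov)

print("max |row sum| of clr data:", np.abs(Z.sum(axis=1)).max())
print("covariance shape:", cov.shape, "rank:", cov_rank)
```

Since the 4 x 4 covariance matrix of clr data has rank of at most 3, it cannot be inverted directly, which is one reason an approach such as PLS that avoids the inversion is attractive here.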