
    Comparison of Similarity Measures for Trajectory Clustering - Aviation Use Case

    Various distance-based clustering algorithms have been reported, but the core component of all of them is a similarity or distance measure used to classify the data. Rather than prioritizing a comparison of the performance of different clustering algorithms, it may be worthwhile to analyze how different similarity measures influence the results of clustering algorithms. The main contribution of this work is a comparative study of the impact of 9 similarity measures on similarity-based trajectory clustering using the DBSCAN algorithm on a commercial flight dataset. The novelty of this comparison lies in exploring the robustness of the clustering algorithm with respect to the algorithm parameter. We evaluate the accuracy of clustering, the accuracy of anomaly detection, and algorithmic efficiency, and we determine the behavior profile of each measure. We show that DTW and Frechet distance lead to the best clustering results, while LCSS and Hausdorff Cosine should be avoided for this task.
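    To make the two best-performing measures named above concrete, the following is a minimal sketch of DTW and the discrete Frechet distance between two trajectories, plus a pairwise distance matrix of the kind a similarity-based clustering step consumes. It is not the authors' implementation: the function names, the representation of a trajectory as a list of coordinate tuples, and the choice of Euclidean point distance are all assumptions made for illustration.

```python
import math

def dtw(a, b):
    """Dynamic time warping cost between trajectories a and b.

    Assumption: each trajectory is a list of coordinate tuples and the
    point-to-point cost is the Euclidean distance.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def discrete_frechet(a, b):
    """Discrete Frechet distance: the smallest achievable maximum gap
    over all monotone couplings of the two point sequences."""
    n, m = len(a), len(b)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(a[i], b[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                ca[i][j] = max(min(ca[i - 1][j], ca[i][j - 1], ca[i - 1][j - 1]), d)
    return ca[n - 1][m - 1]

def distance_matrix(trajectories, measure):
    """Symmetric pairwise distance matrix under the given measure."""
    k = len(trajectories)
    mat = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            mat[i][j] = mat[j][i] = measure(trajectories[i], trajectories[j])
    return mat

# Two parallel tracks one unit apart, and a resampled copy of the first.
track_a = [(0, 0), (1, 0), (2, 0)]
track_b = [(0, 1), (1, 1), (2, 1)]
track_a_resampled = [(0, 0), (0, 0), (1, 0), (2, 0)]
```

    DTW sums the point gaps along the best alignment, whereas the Frechet distance keeps only the largest gap, so the two measures rank trajectory pairs differently. A matrix built this way can be handed to any DBSCAN implementation that accepts precomputed distances, for example scikit-learn's `DBSCAN` with `metric="precomputed"`.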

    Techniques for clustering gene expression data

    Many clustering techniques have been proposed for the analysis of gene expression data obtained from microarray experiments. However, choosing a suitable method for a given experimental dataset is not straightforward. Common approaches do not translate well and fail to take account of the data profile. This review paper surveys state-of-the-art applications that recognise these limitations and implement procedures to overcome them. It provides a framework for the evaluation of clustering in gene expression analyses. The nature of microarray data is discussed briefly. Selected examples are presented for the clustering methods considered.

    QCD-aware partonic jet clustering for truth-jet flavour labelling

    We present an algorithm for deriving partonic flavour labels to be applied to truth particle jets in Monte Carlo event simulations. The inputs to this approach are the final pre-hadronization partons, which removes dependence on unphysical details such as the order of the matrix element calculation and the shower generator's frame recoil treatment. These are clustered using standard jet algorithms, modified to restrict the allowed pseudojet combinations to those in which the tracked flavour labels are consistent with QCD and QED Feynman rules. The resulting algorithm is shown to be portable between the major families of shower generators and largely insensitive to many possible systematic variations; it hence offers significant advantages over existing ad hoc labelling schemes. However, it is shown that contamination from multi-parton scattering simulations can disrupt the labelling results. Suggestions are made for further extension to incorporate more detailed QCD splitting function kinematics, robustness improvements, and potential uses for truth-level physics object definitions and tagging.
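    The idea of restricting pseudojet combinations by flavour can be illustrated with a toy veto function. This is only a sketch under stated assumptions and not the algorithm of the paper: the function `combine_flavour`, the use of PDG particle codes as labels, and the specific rule set (in particular labelling a same-flavour quark-antiquark pair as a gluon rather than a photon) are illustrative choices.

```python
GLUON = 21   # PDG code for the gluon
PHOTON = 22  # PDG code for the photon

def is_quark(pid):
    """True for quark or antiquark PDG codes (d, u, s, c, b, t and antiparticles)."""
    return 1 <= abs(pid) <= 6

def combine_flavour(a, b):
    """Return the flavour label of the merged pseudojet, or None if the
    pair does not correspond to an allowed splitting read backwards.

    Assumed toy rule set (hypothetical, for illustration only):
      g + g        -> g   (reverse of g -> g g)
      q + g        -> q   (reverse of q -> q g)
      q + photon   -> q   (reverse of q -> q photon)
      q + qbar     -> g   (reverse of g -> q qbar, same flavour only)
    """
    if a == GLUON and b == GLUON:
        return GLUON
    for x, y in ((a, b), (b, a)):
        if is_quark(x) and y in (GLUON, PHOTON):
            return x
    if is_quark(a) and b == -a:
        return GLUON
    return None  # e.g. u + d, or u + dbar: no single-vertex parent exists
```

    In a sequential recombination loop, a candidate pair for which this function returns `None` would simply be skipped, so the clustering only ever merges pseudojets whose labels admit a common parent.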