Comparison of Similarity Measures for Trajectory Clustering - Aviation Use Case
Various distance-based clustering algorithms have been reported, but the core component of all of them is a similarity or distance measure for classification of data. Rather than prioritizing a comparison of the performance of different clustering algorithms, it may be worthwhile to analyze the influence of different similarity measures on the results of clustering algorithms. The main contribution of this work is a comparative study of the impact of 9 similarity measures on similarity-based trajectory clustering using the DBSCAN algorithm on a commercial flight dataset. The novelty of this comparison is that it explores the robustness of the clustering algorithm with respect to the algorithm parameter. We evaluate the accuracy of clustering, the accuracy of anomaly detection, and algorithmic efficiency, and we determine the behavior profile for each measure. We show that DTW and Fréchet distance lead to the best clustering results, while LCSS and Hausdorff Cosine should be avoided for this task.
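To make the idea of similarity-based trajectory clustering concrete, the following is a minimal sketch, not the authors' implementation. It computes DTW and discrete Fréchet distances between toy 2-D trajectories and feeds the resulting pairwise matrix to a small DBSCAN written for precomputed distances. The trajectories, the function names, and the parameter values `eps` and `min_pts` are all assumptions chosen for illustration.

```python
import math

def dtw_distance(a, b):
    """Dynamic time warping distance between two point sequences."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def discrete_frechet(a, b):
    """Discrete Frechet distance: smallest achievable maximum point gap."""
    n, m = len(a), len(b)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(a[i], b[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                ca[i][j] = max(min(ca[i - 1][j], ca[i][j - 1], ca[i - 1][j - 1]), d)
    return ca[n - 1][m - 1]

def pairwise(trajectories, measure):
    """Symmetric matrix of distances under the given measure."""
    n = len(trajectories)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = measure(trajectories[i], trajectories[j])
    return dist

def dbscan_precomputed(dist, eps, min_pts):
    """Plain DBSCAN on a distance matrix; label -1 marks noise."""
    n = len(dist)
    labels = [None] * n  # None means "not visited yet"
    cluster = -1
    for p in range(n):
        if labels[p] is not None:
            continue
        neighbours = [q for q in range(n) if dist[p][q] <= eps]
        if len(neighbours) < min_pts:  # count includes p itself
            labels[p] = -1
            continue
        cluster += 1
        labels[p] = cluster
        queue = [q for q in neighbours if q != p]
        while queue:
            q = queue.pop()
            if labels[q] == -1:
                labels[q] = cluster  # former noise becomes a border point
            if labels[q] is not None:
                continue
            labels[q] = cluster
            q_neighbours = [r for r in range(n) if dist[q][r] <= eps]
            if len(q_neighbours) >= min_pts:  # q is a core point: expand
                queue.extend(q_neighbours)
    return labels

def flat(y):
    """Toy trajectory: four points on a horizontal line at height y."""
    return [(float(x), y) for x in range(4)]

# Two bundles of similar toy tracks plus one outlier (invented data).
trajectories = [flat(0.0), flat(0.1), flat(-0.1),
                flat(10.0), flat(10.1), flat(9.9),
                [(0.0, 50.0), (1.0, 55.0), (2.0, 60.0), (3.0, 65.0)]]

dtw_matrix = pairwise(trajectories, dtw_distance)
labels = dbscan_precomputed(dtw_matrix, eps=1.0, min_pts=2)
print(labels)
```

Swapping `dtw_distance` for `discrete_frechet` in the call to `pairwise` changes only the similarity measure while leaving the clustering step untouched, which is the kind of comparison the abstract describes. The value of `eps` would need to be rescaled for each measure, since DTW sums point gaps while Fréchet takes their maximum.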
Techniques for clustering gene expression data
Many clustering techniques have been proposed for the analysis of gene expression data obtained from microarray experiments. However, the choice of suitable method(s) for a given experimental dataset is not straightforward. Common approaches do not translate well and fail to take account of the data profile. This review paper surveys state-of-the-art applications that recognise these limitations and implement procedures to overcome them. It provides a framework for the evaluation of clustering in gene expression analyses. The nature of microarray data is discussed briefly. Selected examples are presented for the clustering methods considered.
QCD-aware partonic jet clustering for truth-jet flavour labelling
We present an algorithm for deriving partonic flavour labels to be applied to
truth particle jets in Monte Carlo event simulations. The inputs to this
approach are final pre-hadronization partons, to remove dependence on
unphysical details such as the order of matrix element calculation and shower
generator frame recoil treatment. These are clustered using standard jet
algorithms, modified to restrict the allowed pseudojet combinations to those in
which tracked flavour labels are consistent with QCD and QED Feynman rules. The
resulting algorithm is shown to be portable between the major families of
shower generators, and largely insensitive to many possible systematic
variations: it hence offers significant advantages over existing ad hoc
labelling schemes. However, it is shown that contamination from multi-parton
scattering simulations can disrupt the labelling results. Suggestions are made
for further extension to incorporate more detailed QCD splitting function
kinematics, robustness improvements, and potential uses for truth-level physics
object definitions and tagging.