A Comparative Study of Dimensionality Reduction Techniques to Enhance Trace Clustering Performances

Author: Yang Hanna
Publication venue: Graduate School of UNIST, Master thesis
Publication date: 01/08/2012
Field of study

Technology Management/ Information System/ Entrepreneurship

Process mining aims at extracting useful information from event logs. Recently, several organizations such as high-tech companies, hospitals, and municipalities have begun to use process mining techniques to improve their processes. Real-life process logs from such organizations are usually very large and complicated, since they generally contain numerous activities executed by many employees. Furthermore, many real-life process logs yield spaghetti-like process models due to the complexity of the processes. Traditional process mining techniques have problems with discovering and analyzing real-life process logs that come from less structured processes. To overcome these weaknesses, trace clustering has been developed. Trace clustering splits an event log into several subsets, each of which contains homogeneous cases. Even though trace clustering is useful for handling complex process logs, it is time-consuming and computationally expensive due to the large number of features generated from complex logs. In this thesis, we applied dimensionality reduction (preprocessing) techniques to trace clustering in order to reduce the number of features. To validate our approach, we conducted experiments to discover relationships between dimensionality reduction techniques and clustering algorithms, and we performed a case study involving patient treatment processes of a hospital. Among many dimensionality reduction techniques, we used three: singular value decomposition (SVD), random projection, and principal components analysis (PCA). The results show that trace clustering with dimensionality reduction techniques produces higher average fitness values. Furthermore, the processing time of trace clustering is effectively reduced with dimensionality reduction techniques. Moreover, we measured the similarity between clustering results to observe the degree of change in clustering results when applying dimensionality reduction techniques. The similarity differs depending on the clustering algorithm used.

ScholarWorks@UNIST

Alignment-based trace clustering

Author: Carmona Vargas Josep
Chatain Thomas
Dongen Boudewijn F. van
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2017

A novel method to cluster event log traces is presented in this paper. In contrast to the approaches in the literature, the clustering approach of this paper assumes an additional input: a process model that describes the current process. The core idea of the algorithm is to use model traces as centroids of the clusters detected, computed from a generalization of the notion of alignment. This way, model explanations of observed behavior are the driving force to compute the clusters, instead of current model-agnostic approaches, which, e.g., group log traces merely by their vector-space similarity. We believe alignment-based trace clustering provides results more useful for stakeholders. Moreover, in case of log incompleteness, noisy logs or concept drift, it can be more robust in dealing with highly deviating traces. The technique of this paper can be combined with any clustering technique to provide model explanations to the clusters computed. The proposed technique relies on encoding the individual alignment problems into the (pseudo-)Boolean domain, and has been implemented in our tool DarkSider that uses an open-source solver.

Peer Reviewed. Postprint (author's final draft)

Repository TU/e

UPCommons. Portal del coneixement obert de la UPC

Pure OAI Repository

INRIA a CCSD electronic archive server

Angular power spectrum of gamma-ray sources for GLAST: blazars and clusters of galaxies

Author: Ando Shin'ichiro
Komatsu Eiichiro
Narumoto Takuro
Totani Tomonori
Publication venue: Wiley
Publication date: 05/10/2006

Blazars, a beamed population of active galactic nuclei, radiate high-energy gamma-rays, and thus are a good target for the Gamma Ray Large Area Space Telescope (GLAST). As the blazars trace the large-scale structure of the universe, one may observe spatial clustering of blazars. We calculate the angular power spectrum of blazars that would be detected by GLAST. We show that we have the best chance of detecting their clustering at large angular scales, \theta >~ 10 deg, where shot noise is less important, and the dominant contribution to the correlation comes from relatively low redshift, z <~ 0.1. GLAST can detect the correlation signal if the blazars detected by GLAST trace the distribution of low-z quasars observed by optical galaxy surveys, which have a bias of unity. If the bias of blazars is greater than 1.5, GLAST will detect the correlation signal unambiguously. We also find that GLAST may detect spatial clustering of clusters of galaxies in gamma-rays. The shape of the angular power spectrum is different for blazars and clusters of galaxies; thus, we can separate these two contributions on the basis of the shape of the power spectrum.

Comment: 14 pages, 10 figures; added references; accepted by MNRAS

arXiv.org e-Print Archive

Caltech Authors

CERN Document Server

Large-scale structure in a new deep IRAS galaxy redshift survey

Author: Broadhurst T J
Conrow T
Hacking P
Lawrence A
Lonsdale C J
McMahon R G
Oliver S
Rowan-Robinson M
Saunders W
Taylor A
Publication venue: Monthly Notices of the Royal Astronomical Society
Publication date: 01/01/1995

We present here the first results from two recently completed, fully sampled redshift surveys comprising 3703 IRAS Faint Source Survey (FSS) galaxies. An unbiased counts-in-cells analysis finds a clustering strength in broad agreement with other recent redshift surveys and at odds with the standard cold dark matter model. We combine our data with those from the QDOT and 1.2 Jy surveys, producing a single estimate of the IRAS galaxy clustering strength. We compare the data with the power spectrum derived from a mixed dark matter universe. Direct comparison of the clustering strength seen in the IRAS samples with that seen in the APM-Stromlo survey suggests b_O/b_I=1.20+/-0.05 assuming a linear, scale-independent biasing. We also perform a cell-by-cell comparison of our FSS-z sample with galaxies from the first CfA slice, testing the viability of a linear-biasing scheme linking the two. We are able to rule out models in which the FSS-z galaxies identically trace the CfA galaxies on scales 5-20h^{-1}Mpc. On scales of 5 and 10h^{-1}Mpc no linear-biasing model can be found relating the two samples. We argue that this result is expected since the CfA sample includes more elliptical galaxies, which have different clustering properties from spirals. On scales of 20h^{-1}Mpc no linear-biasing model with b_O/b_I < 1.70 is acceptable. When comparing the FSS-z galaxies to the CfA spirals, however, the two populations trace the same structures within our uncertainties.

CiteSeerX

Sussex Research Online

Clustering of MgII absorption line systems around massive galaxies: an important constraint on feedback processes in galaxy formation

Author: Kauffmann Guinevere
Menard Brice
Nelson Dylan
Zhu Guangtun
Publication venue: Oxford University Press (OUP)
Publication date: 14/03/2017

We use the latest version of the metal line absorption catalogue of Zhu & Ménard (2013) to study the clustering of MgII absorbers around massive galaxies (~10^11.5 M_sun), quasars and radio-loud AGN with redshifts between 0.4 and 0.75. Clustering is evaluated in two dimensions, by binning absorbers both in projected radius and in velocity separation. Excess MgII is detected around massive galaxies out to R_p=20 Mpc. At projected radii less than 800 kpc, the excess extends out to velocity separations of 10,000 km/s. The extent of the high velocity tail within this radius is independent of the mean stellar age of the galaxy and whether or not it harbours an active galactic nucleus. We interpret our results using the publicly available Illustris and Millennium simulations. Models where the MgII absorbers trace the dark matter particle or subhalo distributions do not fit the data. They overpredict the clustering on small scales and do not reproduce the excess high velocity separation MgII absorbers seen within the virial radius of the halo. The Illustris simulations, which include thermal but not mechanical feedback from AGN, also do not provide an adequate fit to the properties of the cool halo gas within the virial radius. We propose that the large velocity separation MgII absorbers trace gas that has been pushed out of the dark matter halos, possibly by multiple episodes of AGN-driven mechanical feedback acting over long timescales.

Comment: 10 pages, 11 figures, accepted in MNRAS

arXiv.org e-Print Archive

MPG.PuRe