Mining complete, precise and simple process models

Author: Vázquez Barreiros Borja
Publication date: 01/01/2017

Process discovery algorithms are generally used to discover the underlying process that has been followed to achieve an objective. In general, these algorithms do not take any domain knowledge into account when deriving process models, which allows them to be applied in a general manner. However, depending on the selected approach, different kinds of process models can be discovered, as each technique has its strengths and weaknesses, e.g., the expressiveness of the notation used. Hence, it is important to take the requirements of the domain into account when deciding which algorithm to use, as the correct assumptions can lead to richer process models. For instance, among the different application domains of process mining, we can identify several fields that share an interesting requirement about the discovered process models. In security audits, discovered processes have to fulfill strict requirements. This means that the process model should reproduce as much behavior as possible; otherwise some violations may go undetected (replay fitness). On the other hand, in order to avoid false positives, process models should reproduce only the recorded behavior (precision). Finally, process models should be easily readable to better detect deviations (simplicity). Another clear example concerns the educational domain: to be of value for both teachers and learners, a discovered learning process should satisfy the aforementioned requirements. That is, to guarantee feasible and correct evaluations, teachers need access to all the activities performed by learners, so the learning process should be able to reproduce as much behavior as possible (replay fitness). Furthermore, the learning process should focus on the recorded behavior seen in the event log (precision), i.e., show only what the students did, and not what they might have done, while being easily interpretable by the teachers (simplicity).
One of the previous requirements is related to the readability of process models: simplicity. In process mining, one of the identified challenges is the appropriate visualization of process models, i.e., presenting the results of process discovery in such a way that people actually gain insights about the process. Process models that are unnecessarily complex can obscure the real behavior of the process rather than providing an intuition of what is really happening in an organization. However, achieving a good level of readability is not always straightforward, for instance, due to the representation used. Among the different approaches focused on reducing the complexity of a process model, the interest of this PhD Thesis lies in two techniques. On the one hand, improving the readability of an already discovered process model through the inclusion of duplicate labels. On the other hand, the hierarchization of a process model, i.e., providing a well-known structure to the process model. However, the latter technique requires taking domain knowledge into account, as different domains may rely on different requirements when improving the readability of the process model. In other words, in order to improve the interpretability and understandability of a process model, the hierarchization has to be driven by the domain. To sum up, concerning the aim of this PhD Thesis, we can identify two main topics of interest. On the one hand, we are interested in retrieving process models that reproduce as much of the behavior recorded in the log as possible, without introducing unseen behavior. On the other hand, we try to reduce the complexity of the mined models in order to improve their readability. Hence, the aim of this PhD Thesis is to discover process models considering replay fitness, precision and simplicity, while paying special attention to retrieving highly interpretable process models.

Evolving temporal association rules with genetic algorithms

Author: A. Ghandar
C.-Y. Chang
F. Herrera
J. Alcala-Fdez
J.H. Holland
K.A. De Jong
P.-N. Tan
R. Agrawal
S. Laxman
X. Yan
Y. Li
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2010

A novel framework for mining temporal association rules by discovering itemsets with a genetic algorithm is introduced. Metaheuristics have been applied to association rule mining; we show the efficacy of extending this to another variant: temporal association rule mining. Our framework is an enhancement to existing temporal association rule mining methods, as it employs a genetic algorithm to simultaneously search the rule space and the temporal space. A methodology for validating the ability of the proposed framework isolates target temporal itemsets in synthetic datasets. The Iterative Rule Learning method successfully discovers these targets in datasets with varying levels of difficulty.

CiteSeerX

Inaccurate, Costly, and Inefficient: Evidence That America's Voter Registration System Needs an Upgrade

Publication venue: Pew Center on the States
Publication date: 02/02/2012

Outlines the cost of inaccurate or multiple registrations, registrations of the deceased, and unregistered eligible voters, as well as plans to update lists via data comparison with other sources, proven data-matching techniques, and expanded online registration.

Is There an App for That? Electronic Health Records (EHRs) and a New Environment of Conflict Prevention and Resolution

Author: Dullabh Prashila
Katsh Ethan
Sondheimer Norman
Stromberg Samuel
Publication venue: Duke University School of Law
Publication date: 01/07/2011

Katsh discusses the new problems that are a consequence of a new technological environment in healthcare, one that has an array of elements that make the emergence of disputes likely. Novel uses of technology have already addressed both the problem and its source in other contexts, such as e-commerce, where large numbers of transactions have generated large numbers of disputes. If technology-supported healthcare is to improve the field of medicine, a similar effort at dispute prevention and resolution will be necessary.