Process Mining of Programmable Logic Controllers: Input/Output Event Logs
This paper presents an approach to model an unknown Ladder Logic based
Programmable Logic Controller (PLC) program consisting of Boolean logic and
counters using Process Mining techniques. First, we tap the inputs and outputs
of a PLC to create a data flow log. Second, we propose a method to translate
the obtained data flow log to an event log suitable for Process Mining. In a
third step, we propose a hybrid Petri net (PN) and neural network approach to
approximate the logic of the actual underlying PLC program. We demonstrate the
applicability of our proposed approach on a case study with three simulated
scenarios
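
The second step above, translating a data flow log into an event log, can be illustrated with a minimal sketch. It assumes the data flow log is a list of timestamped snapshots of Boolean signals and that every value change becomes one event. The signal names (`I0`, `Q0`), the `rise`/`fall` labels, and the function `to_event_log` are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

Snapshot = Tuple[float, Dict[str, bool]]  # (timestamp, signal name -> value)
Event = Tuple[float, str]                 # (timestamp, activity label)


def to_event_log(snapshots: List[Snapshot]) -> List[Event]:
    """Emit one event per signal change between consecutive snapshots.

    Assumption: an edge-based encoding (rise/fall) is adequate; the paper's
    actual translation method may differ.
    """
    events: List[Event] = []
    previous = None
    for timestamp, signals in snapshots:
        if previous is not None:
            for name in sorted(signals):
                if name in previous and signals[name] != previous[name]:
                    edge = "rise" if signals[name] else "fall"
                    events.append((timestamp, f"{name}_{edge}"))
        previous = signals
    return events


# Hypothetical tap of one input (I0) and one output (Q0).
samples = [
    (0.0, {"I0": False, "Q0": False}),
    (0.1, {"I0": True, "Q0": False}),
    (0.2, {"I0": True, "Q0": True}),
    (0.3, {"I0": False, "Q0": True}),
]
event_log = to_event_log(samples)
```

The resulting list of (timestamp, activity) pairs is the kind of input a process discovery algorithm expects once case identifiers are assigned.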
On the role of pre and post-processing in environmental data mining
The quality of discovered knowledge depends heavily on data quality. Unfortunately, real data tend to contain noise, uncertainty, errors, redundancies or even irrelevant information. The more complex the reality to be analyzed, the higher the risk of obtaining low-quality data. Knowledge Discovery from Databases (KDD) offers a global framework to prepare data in the right form to perform correct analyses. On the other hand, the quality of decisions based on KDD results depends not only on the quality of the results themselves, but also on the capacity of the system to communicate those results in an understandable form. Environmental systems are particularly complex, and environmental users particularly require clarity in their results. This paper provides some details on how this can be achieved and discusses the role of pre- and post-processing in the whole process of Knowledge Discovery in environmental systems
Call Graph Evolution Analytics over a Version Series of an Evolving Software System
Call Graph evolution analytics can aid a software engineer when maintaining
or evolving a software system. This paper proposes Call Graph Evolution
Analytics to extract information from an evolving call graph ECG = CG_1,
CG_2,... CG_N for their version series VS = V_1, V_2, ... V_N of an evolving
software system. This is done using Call Graph Evolution Rules (CGERs) and Call
Graph Evolution Subgraphs (CGESs). Similar to association rule mining, the
CGERs are used to capture co-occurrences of dependencies in the system. Like
subgraph patterns in a call graph, the CGESs are used to capture evolution of
dependency patterns in evolving call graphs. Call graph analytics on the
evolution in these patterns can identify potentially affected dependencies (or
procedure calls) that need attention. The experiments are done on the evolving
call graphs of 10 large evolving systems to support dependency evolution
management. We also consider results from a detailed study for evolving call
graphs of Maven-Core's version series
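
The analogy with association rule mining can be made concrete with a small sketch. It assumes each version is represented as a set of (caller, callee) edges and mines pairwise co-occurrence rules using support and confidence. The function `cooccurrence_rules`, its thresholds, and the toy data are hypothetical; the paper's CGERs may be defined differently.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_rules(versions, min_support=0.5, min_confidence=0.8):
    """Return rules (lhs, rhs, support, confidence) over call edges.

    support    = fraction of versions containing both edges
    confidence = versions with both edges / versions with lhs
    """
    n = len(versions)
    single = Counter()
    pair = Counter()
    for edges in versions:
        single.update(edges)
        pair.update(combinations(sorted(edges), 2))
    rules = []
    for (a, b), count in pair.items():
        if count / n < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / single[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, count / n, confidence))
    return sorted(rules)


# Hypothetical version series with four call graphs.
versions = [
    {("A", "B"), ("A", "C")},
    {("A", "B"), ("A", "C"), ("B", "D")},
    {("A", "B"), ("B", "D")},
    {("A", "B"), ("A", "C")},
]
rules = cooccurrence_rules(versions)
```

In this toy series, every version that contains the call A -> C also contains A -> B, so a change to one dependency flags the other for attention.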
Molecular Model of Dynamic Social Network Based on E-mail communication
In this work we consider an application of a physically inspired sociodynamical model to the modelling of the evolution of an email-based social network. Contrary to the standard approach of sociodynamics, which assumes that the system dynamics are expressed with heuristically defined simple rules, we postulate the inference of these rules from real data and their application within a dynamic molecular model. We present how to embed the n-dimensional social space in a Euclidean one. Then, inspired by the Lennard-Jones potential, we define a data-driven social potential function and apply the resultant force to a real e-mail communication network in the course of a molecular simulation, with network nodes taking on the role of interacting particles. We discuss all steps of the modelling process, from data preparation, through embedding and the molecular simulation itself, to transformation from the embedding space back to a graph structure. The conclusions, drawn from examining the resultant networks in stable, minimum-energy states, emphasize the role of the embedding process projecting the non-metric social graph into the Euclidean space, the significance of the unavoidable loss of information connected with this procedure and the resultant preservation of global rather than local properties of the initial network. We also argue for the applicability of our method to some classes of problems, while also signalling the areas which require further research in order to expand this applicability domain
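
For reference, the classical Lennard-Jones potential that inspires the social potential can be sketched as follows. This is the standard physical form only; the data-driven social potential defined in the paper is not reproduced here, and the parameter defaults are arbitrary assumptions.

```python
def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Classical Lennard-Jones potential and radial force at distance r.

    V(r) = 4 * epsilon * ((sigma / r)**12 - (sigma / r)**6)
    F(r) = -dV/dr = 24 * epsilon * (2 * (sigma / r)**12 - (sigma / r)**6) / r
    A positive force is repulsive, a negative force is attractive.
    """
    sr6 = (sigma / r) ** 6
    potential = 4.0 * epsilon * (sr6 ** 2 - sr6)
    force = 24.0 * epsilon * (2.0 * sr6 ** 2 - sr6) / r
    return potential, force


# The potential minimum sits at r = 2**(1/6) * sigma, where the force vanishes.
r_min = 2.0 ** (1.0 / 6.0)
v_min, f_min = lennard_jones(r_min)
```

In a molecular simulation of the kind described, a force of this shape pushes nodes apart at short distances and pulls them together at longer ones until a minimum-energy configuration is reached.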
BCFA: Bespoke Control Flow Analysis for CFA at Scale
Many data-driven software engineering tasks such as discovering programming
patterns, mining API specifications, etc., perform source code analysis over
control flow graphs (CFGs) at scale. Analyzing millions of CFGs can be
expensive and performance of the analysis heavily depends on the underlying CFG
traversal strategy. State-of-the-art analysis frameworks use a fixed traversal
strategy. We argue that a single traversal strategy does not fit all kinds of
analyses and CFGs and propose bespoke control flow analysis (BCFA). Given a
control flow analysis (CFA) and a large number of CFGs, BCFA selects the most
efficient traversal strategy for each CFG. BCFA extracts a set of properties of
the CFA by analyzing the code of the CFA and combines it with properties of the
CFG, such as branching factor and cyclicity, for selecting the optimal
traversal strategy. We have implemented BCFA in Boa, and evaluated BCFA using a
set of representative static analyses that mainly involve traversing CFGs and
two large datasets containing 287 thousand and 162 million CFGs. Our results
show that BCFA can speed up the large-scale analyses by 1%-28%. Further, BCFA
has low overheads (less than 0.2%) and a low misprediction rate (less than 0.01%). Comment: 12 page
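
The CFG properties named above, branching factor and cyclicity, can be computed with a short sketch. It assumes a CFG given as an adjacency dictionary. The selection rule in `choose_traversal` is a hypothetical toy and is not BCFA's actual decision procedure; the strategy names are likewise illustrative.

```python
def has_cycle(cfg):
    """Detect a cycle with a three-colour depth-first search."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {node: WHITE for node in cfg}

    def visit(node):
        color[node] = GREY
        for succ in cfg.get(node, ()):
            state = color.get(succ, WHITE)
            if state == GREY:  # back edge found
                return True
            if state == WHITE and visit(succ):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in cfg)


def cfg_properties(cfg):
    """Collect the two CFG properties mentioned in the abstract."""
    max_branching = max((len(succs) for succs in cfg.values()), default=0)
    return {"max_branching": max_branching, "cyclic": has_cycle(cfg)}


def choose_traversal(props, data_flow_sensitive):
    """Hypothetical rule: combine an analysis property with CFG properties."""
    if not data_flow_sensitive:
        return "any-order"
    if props["cyclic"]:
        return "worklist"
    return "reverse-post-order"


# Hypothetical CFG of a simple while loop.
loop_cfg = {"entry": ["cond"], "cond": ["body", "exit"], "body": ["cond"], "exit": []}
loop_props = cfg_properties(loop_cfg)
strategy = choose_traversal(loop_props, data_flow_sensitive=True)
```

The point of the sketch is only that cheap per-graph properties can drive a per-graph choice, which is the premise of selecting a traversal strategy for each CFG rather than fixing one globally.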
Prediction of Emerging Technologies Based on Analysis of the U.S. Patent Citation Network
The network of patents connected by citations is an evolving graph, which
provides a representation of the innovation process. A patent citing another
implies that the cited patent reflects a piece of previously existing knowledge
that the citing patent builds upon. A methodology presented here (i) identifies
actual clusters of patents: i.e. technological branches, and (ii) gives
predictions about the temporal changes of the structure of the clusters. A
predictor, called the citation vector, is defined for characterizing
technological development to show how a patent cited by other patents belongs
to various industrial fields. The clustering technique adopted is able to
detect the new emerging recombinations, and predicts emerging new technology
clusters. The predictive ability of our new method is illustrated on the
example of USPTO subcategory 11, Agriculture, Food, Textiles. A cluster of
patents is determined based on citation data up to 1991, which shows
significant overlap with class 442, formed at the beginning of 1997. These new
tools of predictive analytics could support policy decision making processes in
science and technology, and help formulate recommendations for action
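
One plausible reading of the citation vector is sketched below: for a single patent, count the citations it receives from each technological category and normalize the counts. The normalization, the category labels, and the function name `citation_vector` are assumptions made for illustration; the paper's exact definition may differ.

```python
from collections import Counter


def citation_vector(citing_categories, all_categories):
    """Share of a patent's received citations coming from each category.

    citing_categories: category label of every patent citing the target
    all_categories:    fixed ordering of categories defining the vector
    """
    counts = Counter(citing_categories)
    total = sum(counts[c] for c in all_categories)
    if total == 0:
        return [0.0] * len(all_categories)
    return [counts[c] / total for c in all_categories]


# Hypothetical patent cited four times, from three of four categories.
vector = citation_vector(["11", "11", "12", "14"], ["11", "12", "13", "14"])
```

Patents with similar vectors could then be grouped by a clustering method, and shifts in the clusters over time would be the signal of interest.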