Object-based Control/Data-flow Analysis
Not only does a clear distinction between control and data flow enhance the readability of models, but it also allows different tools to operate on the two distinct parts of the model. This paper shows how modelling based on control/data-flow analysis can benefit from an object-based approach. We have developed a translation mechanism that is faithful and adds an extra dimension (hierarchy) to the existing paradigm of control and data flow interacting in a model. Our methodology provides a comprehensible separation of these two parts, which can be used to feed other analysis or synthesis tools, while still being able to reason about both parts through formal methods of verification.
Data-flow Analysis of Programs with Associative Arrays
Dynamic programming languages, such as PHP, JavaScript, and Python, provide
built-in data structures including associative arrays and objects with similar
semantics: object properties can be created at run-time and accessed via
arbitrary expressions. While a high level of security and safety of
applications written in these languages can be of particular importance
(consider a web application storing sensitive data and providing its
functionality worldwide), dynamic data structures pose significant challenges
for data-flow analysis making traditional static verification methods both
unsound and imprecise. In this paper, we propose a sound and precise approach
for value and points-to analysis of programs with associative-array-like data
structures, upon which data-flow analyses can be built. We implemented our
approach in the web-application domain, in an analyzer of PHP code.
Comment: In Proceedings ESSS 2014, arXiv:1405.055
Information Preserving Component Analysis: Data Projections for Flow Cytometry Analysis
Flow cytometry is often used to characterize the malignant cells in leukemia
and lymphoma patients, traced to the level of the individual cell. Typically,
flow cytometric data analysis is performed through a series of 2-dimensional
projections onto the axes of the data set. Through the years, clinicians have
determined combinations of different fluorescent markers which generate
relatively known expression patterns for specific subtypes of leukemia and
lymphoma -- cancers of the hematopoietic system. By only viewing a series of
2-dimensional projections, the high-dimensional nature of the data is rarely
exploited. In this paper we present a means of determining a low-dimensional
projection which maintains the high-dimensional relationships (i.e.
information) between differing oncological data sets. By using machine learning
techniques, we allow clinicians to visualize data in a low dimension defined by
a linear combination of all of the available markers, rather than just 2 at a
time. This provides an aid in diagnosing similar forms of cancer, as well as a
means for variable selection in exploratory flow cytometric research. We refer
to our method as Information Preserving Component Analysis (IPCA).
Comment: 26 page
On the computational complexity of Data Flow Analysis
We consider the problem of Data Flow Analysis over monotone data flow
frameworks with a finite lattice. The problem of computing the Maximum Fixed
Point (MFP) solution is shown to be P-complete even when the lattice has just
four elements. This shows that the problem is unlikely to be efficiently
parallelizable. It is also shown that the problem of computing the Meet Over
all Paths (MOP) solution is NL-complete (and hence efficiently parallelizable)
when the lattice is finite even for non-monotone data flow frameworks. These
results appear in contrast with the fact that when the lattice is not finite,
solving the MOP problem is undecidable and hence significantly harder than the
MFP problem which is polynomial time computable for lattices of finite height.
Comment: 7 pages 4 figure
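To make the MFP computation mentioned in this abstract concrete, here is a minimal sketch of the standard iterative worklist scheme for a forward framework. It is not taken from the paper: the function name `solve_mfp`, the small control-flow graph, and the "definitely assigned variables" example are all assumptions chosen for illustration. The example lattice is the set of subsets of {x, y}, which happens to have four elements, with set intersection as the meet.

```python
from collections import deque


def solve_mfp(nodes, edges, transfer, meet, top, entry, entry_value):
    """Iterate to the maximum fixed point of a forward data-flow problem."""
    preds = {n: [] for n in nodes}
    succs = {n: [] for n in nodes}
    for src, dst in edges:
        succs[src].append(dst)
        preds[dst].append(src)

    out = {n: top for n in nodes}  # start at the top of the lattice
    worklist = deque(nodes)
    while worklist:
        n = worklist.popleft()
        # Meet the facts flowing in from all predecessors.
        in_val = entry_value if n == entry else top
        for p in preds[n]:
            in_val = meet(in_val, out[p])
        new_out = transfer[n](in_val)
        if new_out != out[n]:
            out[n] = new_out
            worklist.extend(succs[n])  # successors must be revisited
    return out


# Hypothetical example: "definitely assigned variables" over the four-element
# lattice of subsets of {x, y}, with set intersection as the meet.
TOP = frozenset({"x", "y"})
nodes = ["entry", "then", "else", "join"]
edges = [("entry", "then"), ("entry", "else"), ("then", "join"), ("else", "join")]
transfer = {
    "entry": lambda s: s,
    "then": lambda s: s | {"x", "y"},  # assigns x and y
    "else": lambda s: s | {"x"},       # assigns only x
    "join": lambda s: s,
}
solution = solve_mfp(nodes, edges, transfer, frozenset.intersection, TOP,
                     "entry", frozenset())
print(sorted(solution["join"]))  # only x is assigned on every path
```

Because each node's value can only move down a lattice of finite height, the loop terminates after a polynomial number of updates. The sketch is inherently sequential, since each update depends on the previous ones, which is in keeping with the P-completeness result the abstract reports.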
TRACTABLE DATA-FLOW ANALYSIS FOR DISTRIBUTED SYSTEMS
Automated behavior analysis is a valuable technique in the development and maintenance of distributed systems. In this paper, we present a tractable dataflow analysis technique for the detection of unreachable states and actions in distributed systems. The technique follows an approximate approach described by Reif and Smolka, but delivers a more accurate result in assessing unreachable states and actions. The higher accuracy is achieved by the use of two concepts: action dependency and history sets. Although the technique does not exhaustively detect all possible errors, it detects nontrivial errors with a worst-case complexity quadratic in the system size. It can be automated and applied to systems with arbitrary loops and nondeterministic structures. The technique thus provides practical and tractable behavior analysis for preliminary designs of distributed systems. This makes it an ideal candidate for an interactive checker in software development tools. The technique is illustrated with case studies of a pump control system and an erroneous distributed program. Results from a prototype implementation are presented.
Enabling Operator Reordering in Data Flow Programs Through Static Code Analysis
In many massively parallel data management platforms, programs are
represented as small imperative pieces of code connected in a data flow. This
popular abstraction makes it hard to apply algebraic reordering techniques
employed by relational DBMSs and other systems that use an algebraic
programming abstraction. We present a code analysis technique based on reverse
data and control flow analysis that discovers a set of properties from user
code, which can be used to emulate algebraic optimizations in this setting.
Comment: 4 pages, accepted and presented at the First International Workshop
on Cross-model Language Design and Implementation (XLDI), affiliated with
ICFP 2012, Copenhage
Cluster analysis of flow cytometric list mode data on a personal computer
A cluster analysis algorithm dedicated to the analysis of flow cytometric data is described. The algorithm is written in Pascal and implemented on an MS-DOS personal computer. It uses k-means, initialized with a large number of seed points, followed by a modified nearest neighbor technique to reduce the large number of subclusters. Thus we combine the advantage of k-means (speed) with that of the nearest neighbor technique (accuracy). In order to achieve a rapid analysis, no complex data transformations such as principal components analysis were used.
Results of the cluster analysis on both real and artificial flow cytometric data are presented and discussed. The results show that it is possible to get very good cluster analysis partitions, which compare favorably with manually gated analysis in both time and reliability, using a personal computer.
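The two-stage idea in this abstract (over-segment with k-means, then merge nearby subclusters) can be sketched roughly as follows. This is not the authors' Pascal implementation: the evenly spaced seeding, the plain distance-threshold merge standing in for their "modified nearest neighbor technique", and the names `kmeans`, `merge_subclusters`, and `cluster` are all assumptions made for illustration.

```python
import math


def nearest_index(point, centers):
    """Index of the center closest to the given point."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def kmeans(points, k, iterations=20):
    # Assumption: seeds are taken at even intervals from the input list.
    step = max(1, len(points) // k)
    centers = points[::step][:k]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest_index(p, centers)].append(p)
        # Recompute each center as the mean of its members; drop empty groups.
        centers = [tuple(sum(c) / len(g) for c in zip(*g)) for g in groups if g]
    return centers


def merge_subclusters(centers, threshold):
    """Give the same label to subcluster centers linked by short distances."""
    parent = list(range(len(centers)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            if math.dist(centers[i], centers[j]) < threshold:
                parent[find(i)] = find(j)
    roots = sorted({find(i) for i in range(len(centers))})
    return [roots.index(find(i)) for i in range(len(centers))]


def cluster(points, k, threshold):
    centers = kmeans(points, k)
    center_labels = merge_subclusters(centers, threshold)
    return [center_labels[nearest_index(p, centers)] for p in points]


# Artificial data: two well-separated 5x5 grids of points.
blob_a = [(i / 4, j / 4) for i in range(5) for j in range(5)]
blob_b = [(10 + i / 4, 10 + j / 4) for i in range(5) for j in range(5)]
labels = cluster(blob_a + blob_b, k=6, threshold=3.0)
print(len(set(labels)))  # number of merged clusters
```

The k-means stage does the bulk of the work cheaply on the full data set, while the merge only compares the much smaller set of subcluster centers, which is why the combination can stay fast.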
Hybrid Information Flow Analysis for Programs with Arrays
Information flow analysis checks whether certain pieces of (confidential)
data may affect the results of computations in unwanted ways and thus leak
information. Dynamic information flow analysis adds instrumentation code to the
target software to track flows at run time and raise alarms if a flow policy is
violated; hybrid analyses combine this with preliminary static analysis.
Using a subset of C as the target language, we extend previous work on hybrid
information flow analysis that handled pointers to scalars. Our extended
formulation handles arrays, pointers to array elements, and pointer arithmetic.
Information flow through arrays of pointers is tracked precisely while arrays
of non-pointer types are summarized efficiently.
A prototype of our approach is implemented using the Frama-C program analysis
and transformation framework. Work on a full machine-checked proof of the
correctness of our approach using Isabelle/HOL is well underway; we present the
existing parts and sketch the rest of the correctness argument.
Comment: In Proceedings VPT 2016, arXiv:1607.0183
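As a rough illustration of the run-time tracking this abstract describes, the sketch below keeps a shadow security label for each variable and a single summary label for each array of non-pointer values. It is a toy monitor written for this listing, not the Frama-C based prototype: the class `TaintMonitor`, its method names, and the two-level policy (confidential vs. public) are assumptions, and only explicit flows are tracked.

```python
class FlowViolation(Exception):
    """Raised when confidential data reaches a public output."""


class TaintMonitor:
    def __init__(self):
        self.secret = {}  # variable or array name -> True if confidential

    def mark_secret(self, name):
        self.secret[name] = True

    def assign(self, target, *sources):
        # Strong update: a scalar's label is the join of its sources' labels.
        self.secret[target] = any(self.secret.get(s, False) for s in sources)

    def array_store(self, array, index, value):
        # Weak update on the array's single summary label: a store can raise
        # the label but never lower it, since other elements are untouched.
        tainted = self.secret.get(index, False) or self.secret.get(value, False)
        self.secret[array] = self.secret.get(array, False) or tainted

    def array_load(self, target, array, index):
        self.assign(target, array, index)

    def output(self, name):
        if self.secret.get(name, False):
            raise FlowViolation(f"confidential data reaches output via {name!r}")


monitor = TaintMonitor()
monitor.mark_secret("password")
monitor.assign("i")                          # i = 0, a public constant
monitor.array_store("buf", "i", "password")  # buf[i] = password
monitor.array_load("x", "buf", "i")          # x = buf[i]
monitor.output("i")                          # allowed: i is public
```

Calling `monitor.output("x")` at this point would raise `FlowViolation`, because the summary label of `buf` became confidential at the store. Summarizing an array with one label trades precision for speed: a later load of an untouched element is also treated as confidential.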
