A computational framework to emulate the human perspective in flow cytometric data analysis
Background: In recent years, intense research efforts have focused on developing methods for automated flow cytometric data analysis. However, while designing such applications, little or no attention has been paid to the human perspective that is absolutely central to the manual gating process of identifying and characterizing cell populations. In particular, the assumption of many common techniques that cell populations could be modeled reliably with pre-specified distributions may not hold true in real-life samples, which can have populations of arbitrary shapes and considerable inter-sample variation.
Results: To address this, we developed a new framework, flowScape, for emulating certain key aspects of the human perspective in analyzing flow data, which we implemented in multiple steps. First, flowScape creates a mathematically rigorous map of the high-dimensional flow data landscape based on dense and sparse regions defined by relative concentrations of events around modes. In the second step, these modal clusters are connected with a global hierarchical structure. This representation allows flowScape to perform ridgeline analysis for both traversing the landscape and isolating cell populations at different levels of resolution. Finally, we extended manual gating with a new capacity for constructing templates that can identify target populations in terms of their relative parameters, as opposed to the more commonly used absolute or physical parameters. This allows flowScape to apply such templates in batch mode for detecting the corresponding populations in a flexible, sample-specific manner. We also demonstrated different applications of our framework to flow data analysis and showed its superiority over other analytical methods.
Conclusions: The human perspective, built on top of intuition and experience, is a very important component of flow cytometric data analysis. By emulating some of its approaches and extending these with automation and rigor, flowScape provides a flexible and robust framework for computational cytomics.
Liveness-Driven Random Program Generation
Randomly generated programs are popular for testing compilers and program
analysis tools, with hundreds of bugs in real-world C compilers found by random
testing. However, existing random program generators may generate large amounts
of dead code (computations whose result is never used). This leaves relatively
little code to exercise a target compiler's more complex optimizations.
To address this shortcoming, we introduce liveness-driven random program
generation. In this approach the random program is constructed bottom-up,
guided by a simultaneous structural data-flow analysis to ensure that the
generator never generates dead code.
The algorithm is implemented as a plugin for the Frama-C framework. We
evaluate it in comparison to Csmith, the standard random C program generator.
Our tool generates programs that compile to more machine code with a more
complex instruction mix.
Comment: Pre-proceedings paper presented at the 27th International Symposium
on Logic-Based Program Synthesis and Transformation (LOPSTR 2017), Namur,
Belgium, 10-12 October 2017 (arXiv:1708.07854).
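To make the notion of dead code in the abstract above concrete, here is a minimal sketch of a backward liveness pass over straight-line code. It is not the paper's generator, which builds programs bottom-up so that dead code never arises. It only detects assignments whose result is never used. The statement encoding and the name `dead_assignments` are assumptions made for illustration.

```python
def dead_assignments(stmts, live_out):
    """Return indices of assignments whose target is not live afterwards.

    stmts: list of (target, operands) pairs for straight-line code
           (a hypothetical encoding, not taken from the paper).
    live_out: variables assumed to be used after the last statement.
    """
    live = set(live_out)
    dead = []
    # Walk backwards: a variable is live if it is read before being rewritten.
    for i in range(len(stmts) - 1, -1, -1):
        target, operands = stmts[i]
        if target in live:
            live.discard(target)   # this assignment defines the live value
            live.update(operands)  # its operands become live before it
        else:
            dead.append(i)         # result is never used: dead code
    return sorted(dead)


program = [
    ("a", ["x"]),  # a = x + 1
    ("b", ["a"]),  # b = a * 2
    ("c", ["x"]),  # c = x - 3   (never used afterwards)
    ("d", ["b"]),  # d = b + b
]

print(dead_assignments(program, live_out={"d"}))  # [2]
```

A generator that consults such liveness information while constructing each statement could refuse to emit statement 2 in the first place, which is roughly the idea the abstract describes.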
Parameterized Construction of Program Representations for Sparse Dataflow Analyses
Data-flow analyses usually associate information with control flow regions.
Informally, if these regions are too small, like a point between two
consecutive statements, we call the analysis dense. On the other hand, if these
regions include many such points, then we call it sparse. This paper presents a
systematic method to build program representations that support sparse
analyses. To pave the way for this framework, we clarify the literature on
well-known intermediate program representations. We show that our approach, up
to parameter choice, subsumes many of these representations, such as the SSA,
SSI and e-SSA forms. In particular, our algorithms are faster, simpler and more
frugal than the previous techniques used to construct SSI - Static Single
Information - form programs. We produce intermediate representations isomorphic
to Choi et al.'s Sparse Evaluation Graphs (SEG) for the family of data-flow
problems that can be partitioned per variables. However, contrary to SEGs, we
can handle - sparsely - problems that are not in this family.
Application of discontinuity layout optimization to plane plasticity problems
A new and potentially widely applicable numerical analysis procedure for continuum mechanics problems is described. The procedure is used here to determine the critical layout of discontinuities and associated upper-bound limit load for plane plasticity problems. Potential discontinuities, which interlink nodes laid out over the body under consideration, are permitted to cross over one another, giving a much wider search space than when such discontinuities are located only at the edges of finite elements of fixed topology. Highly efficient linear programming solvers can be employed when certain popular failure criteria are specified (e.g. Tresca or Mohr-Coulomb in plane strain). Stress/velocity singularities are automatically identified and visual interpretation of the output is straightforward. The procedure, coined 'discontinuity layout optimization' (DLO), is related to that used to identify the optimum layout of bars in trusses, with discontinuities (e.g. slip-lines) in a translational failure mechanism corresponding to bars in an optimum truss. Hence, a recently developed adaptive nodal connection strategy for truss layout optimization problems can advantageously be applied here. The procedure is used to identify critical translational failure mechanisms for selected metal forming and soil mechanics problems. Close agreement with the exact analytical solutions is obtained.
Multivariate modality inference using Gaussian kernel
The number of modes (also known as the modality) of a kernel density estimator (KDE) attracts considerable interest and is important in practice. In this paper, we develop an inference framework for the modality of a KDE in the multivariate setting using a Gaussian kernel. We apply the modal clustering method proposed by [1] for mode hunting. A test statistic and its asymptotic distribution are derived to assess the significance of each mode. The inference procedure is applied to both simulated and real data sets.
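As a rough illustration of what the modality of a Gaussian KDE means, the following sketch counts local maxima of a one-dimensional KDE evaluated on a grid. It is a simplification under stated assumptions: the paper works in the multivariate setting and derives a significance test, neither of which is reproduced here. The function names, the synthetic data, and the bandwidth values are all hypothetical.

```python
import numpy as np


def gaussian_kde_1d(samples, grid, bandwidth):
    """Evaluate a 1-D Gaussian kernel density estimate on the grid points."""
    diffs = (grid[:, None] - samples[None, :]) / bandwidth
    kernel_sum = np.exp(-0.5 * diffs**2).sum(axis=1)
    return kernel_sum / (len(samples) * bandwidth * np.sqrt(2.0 * np.pi))


def count_modes(density):
    """Count strict local maxima among the interior grid points."""
    interior = density[1:-1]
    is_peak = (interior > density[:-2]) & (interior > density[2:])
    return int(is_peak.sum())


# Two deterministic, well-separated clusters (illustrative data only).
samples = np.concatenate([np.linspace(-3.5, -2.5, 50),
                          np.linspace(2.5, 3.5, 50)])
grid = np.linspace(-6.0, 6.0, 601)

narrow = count_modes(gaussian_kde_1d(samples, grid, bandwidth=0.5))
wide = count_modes(gaussian_kde_1d(samples, grid, bandwidth=4.0))
print(narrow, wide)  # 2 1
```

The same data yield two modes with the narrow bandwidth and one with the wide one, which is why the significance of each mode is worth assessing rather than simply counting peaks.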
Region-based memory management for Mercury programs
Region-based memory management (RBMM) is a form of compile time memory
management, well-known from the functional programming world. In this paper we
describe our work on implementing RBMM for the logic programming language
Mercury. One interesting point about Mercury is that it is designed with strong
type, mode, and determinism systems. These systems not only provide Mercury
programmers with several direct software engineering benefits, such as
self-documenting code and clear program logic, but also give language
implementors a large amount of information that is useful for program analyses.
In this work, we make use of this information to develop program analyses that
determine the distribution of data into regions and transform Mercury programs
by inserting into them the necessary region operations. We prove the
correctness of our program analyses and transformation. To execute the
annotated programs, we have implemented runtime support that tackles the two
main challenges posed by backtracking. First, backtracking can require regions
removed during forward execution to be "resurrected"; and second, any memory
allocated during a computation that has been backtracked over must be recovered
promptly and without waiting for the regions involved to come to the end of
their life. We describe in detail our solution of both these problems. We study
in detail how our RBMM system performs on a selection of benchmark programs,
including some well-known difficult cases for RBMM. Even with these difficult
cases, our RBMM-enabled Mercury system obtains clearly faster runtimes for 15
out of 18 benchmarks compared to the base Mercury system with its Boehm runtime
garbage collector, with an average runtime speedup of 24%, and an average
reduction in memory requirements of 95%. In fact, our system achieves optimal
memory consumption in some programs.
Comment: 74 pages, 23 figures, 11 tables. A shorter version of this paper,
without proofs, is to appear in the journal Theory and Practice of Logic
Programming (TPLP).
Heap Reference Analysis Using Access Graphs
Despite significant progress in the theory and practice of program analysis,
analysing properties of heap data has not reached the same level of maturity as
the analysis of static and stack data. The spatial and temporal structure of
stack and static data is well understood while that of heap data seems
arbitrary and is unbounded. We devise bounded representations which summarize
properties of the heap data. This summarization is based on the structure of
the program which manipulates the heap. The resulting summary representations
are certain kinds of graphs called access graphs. The boundedness of these
representations and the monotonicity of the operations to manipulate them make
it possible to compute them through data flow analysis.
An important application which benefits from heap reference analysis is
garbage collection, where currently liveness is conservatively approximated by
reachability from program variables. As a consequence, current garbage
collectors leave a lot of garbage uncollected, a fact which has been confirmed
by several empirical studies. We propose the first ever end-to-end static
analysis to distinguish live objects from reachable objects. We use this
information to make dead objects unreachable by modifying the program. This
application is interesting because it requires discovering data flow
information representing complex semantics. In particular, we discover four
properties of heap data: liveness, aliasing, availability, and anticipability.
Together, they cover all combinations of directions of analysis (i.e. forward
and backward) and confluence of information (i.e. union and intersection). Our
analysis can also be used for plugging memory leaks in C/C++ languages.
Comment: Accepted for printing by ACM TOPLAS. This version incorporates
referees' comments.
Fine root dynamics and trace gas fluxes in two lowland tropical forest soils
Fine root dynamics have the potential to contribute significantly to ecosystem-scale biogeochemical cycling, including the production and emission of greenhouse gases. This is particularly true in tropical forests, which are often characterized as having large fine root biomass and rapid rates of root production and decomposition. We examined patterns in fine root dynamics on two soil types in a lowland moist Amazonian forest, and determined the effect of root decay on rates of C and N trace gas fluxes. Root production averaged 229 (± 35) and 153 (± 27) g m⁻² yr⁻¹ for years 1 and 2 of the study, respectively, and did not vary significantly with soil texture. Root decay was sensitive to soil texture, with faster rates in the clay soil (k = 0.96 yr⁻¹) than in the sandy loam soil (k = 0.61 yr⁻¹), leading to greater standing stocks of dead roots in the sandy loam. Rates of nitrous oxide (N₂O) emissions were significantly greater in the clay soil (13 ± 1 ng N cm⁻² h⁻¹) than in the sandy loam (1.4 ± 0.2 ng N cm⁻² h⁻¹). Root mortality and decay following trenching doubled rates of N₂O emissions in the clay and tripled them in sandy loam over a 1-year period. Trenching also increased nitric oxide fluxes, which were greater in the sandy loam than in the clay. We used trenching (clay only) and a mass balance approach to estimate the root contribution to soil respiration. In clay soil, root respiration was 264–380 g C m⁻² yr⁻¹, accounting for 24% to 35% of the total soil CO₂ efflux. Estimates were similar using both approaches. In sandy loam, root respiration rates were slightly higher and more variable (521 ± 206 g C m⁻² yr⁻¹) and contributed 35% of the total soil respiration. Our results show that soil heterotrophs strongly dominate soil respiration in this forest, regardless of soil texture. Our results also suggest that fine root mortality and decomposition associated with disturbance and land-use change can contribute significantly to increased rates of nitrogen trace gas emissions.
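The figures in the abstract above can be cross-checked with a little arithmetic. The sketch below assumes that the decay constants k come from a single-exponential model, M(t)/M(0) = exp(-k t), which the abstract does not state. It also assumes that the lower respiration bound pairs with the 24% share and the upper bound with 35%. The function names are hypothetical.

```python
import math


def fraction_remaining(k_per_year, years=1.0):
    """Mass fraction left after `years`, assuming single-exponential decay."""
    return math.exp(-k_per_year * years)


def implied_total_efflux(root_respiration, root_share):
    """Total soil CO2 efflux implied by a root flux and its share of the total."""
    return root_respiration / root_share


# Dead-root mass remaining after one year under the assumed decay model.
clay_left = fraction_remaining(0.96)        # about 0.38
sandy_loam_left = fraction_remaining(0.61)  # about 0.54

# Total soil respiration in clay implied by each end of the reported range
# (g C per square metre per year); the pairing of bounds is an assumption.
clay_total_low = implied_total_efflux(264, 0.24)   # about 1100
clay_total_high = implied_total_efflux(380, 0.35)  # about 1086

print(round(clay_left, 2), round(sandy_loam_left, 2))
print(round(clay_total_low), round(clay_total_high))
```

Under these assumptions, slower decay in the sandy loam leaves a larger share of dead root mass after a year, in line with the larger standing stocks reported. Both ends of the clay range imply a similar total efflux of roughly 1100 g C m⁻² yr⁻¹.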