CARET analysis of multithreaded programs
Dynamic Pushdown Networks (DPNs) are a natural model for multithreaded
programs with (recursive) procedure calls and thread creation. On the other
hand, CARET is a temporal logic that allows writing linear temporal formulas while taking into account the matching between calls and returns. In this paper, we consider the model-checking problem of DPNs against CARET formulas. We
show that this problem can be effectively solved by a reduction to the
emptiness problem of Büchi Dynamic Pushdown Systems. We then show that CARET
model checking is also decidable for DPNs communicating with locks. Our results
can, in particular, be used for the detection of concurrent malware.
Comment: Pre-proceedings paper presented at the 27th International Symposium on Logic-Based Program Synthesis and Transformation (LOPSTR 2017), Namur, Belgium, 10-12 October 2017 (arXiv:1708.07854)
Generating a Performance Stochastic Model from UML Specifications
Since its initiation by Connie Smith, Software Performance Engineering (SPE) has attracted growing attention. The idea is to bring performance evaluation into the software design process, allowing software designers to assess the performance of software during design. Several approaches have been proposed to provide such
techniques. Some of them propose to derive from a UML (Unified Modeling
Language) model a performance model such as Stochastic Petri Net (SPN) or
Stochastic process Algebra (SPA) models. Our work belongs to the same category.
We propose to derive from a UML model a Stochastic Automata Network (SAN) in
order to obtain performance predictions. Our approach is more flexible thanks to the modularity of SANs and their close resemblance to UML's state-chart diagrams.
A Navigation Logic for Recursive Programs with Dynamic Thread Creation
Dynamic Pushdown Networks (DPNs) are a model for multithreaded programs with
recursion and dynamic creation of threads. In this paper, we propose a temporal
logic called NTL for reasoning about the call and return behaviour, as well as the thread-creation behaviour, of DPNs. Using tree automata techniques, we investigate the
model checking problem for the novel logic and show that its complexity is not
higher than that of LTL model checking against pushdown systems despite a more
expressive logic and a more powerful system model. The same holds true for the
satisfiability problem when compared to the satisfiability problem for a
related logic for reasoning about the call- and return-behaviour of pushdown
systems. Overall, this novel logic offers a promising approach for the
verification of recursive programs with dynamic thread creation
STATISTICAL MACHINE LEARNING BASED MODELING FRAMEWORK FOR DESIGN SPACE EXPLORATION AND RUN-TIME CROSS-STACK ENERGY OPTIMIZATION FOR MANY-CORE PROCESSORS
The complexity of many-core processors continues to grow as a larger number of heterogeneous cores are integrated on a single chip. Such systems-on-chip contain computing structures ranging from complex out-of-order cores and simple in-order cores to digital signal processors (DSPs), graphics processing units (GPUs), application-specific processors, hardware accelerators, I/O subsystems, network-on-chip interconnects, and large caches arranged in complex hierarchies. While the industry focus is on putting a higher number of cores on a single chip, the key challenge is to optimally architect these many-core processors such that performance, energy, and area constraints are satisfied. The traditional approach to processor design through extensive cycle-accurate simulations is ill-suited for designing many-core processors due to the large microarchitecture design space that must be explored. Additionally, it is hard to statically optimize such complex processors, and the applications that run on them, at design time such that performance and energy constraints are met under dynamically changing operating conditions.
This dissertation establishes a statistical machine learning based modeling framework that enables the efficient design and operation of many-core processors under performance, energy, and area constraints. We apply the proposed framework to rapidly design the microarchitecture of a many-core processor for multimedia, computer graphics rendering, finance, and data mining applications derived from the PARSEC benchmark suite. We further demonstrate the application of the framework in the joint run-time adaptation of both the application and the microarchitecture such that energy availability constraints are met.
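The core idea of surrogate-based design space exploration can be sketched as follows. This is a minimal illustration under assumed features (core count, cache size, frequency) and a synthetic energy model, not the dissertation's actual learners or data: a cheap regression model is fit on a small number of "simulated" design points and then used to rank the full design space without simulating every configuration.

```python
# Sketch of surrogate-model-based design space exploration.
# The design features, the energy model, and the linear surrogate are
# illustrative assumptions, not the dissertation's actual framework.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical design points: (cores, cache_MB, freq_GHz).
designs = np.array([[c, m, f]
                    for c in (4, 8, 16, 32)
                    for m in (2, 4, 8)
                    for f in (1.0, 1.5, 2.0)], dtype=float)

def simulate_energy(x):
    # Stand-in for an expensive cycle-accurate simulation.
    c, m, f = x
    return 0.5 * c * f**2 + 0.2 * m + rng.normal(0, 0.01)

# "Simulate" only a small training sample of the design space.
train_idx = rng.choice(len(designs), size=12, replace=False)
X_train = designs[train_idx]
y_train = np.array([simulate_energy(x) for x in X_train])

def features(X):
    # Augment raw features with c * f^2 (dynamic-power-like term) and a bias.
    return np.column_stack([X, X[:, 0] * X[:, 2]**2, np.ones(len(X))])

coef, *_ = np.linalg.lstsq(features(X_train), y_train, rcond=None)

# Rank the *entire* design space with the cheap surrogate.
pred = features(designs) @ coef
best = designs[np.argmin(pred)]
print(best)
```

In this toy setting the surrogate recovers the dominant energy terms from a dozen samples, so the full 36-point space is ranked without further simulation; the same pattern scales to design spaces far too large to simulate exhaustively.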
Using machine learning to improve dense and sparse matrix multiplication kernels
This work comprises two different projects in numerical linear algebra. The first project is about using machine learning to speed up dense matrix-matrix multiplication computations on a shared-memory computer architecture. We found that basic loop-based matrix-matrix multiplication algorithms tied to a decision tree algorithm selector were competitive with Intel's Math Kernel Library for the same computation. The second project is a preliminary report about re-implementing an encoding format for sparse matrix-vector multiplication called Compressed Sparse eXtended (CSX). The goal for the second project is to use machine learning to aid in encoding matrix substructures in the CSX format without using exhaustive search and a Just-In-Time compiler.
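The kernel-selection idea behind the first project can be sketched as follows: several loop-ordered multiplication variants plus a dispatcher standing in for the trained decision tree. The variant names, the shape features, and the threshold are illustrative assumptions, not the thesis's actual selector.

```python
# Sketch of decision-tree-style kernel selection over loop-based matmul
# variants. The selection rule below is a hand-written stand-in for a
# decision tree trained on measured timings (hypothetical, not the thesis's).

def matmul_ijk(A, B):
    # Classic triple loop: dot product of row i of A with column j of B.
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            s = 0.0
            for p in range(k):
                s += A[i][p] * B[p][j]
            C[i][j] = s
    return C

def matmul_ikj(A, B):
    # Reordered loops: streams over rows of B, better locality when rows are long.
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for p in range(k):
            a, row, Ci = A[i][p], B[p], C[i]
            for j in range(m):
                Ci[j] += a * row[j]
    return C

def select_kernel(n, k, m):
    # Stand-in for the learned selector: pick ikj when output rows are long.
    return matmul_ikj if m >= 64 else matmul_ijk

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
kernel = select_kernel(2, 2, 2)
print(kernel(A, B))   # [[19.0, 22.0], [43.0, 50.0]]
```

All variants compute the same product; the selector only decides which memory-access pattern to use for the given shape, which is where the measured speed differences come from.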
Machine Learning in clinical biology and medicine: from prediction of multidrug resistant infections in humans to pre-mRNA splicing control in Ciliates
Machine Learning methods have broadly begun to infiltrate the clinical literature in
such a way that the correct use of algorithms and tools can facilitate both diagnosis and
therapies. The availability of large quantities of high-quality data could lead to an
improved understanding of risk factors in community and healthcare-acquired
infections. In the first part of my PhD program, I refined my skills in Machine Learning by developing and evaluating, with a real antibiotic stewardship dataset, a model to predict multidrug-resistant urinary tract infections after patient hospitalization.
For this purpose, I created an online platform called DSaaS, specifically designed for healthcare operators to train ML models (supervised learning algorithms). These
results are reported in Chapter 2.
In the second part of the PhD thesis (Chapter 3), I used these new skills to study genomic variants, in particular the phenomenon of intron splicing. One of the important modes of pre-mRNA post-transcriptional modification is alternative splicing, which includes intron retention (unsplicing) and allows the creation of many distinct mature mRNA transcripts from a single gene. An accurate interpretation of genomic
variants is the backbone of genomic medicine. Determining for example the causative
variant in patients with Mendelian disorders facilitates both management and potential
downstream treatment of the patient’s condition, as well as providing peace of mind
and allowing more effective counselling for the wider family.
Recent years have seen a surge in bioinformatics tools designed to predict variant
impact on splicing, and these offer an opportunity to circumvent many limitations of
RNA-seq based approaches. An increasing number of these tools rely on machine learning approaches that can identify patterns in data and use this knowledge to make predictions on new data.
I optimized a pipeline to extract introns from genomes and transcriptomes and classify them into retained introns (RIs) and constitutively spliced introns (CSIs). I used data from ciliates because of the peculiar organization of their genomes (enriched in coding sequences) and because they are unicellular organisms without cells differentiated into tissues, which made the identification and manipulation of introns easier. In collaboration with my PhD colleague Dr. Leonardo Vito, I analyzed these intronic sequences to identify "features" with which to predict and classify them using Machine Learning algorithms. We also developed a platform for manipulating the FASTA, GTF, BED, and other files produced by the pipeline tools. I named the platform Biounicam (intron extraction tools); it is available at http://46.23.201.244:1880/ui.
The major objective of this study was to develop an accurate machine-learning model that can predict whether an intron will be retained or not, to understand the key features involved in the intron retention mechanism, and to provide insight into the factors that drive IR. Once the model has been developed, the final step of my PhD work will be to expand the platform with different machine learning algorithms to better predict retention and to test new features that drive this phenomenon. These features will hopefully contribute to finding new mechanisms that control intron splicing.
The additional papers and patents I published during my PhD program are in Appendices B and C. These works have enriched me with many useful techniques for future work, ranging from microbiology to classical statistics.
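Feature extraction of the kind that could feed an RI-versus-CSI classifier can be sketched as follows. The specific features (length, GC content, canonical GT...AG boundaries) are common intron descriptors chosen here for illustration; the thesis's actual feature set is not reproduced.

```python
# Illustrative feature extraction for intron sequences, the sort of input a
# retained-vs-constitutively-spliced classifier could use. The feature names
# are assumptions for this sketch, not the thesis's actual features.

def intron_features(seq):
    seq = seq.upper()
    gc = (seq.count("G") + seq.count("C")) / len(seq)
    return {
        "length": len(seq),
        "gc_content": gc,
        # Canonical spliceosomal introns start with GT and end with AG.
        "canonical_gt_ag": seq.startswith("GT") and seq.endswith("AG"),
    }

print(intron_features("GTAAGTCCGCTTTTAG"))
```

A table of such feature vectors, one row per intron labeled RI or CSI, is exactly the shape of input that standard supervised learners expect.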
Development of benthic monitoring approaches for salmon aquaculture sites using machine learning, hydroacoustic data and bacterial eDNA
Intensive caged salmon production can lead to localized perturbations of the seafloor environment where organic waste (flocculent matter) accumulates and disrupts ecological processes. As the aquaculture industry expands, the development of tools to rapidly detect changes in seafloor condition is critical. Here, we examine whether applying machine learning to two types of monitoring data could improve environmental assessments at aquaculture sites in Newfoundland. First, we apply machine learning to single-beam echosounder data to detect flocculent matter at aquaculture sites over larger areas than is currently achieved using drop-camera imaging. Then, we use machine learning to categorize sediments by level of disturbance based on bacterial tetranucleotide frequency distributions generated from environmental DNA. While echosounder data can detect flocculent matter with moderate success in this region, bacterial tetranucleotide frequencies are highly effective classifiers of benthic disturbance; this simplified environmental DNA-based approach could be implemented within novel aquaculture benthic monitoring pipelines.
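The tetranucleotide frequency representation mentioned above can be sketched as a sliding-window 4-mer count normalized into a distribution. The normalization shown and any downstream classifier are assumptions of this sketch, not details from the study.

```python
# Sketch of a tetranucleotide (4-mer) frequency distribution for a DNA
# sequence, of the kind used to characterize sediment eDNA. Normalization
# choice here is an assumption; the study's exact preprocessing is not shown.
from collections import Counter

def tetranucleotide_freqs(seq):
    seq = seq.upper()
    kmers = [seq[i:i + 4] for i in range(len(seq) - 3)]
    counts = Counter(kmers)
    total = sum(counts.values())
    return {k: c / total for k, c in counts.items()}

freqs = tetranucleotide_freqs("ACGTACGTACGT")
print(freqs["ACGT"])
```

Each sediment sample becomes a fixed-length vector over the 256 possible 4-mers, which is what makes these distributions convenient inputs for standard classifiers of disturbance level.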
A debiasing technique for place-based algorithmic patrol management
In recent years, there has been a revolution in data-driven policing. With
that has come scrutiny on how bias in historical data affects algorithmic
decision making. In this exploratory work, we introduce a debiasing technique
for place-based algorithmic patrol management systems. We show that the
technique efficiently eliminates racially biased features while retaining high
accuracy in the models. Finally, we provide a lengthy list of potential future
research in the realm of fairness and data-driven policing which this work
uncovered.
Comment: 20 pages (91 Appendix pages), 6 figures (20 supplementary figures), 14 supplementary tables
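One generic way to eliminate biased features, sketched here purely for illustration, is correlation screening: drop any feature whose correlation with a protected attribute exceeds a threshold. The paper's actual debiasing technique is not reproduced; the data, feature names, and threshold below are all hypothetical.

```python
# Generic sketch of feature debiasing by correlation screening. This
# illustrates the general idea only; it is NOT the paper's technique, and
# all names, data, and the 0.8 threshold are hypothetical.
import statistics

def pearson(xs, ys):
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def debias(features, protected, threshold=0.8):
    # Keep only features weakly correlated with the protected attribute.
    return {name: col for name, col in features.items()
            if abs(pearson(col, protected)) < threshold}

protected = [0, 0, 1, 1, 0, 1]
features = {
    "calls_for_service": [3, 5, 2, 4, 6, 1],
    "proxy_feature":     [0, 0, 1, 1, 0, 1],  # perfectly tracks the attribute
}
kept = debias(features, protected)
print(sorted(kept))   # ['calls_for_service']
```

The tension the paper explores, retaining model accuracy while removing biased signals, shows up even in this toy: the dropped proxy may carry predictive information, so the screened model must recover accuracy from the remaining features.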
Introduction to Runtime Verification
The aim of this chapter is to act as a primer for those wanting to learn about Runtime Verification (RV). We start by providing an overview of the main specification languages used for RV. We then introduce the standard terminology necessary to describe the monitoring problem, covering the pragmatic issues of monitoring and instrumentation, and discussing extensively the monitorability problem.