Renaming Global Variables in C Mechanically Proved Correct
Most integrated development environments are shipped with refactoring tools.
However, their refactoring operations are often known to be unreliable. As a
consequence, developers have to test their code after applying an automatic
refactoring. In this article, we consider a refactoring operation (renaming of
global variables in C), and we prove that its core implementation preserves the
set of possible behaviors of transformed programs. That proof of correctness
relies on the operational semantics of C provided by CompCert C in Coq.
Comment: In Proceedings VPT 2016, arXiv:1607.0183
An Ontology Based Method to Solve Query Identifier Heterogeneity in Post-Genomic Clinical Trials
The increasing amount of information available for biomedical research has led to issues related to knowledge discovery in large collections of data. Moreover, Information Retrieval techniques must consider the heterogeneities present in databases that initially belong to different domains, e.g. clinical and genetic data. One of the goals, among others, of the ACGT European project is to provide seamless and homogeneous access to integrated databases. In this work, we describe an approach to overcome heterogeneities in the identifiers used inside queries. We present an ontology that classifies the most common semantic heterogeneities among identifiers, and a service that uses it to cope with the problem following the described approach. Finally, we illustrate the solution by analysing a set of real queries.
Interactive exploration of population scale pharmacoepidemiology datasets
Population-scale drug prescription data linked with adverse drug reaction
(ADR) data supports the fitting of models large enough to detect drug use and
ADR patterns that are not detectable using traditional methods on smaller
datasets. However, detecting ADR patterns in large datasets requires tools for
scalable data processing, machine learning for data analysis, and interactive
visualization. To our knowledge no existing pharmacoepidemiology tool supports
all three requirements. We have therefore created a tool for interactive
exploration of patterns in prescription datasets with millions of samples. We
use Spark to preprocess the data for machine learning and for analyses using
SQL queries. We have implemented models in Keras and the scikit-learn
framework. The model results are visualized and interpreted using live Python
coding in Jupyter. We apply our tool to explore a 384-million-prescription
dataset from the Norwegian Prescription Database, combined with 62 million
prescriptions for hospitalized elderly patients. We preprocess the data in two
minutes, train models in seconds, and plot the results in milliseconds. Our
results show the power of combining computational power, short computation
times, and ease of use for analysis of population scale pharmacoepidemiology
datasets. The code is open source and available at:
https://github.com/uit-hdl/norpd_prescription_analyse
An overview of Mirjam and WeaveC
In this chapter, we elaborate on the design of an industrial-strength aspect-oriented programming language and weaver for large-scale software development. First, we present an analysis of the requirements for a general-purpose aspect-oriented language that can handle crosscutting concerns in ASML software. We also outline a strategy for working with aspects in large-scale software development processes. In our design, we both reuse existing aspect-oriented language abstractions and propose new ones to address the issues identified in our analysis. The code quality ensured by the realized language and weaver has a positive impact on both maintenance effort and lead time in the first-line software development process. As evidence, we present a short evaluation of the language and weaver as applied today in the software development process of ASML.
Semantic Analysis of Macro Usage for Portability
C is an unsafe language. Researchers have been developing tools to port C to
safer languages such as Rust, Checked C, or Go. Existing tools, however, resort
to preprocessing the source file first, then porting the resulting code,
leaving barely recognizable code that loses macro abstractions. To preserve
macro usage, porting tools need analyses that understand macro behavior to port
to equivalent constructs. But macro semantics differ from typical functions,
precluding simple syntactic transformations to port them. We introduce the
first comprehensive framework for analyzing the portability of macro usage. We
decompose macro behavior into 26 fine-grained properties and implement a
program analysis tool, called Maki, that identifies them in real-world code
with 94% accuracy. We apply Maki to 21 programs containing a total of 86,199
macro definitions. We found that real-world macros are much more portable than
previously known. More than a third (37%) are easy-to-port, and Maki provides
hints for porting more complicated macros. We find, on average, 2x more
easy-to-port macros and up to 7x more in the best case compared to prior work.
Guided by Maki's output, we found and hand-ported macros in four real-world
programs. We submitted patches to Linux maintainers that transform eleven
macros, nine of which have been accepted.
Comment: 12 pages. 4 figures. 2 tables. To appear in the 2024 IEEE/ACM 46th
International Conference on Software Engineering (ICSE '24), April 14-20,
2024, Lisbon, Portugal. See https://zenodo.org/doi/10.5281/zenodo.7783131 for
the latest version of the artifact associated with this paper.