2,408 research outputs found
A foundation for synthesising programming language semantics
Programming or scripting languages used in real-world systems are seldom designed
with a formal semantics in mind from the outset. Therefore, the first step for developing well-founded analysis tools for these systems is to reverse-engineer a formal
semantics. This can take months or years of effort.
Could we automate this process, at least partially? Though desirable, automatically reverse-engineering semantics rules from an implementation is very challenging,
as found by Krishnamurthi, Lerner and Elberty. They propose automatically learning
desugaring translation rules, mapping the language whose semantics we seek to a simplified, core version, whose semantics are much easier to write. The present thesis
contains an analysis of their challenge, as well as the first steps towards a solution.
Scaling methods with the size of the language is very difficult due to state space
explosion, so this thesis proposes an incremental approach to learning the translation
rules. I present a formalisation that both clarifies the informal description of the challenge by Krishnamurthi et al, and re-formulates the problem, shifting the focus to the
conditions for incremental learning. The central definition of the new formalisation is
the desugaring extension problem, i.e. extending a set of established translation rules
by synthesising new ones.
In a synthesis algorithm, the choice of search space is important and non-trivial,
as it needs to strike a good balance between expressiveness and efficiency. The rest
of the thesis focuses on defining search spaces for translation rules via typing rules.
Two prerequisites are required for comparing search spaces. The first is a series of
benchmarks, a set of source and target languages equipped with intended translation
rules between them. The second is an enumerative synthesis algorithm for efficiently
enumerating typed programs. I show how algebraic enumeration techniques can be applied to enumerating well-typed translation rules, and discuss the properties expected
from a type system for ensuring that typed programs be efficiently enumerable.
The thesis presents and empirically evaluates two search spaces. A baseline search
space yields the first practical solution to the challenge. The second search space is
based on a natural heuristic for translation rules, limiting the usage of variables so that
they are used exactly once. I present a linear type system designed to efficiently enumerate translation rules, where this heuristic is enforced. Through informal analysis
and empirical comparison to the baseline, I then show that using linear types can speed
up the synthesis of translation rules by an order of magnitude
Pristup specifikaciji i generisanju proizvodnih procesa zasnovan na inženjerstvu vođenom modelima
In this thesis, we present an approach to the production process specification and generation based on the model-driven paradigm, with the goal to increase the flexibility of factories and respond to the challenges that emerged in the era of Industry 4.0 more efficiently. To formally specify production processes and their variations in the Industry 4.0 environment, we created a novel domain-specific modeling language, whose models are machine-readable. The created language can be used to model production processes that can be independent of any production system, enabling process models to be used in different production systems, and process models used for the specific production system. To automatically transform production process models dependent on the specific production system into instructions that are to be executed by production system resources, we created an instruction generator. Also, we created generators for different manufacturing documentation, which automatically transform production process models into manufacturing documents of different types. The proposed approach, domain-specific modeling language, and software solution contribute to introducing factories into the digital transformation process. As factories must rapidly adapt to new products and their variations in the era of Industry 4.0, production must be dynamically led and instructions must be automatically sent to factory resources, depending on products that are to be created on the shop floor. The proposed approach contributes to the creation of such a dynamic environment in contemporary factories, as it allows to automatically generate instructions from process models and send them to resources for execution. Additionally, as there are numerous different products and their variations, keeping the required manufacturing documentation up to date becomes challenging, which can be done automatically by using the proposed approach and thus significantly lower process designers' time.У овој дисертацији представљен је приступ спецификацији и генерисању производних процеса заснован на инжењерству вођеном моделима, у циљу повећања флексибилности постројења у фабрикама и ефикаснијег разрешавања изазова који се појављују у ери Индустрије 4.0. За потребе формалне спецификације производних процеса и њихових варијација у амбијенту Индустрије 4.0, креиран је нови наменски језик, чије моделе рачунар може да обради на аутоматизован начин. Креирани језик има могућност моделовања производних процеса који могу бити независни од производних система и тиме употребљени у различитим постројењима или фабрикама, али и производних процеса који су специфични за одређени систем. Како би моделе производних процеса зависних од конкретног производног система било могуће на аутоматизован начин трансформисати у инструкције које ресурси производног система извршавају, креиран је генератор инструкција. Такође су креирани и генератори техничке документације, који на аутоматизован начин трансформишу моделе производних процеса у документе различитих типова. Употребом предложеног приступа, наменског језика и софтверског решења доприноси се увођењу фабрика у процес дигиталне трансформације. Како фабрике у ери Индустрије 4.0 морају брзо да се прилагоде новим производима и њиховим варијацијама, неопходно је динамички водити производњу и на аутоматизован начин слати инструкције ресурсима у фабрици, у зависности од производа који се креирају у конкретном постројењу. Тиме што је у предложеном приступу могуће из модела процеса аутоматизовано генерисати инструкције и послати их ресурсима, доприноси се креирању једног динамичког окружења у савременим фабрикама. Додатно, услед великог броја различитих производа и њихових варијација, постаје изазовно одржавати неопходну техничку документацију, што је у предложеном приступу могуће урадити на аутоматизован начин и тиме значајно уштедети време пројектаната процеса.U ovoj disertaciji predstavljen je pristup specifikaciji i generisanju proizvodnih procesa zasnovan na inženjerstvu vođenom modelima, u cilju povećanja fleksibilnosti postrojenja u fabrikama i efikasnijeg razrešavanja izazova koji se pojavljuju u eri Industrije 4.0. Za potrebe formalne specifikacije proizvodnih procesa i njihovih varijacija u ambijentu Industrije 4.0, kreiran je novi namenski jezik, čije modele računar može da obradi na automatizovan način. Kreirani jezik ima mogućnost modelovanja proizvodnih procesa koji mogu biti nezavisni od proizvodnih sistema i time upotrebljeni u različitim postrojenjima ili fabrikama, ali i proizvodnih procesa koji su specifični za određeni sistem. Kako bi modele proizvodnih procesa zavisnih od konkretnog proizvodnog sistema bilo moguće na automatizovan način transformisati u instrukcije koje resursi proizvodnog sistema izvršavaju, kreiran je generator instrukcija. Takođe su kreirani i generatori tehničke dokumentacije, koji na automatizovan način transformišu modele proizvodnih procesa u dokumente različitih tipova. Upotrebom predloženog pristupa, namenskog jezika i softverskog rešenja doprinosi se uvođenju fabrika u proces digitalne transformacije. Kako fabrike u eri Industrije 4.0 moraju brzo da se prilagode novim proizvodima i njihovim varijacijama, neophodno je dinamički voditi proizvodnju i na automatizovan način slati instrukcije resursima u fabrici, u zavisnosti od proizvoda koji se kreiraju u konkretnom postrojenju. Time što je u predloženom pristupu moguće iz modela procesa automatizovano generisati instrukcije i poslati ih resursima, doprinosi se kreiranju jednog dinamičkog okruženja u savremenim fabrikama. Dodatno, usled velikog broja različitih proizvoda i njihovih varijacija, postaje izazovno održavati neophodnu tehničku dokumentaciju, što je u predloženom pristupu moguće uraditi na automatizovan način i time značajno uštedeti vreme projektanata procesa
LIPIcs, Volume 251, ITCS 2023, Complete Volume
LIPIcs, Volume 251, ITCS 2023, Complete Volum
Spectral Sparsification for Communication-Efficient Collaborative Rotation and Translation Estimation
We propose fast and communication-efficient optimization algorithms for
multi-robot rotation averaging and translation estimation problems that arise
from collaborative simultaneous localization and mapping (SLAM),
structure-from-motion (SfM), and camera network localization applications. Our
methods are based on theoretical relations between the Hessians of the
underlying Riemannian optimization problems and the Laplacians of suitably
weighted graphs. We leverage these results to design a collaborative solver in
which robots coordinate with a central server to perform approximate
second-order optimization, by solving a Laplacian system at each iteration.
Crucially, our algorithms permit robots to employ spectral sparsification to
sparsify intermediate dense matrices before communication, and hence provide a
mechanism to trade off accuracy with communication efficiency with provable
guarantees. We perform rigorous theoretical analysis of our methods and prove
that they enjoy (local) linear rate of convergence. Furthermore, we show that
our methods can be combined with graduated non-convexity to achieve
outlier-robust estimation. Extensive experiments on real-world SLAM and SfM
scenarios demonstrate the superior convergence rate and communication
efficiency of our methods.Comment: Revised extended technical report (37 pages, 15 figures, 6 tables
Secure storage systems for untrusted cloud environments
The cloud has become established for applications that need to be scalable and highly
available. However, moving data to data centers owned and operated by a third party,
i.e., the cloud provider, raises security concerns because a cloud provider could easily
access and manipulate the data or program flow, preventing the cloud from being
used for certain applications, like medical or financial.
Hardware vendors are addressing these concerns by developing Trusted Execution
Environments (TEEs) that make the CPU state and parts of memory inaccessible from
the host software. While TEEs protect the current execution state, they do not provide
security guarantees for data which does not fit nor reside in the protected memory
area, like network and persistent storage.
In this work, we aim to address TEEs’ limitations in three different ways, first we
provide the trust of TEEs to persistent storage, second we extend the trust to multiple
nodes in a network, and third we propose a compiler-based solution for accessing
heterogeneous memory regions. More specifically,
• SPEICHER extends the trust provided by TEEs to persistent storage. SPEICHER
implements a key-value interface. Its design is based on LSM data structures, but
extends them to provide confidentiality, integrity, and freshness for the stored
data. Thus, SPEICHER can prove to the client that the data has not been tampered
with by an attacker.
• AVOCADO is a distributed in-memory key-value store (KVS) that extends the
trust that TEEs provide across the network to multiple nodes, allowing KVSs to
scale beyond the boundaries of a single node. On each node, AVOCADO carefully
divides data between trusted memory and untrusted host memory, to maximize
the amount of data that can be stored on each node. AVOCADO leverages the
fact that we can model network attacks as crash-faults to trust other nodes with
a hardened ABD replication protocol.
• TOAST is based on the observation that modern high-performance systems
often use several different heterogeneous memory regions that are not easily
distinguishable by the programmer. The number of regions is increased by the
fact that TEEs divide memory into trusted and untrusted regions. TOAST is a
compiler-based approach to unify access to different heterogeneous memory
regions and provides programmability and portability. TOAST uses a
load/store interface to abstract most library interfaces for different memory
regions
Unveiling the frontiers of deep learning: innovations shaping diverse domains
Deep learning (DL) enables the development of computer models that are
capable of learning, visualizing, optimizing, refining, and predicting data. In
recent years, DL has been applied in a range of fields, including audio-visual
data processing, agriculture, transportation prediction, natural language,
biomedicine, disaster management, bioinformatics, drug design, genomics, face
recognition, and ecology. To explore the current state of deep learning, it is
necessary to investigate the latest developments and applications of deep
learning in these disciplines. However, the literature is lacking in exploring
the applications of deep learning in all potential sectors. This paper thus
extensively investigates the potential applications of deep learning across all
major fields of study as well as the associated benefits and challenges. As
evidenced in the literature, DL exhibits accuracy in prediction and analysis,
makes it a powerful computational tool, and has the ability to articulate
itself and optimize, making it effective in processing data with no prior
training. Given its independence from training data, deep learning necessitates
massive amounts of data for effective analysis and processing, much like data
volume. To handle the challenge of compiling huge amounts of medical,
scientific, healthcare, and environmental data for use in deep learning, gated
architectures like LSTMs and GRUs can be utilized. For multimodal learning,
shared neurons in the neural network for all activities and specialized neurons
for particular tasks are necessary.Comment: 64 pages, 3 figures, 3 table
Advances and Applications of DSmT for Information Fusion. Collected Works, Volume 5
This fifth volume on Advances and Applications of DSmT for Information Fusion collects theoretical and applied contributions of researchers working in different fields of applications and in mathematics, and is available in open-access. The collected contributions of this volume have either been published or presented after disseminating the fourth volume in 2015 in international conferences, seminars, workshops and journals, or they are new. The contributions of each part of this volume are chronologically ordered.
First Part of this book presents some theoretical advances on DSmT, dealing mainly with modified Proportional Conflict Redistribution Rules (PCR) of combination with degree of intersection, coarsening techniques, interval calculus for PCR thanks to set inversion via interval analysis (SIVIA), rough set classifiers, canonical decomposition of dichotomous belief functions, fast PCR fusion, fast inter-criteria analysis with PCR, and improved PCR5 and PCR6 rules preserving the (quasi-)neutrality of (quasi-)vacuous belief assignment in the fusion of sources of evidence with their Matlab codes.
Because more applications of DSmT have emerged in the past years since the apparition of the fourth book of DSmT in 2015, the second part of this volume is about selected applications of DSmT mainly in building change detection, object recognition, quality of data association in tracking, perception in robotics, risk assessment for torrent protection and multi-criteria decision-making, multi-modal image fusion, coarsening techniques, recommender system, levee characterization and assessment, human heading perception, trust assessment, robotics, biometrics, failure detection, GPS systems, inter-criteria analysis, group decision, human activity recognition, storm prediction, data association for autonomous vehicles, identification of maritime vessels, fusion of support vector machines (SVM), Silx-Furtif RUST code library for information fusion including PCR rules, and network for ship classification.
Finally, the third part presents interesting contributions related to belief functions in general published or presented along the years since 2015. These contributions are related with decision-making under uncertainty, belief approximations, probability transformations, new distances between belief functions, non-classical multi-criteria decision-making problems with belief functions, generalization of Bayes theorem, image processing, data association, entropy and cross-entropy measures, fuzzy evidence numbers, negator of belief mass, human activity recognition, information fusion for breast cancer therapy, imbalanced data classification, and hybrid techniques mixing deep learning with belief functions as well
Automated and foundational verification of low-level programs
Formal verification is a promising technique to ensure the reliability of low-level programs like operating systems and hypervisors, since it can show the absence of whole classes of bugs and prevent critical vulnerabilities. However, to realize the full potential of formal verification for real-world low-level programs one has to overcome several challenges, including: (1) dealing with the complexities of realistic models of real-world programming languages; (2) ensuring the trustworthiness of the verification, ideally by providing foundational proofs (i.e., proofs that can be checked by a general-purpose proof assistant); and (3) minimizing the manual effort required for verification by providing a high degree of automation. This dissertation presents multiple projects that advance formal verification along these three axes: RefinedC provides the first approach for verifying C code that combines foundational proofs with a high degree of automation via a novel refinement and ownership type system. Islaris shows how to scale verification of assembly code to realistic models of modern instruction set architectures-in particular, Armv8-A and RISC-V. DimSum develops a decentralized approach for reasoning about programs that consist of components written in multiple different languages (e.g., assembly and C), as is common for low-level programs. RefinedC and Islaris rest on Lithium, a novel proof engine for separation logic that combines automation with foundational proofs.Formale Verifikation ist eine vielversprechende Technik, um die Verlässlichkeit von grundlegenden Programmen wie Betriebssystemen sicherzustellen. Um das volle Potenzial formaler Verifikation zu realisieren, müssen jedoch mehrere Herausforderungen gemeistert werden: Erstens muss die Komplexität von realistischen Modellen von Programmiersprachen wie C oder Assembler gehandhabt werden. Zweitens muss die Vertrauenswürdigkeit der Verifikation sichergestellt werden, idealerweise durch maschinenüberprüfbare Beweise. Drittens muss die Verifikation automatisiert werden, um den manuellen Aufwand zu minimieren. Diese Dissertation präsentiert mehrere Projekte, die formale Verifikation entlang dieser Achsen weiterentwickeln: RefinedC ist der erste Ansatz für die Verifikation von C Code, der maschinenüberprüfbare Beweise mit einem hohen Grad an Automatisierung vereint. Islaris zeigt, wie die Verifikation von Assembler zu realistischen Modellen von modernen Befehlssatzarchitekturen wie Armv8-A oder RISC-V skaliert werden kann. DimSum entwickelt einen neuen Ansatz für die Verifizierung von Programmen, die aus Komponenten in mehreren Programmiersprachen bestehen (z.B., C und Assembler), wie es oft bei grundlegenden Programmen wie Betriebssystemen der Fall ist. RefinedC und Islaris basieren auf Lithium, eine neue Automatisierungstechnik für Separationslogik, die maschinenüberprüfbare Beweise und Automatisierung verbindet.This research was supported in part by a Google PhD Fellowship, in part by awards from Android Security's ASPIRE program and from Google Research, and in part by a European Research Council (ERC) Consolidator Grant for the project "RustBelt", funded under the European Union’s Horizon 2020 Framework Programme (grant agreement no. 683289)
Computational Approaches to Drug Profiling and Drug-Protein Interactions
Despite substantial increases in R&D spending within the pharmaceutical industry, denovo drug design has become a time-consuming endeavour. High attrition rates led to a
long period of stagnation in drug approvals. Due to the extreme costs associated with
introducing a drug to the market, locating and understanding the reasons for clinical failure
is key to future productivity. As part of this PhD, three main contributions were made in
this respect. First, the web platform, LigNFam enables users to interactively explore
similarity relationships between ‘drug like’ molecules and the proteins they bind. Secondly,
two deep-learning-based binding site comparison tools were developed, competing with
the state-of-the-art over benchmark datasets. The models have the ability to predict offtarget interactions and potential candidates for target-based drug repurposing. Finally, the
open-source ScaffoldGraph software was presented for the analysis of hierarchical scaffold
relationships and has already been used in multiple projects, including integration into a
virtual screening pipeline to increase the tractability of ultra-large screening experiments.
Together, and with existing tools, the contributions made will aid in the understanding of
drug-protein relationships, particularly in the fields of off-target prediction and drug
repurposing, helping to design better drugs faster
Coping with low data availability for social media crisis message categorisation
During crisis situations, social media allows people to quickly share
information, including messages requesting help. This can be valuable to
emergency responders, who need to categorise and prioritise these messages
based on the type of assistance being requested. However, the high volume of
messages makes it difficult to filter and prioritise them without the use of
computational techniques. Fully supervised filtering techniques for crisis
message categorisation typically require a large amount of annotated training
data, but this can be difficult to obtain during an ongoing crisis and is
expensive in terms of time and labour to create.
This thesis focuses on addressing the challenge of low data availability when
categorising crisis messages for emergency response. It first presents domain
adaptation as a solution for this problem, which involves learning a
categorisation model from annotated data from past crisis events (source
domain) and adapting it to categorise messages from an ongoing crisis event
(target domain). In many-to-many adaptation, where the model is trained on
multiple past events and adapted to multiple ongoing events, a multi-task
learning approach is proposed using pre-trained language models. This approach
outperforms baselines and an ensemble approach further improves performance..
- …