Search CORE

3,117 research outputs found

Online inverse reinforcement learning with unknown disturbances

Author: ioannou
levine
levine
neu
ng
self
syed
ziebart
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 08/03/2020

This paper addresses the problem of online inverse reinforcement learning for nonlinear systems with modeling uncertainties in the presence of unknown disturbances. The developed approach observes state and input trajectories for an agent and identifies the unknown reward function online. Sub-optimality introduced in the observed trajectories by the unknown external disturbance is compensated for using a novel model-based inverse reinforcement learning approach. The observer estimates the external disturbances and uses the resulting estimates to learn the dynamic model of the demonstrator. The learned demonstrator model, along with the observed suboptimal trajectories, is used to implement inverse reinforcement learning. Theoretical guarantees are provided using Lyapunov theory, and a simulation example is shown to demonstrate the effectiveness of the proposed technique. Comment: 8 pages, 3 figures

arXiv.org e-Print Archive

Crossref

Unknown Piecewise Constant Parameters Identification with Exponential Rate of Convergence

Author: Glushchenko Anton
Lastochkin Konstantin
Publication date: 04/08/2022

The scope of this research is the identification of unknown piecewise constant parameters of a linear regression equation under the finite excitation condition. Unlike known methods, the developed procedure uses only one model to identify all switching states of the regression, which lowers the computational burden. The contribution is twofold. First, we propose a new, truly online estimation algorithm based on the well-known DREM approach to detect the switching time and preserve time alertness with an adjustable detection delay. Second, although the switching signal function is unknown, an adaptive law is derived that provides global exponential convergence of the regression parameters to their true values, provided the regressor is finitely exciting somewhere inside the time interval between two consecutive parameter switches. The robustness of the proposed identification procedure to external disturbances is proved analytically. Its effectiveness is demonstrated via numerical experiments, in which both abstract regressions and a second-order plant model are used. Comment: 31 pages, 12 figures

arXiv.org e-Print Archive

An Outline of a Proposed System that Learns from Experts How to Discharge Proof Obligations Automatically

Author: Bundy Alan
Grov Gudmund
Jones Cliff B.
Publication date: 01/01/2009

Edinburgh Research Explorer

Online List Labeling with Predictions

Author: McCauley Samuel
Moseley Benjamin
Niaparast Aidin
Singh Shikha
Publication date: 20/06/2023

A growing line of work shows how learned predictions can be used to break through worst-case barriers to improve the running time of an algorithm. However, incorporating predictions into data structures with strong theoretical guarantees remains underdeveloped. This paper takes a step in this direction by showing that predictions can be leveraged in the fundamental online list labeling problem. In the problem, n items arrive over time and must be stored in sorted order in an array of size Theta(n). The array slot of an element is its label, and the goal is to maintain sorted order while minimizing the total number of elements moved (i.e., relabeled). We design a new list labeling data structure and bound its performance in two models. In the worst-case learning-augmented model, we give guarantees in terms of the error in the predictions. Our data structure provides strong guarantees: it is optimal for any prediction error and guarantees the best-known worst-case bound even when the predictions are entirely erroneous. We also consider a stochastic error model and bound the performance in terms of the expectation and variance of the error. Finally, the theoretical results are demonstrated empirically. In particular, we show that our data structure has strong performance on real temporal data sets where predictions are constructed from elements that arrived in the past, as is typically done in a practical use case.
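
To make the problem statement in this abstract concrete, the following is a minimal Python sketch of a naive list labeling array that counts how many elements get moved. It is not the data structure from the paper and uses no predictions; the class name `NaiveListLabeling`, the midpoint placement rule, and the nearest-gap shifting rule are all assumptions chosen only for illustration, and items are assumed to be distinct integers.

```python
from typing import Dict, List, Optional


class NaiveListLabeling:
    """Hypothetical baseline: sorted items in a fixed array with gaps."""

    def __init__(self, capacity: int) -> None:
        self.slots: List[Optional[int]] = [None] * capacity
        self.moves = 0  # total number of elements relabeled so far

    def labels(self) -> Dict[int, int]:
        """Map each stored item to its label (its array slot)."""
        return {v: i for i, v in enumerate(self.slots) if v is not None}

    def insert(self, item: int) -> None:
        cap = len(self.slots)
        pred, succ = -1, cap  # slots of the neighbours of `item`
        for i, v in enumerate(self.slots):
            if v is None:
                continue
            if v < item:
                pred = i
            elif succ == cap:
                succ = i  # first occupied slot holding a larger item

        # Case 1: a free slot exists between the neighbours; no moves.
        if succ - pred > 1:
            self.slots[(pred + succ) // 2] = item
            return

        # Case 2: shift the successors right into the nearest free slot.
        free = next((j for j in range(succ, cap) if self.slots[j] is None), None)
        if free is not None:
            for j in range(free, succ, -1):
                self.slots[j] = self.slots[j - 1]
                self.moves += 1
            self.slots[succ] = item
            return

        # Case 3: shift the predecessors left into the nearest free slot.
        free = next((j for j in range(pred, -1, -1) if self.slots[j] is None), None)
        if free is None:
            raise OverflowError("array is full")
        for j in range(free, pred):
            self.slots[j] = self.slots[j + 1]
            self.moves += 1
        self.slots[pred] = item


if __name__ == "__main__":
    arr = NaiveListLabeling(capacity=16)  # array of size 2n for n = 8 items
    for x in range(1, 9):                 # ascending arrivals crowd one end
        arr.insert(x)
    print(arr.labels(), "moves:", arr.moves)
```

The `moves` counter corresponds to the cost the abstract describes (elements relabeled). A smarter structure would aim to keep this count low for every arrival order, which the naive shifting rule above does not attempt.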

arXiv.org e-Print Archive

Mechanizing Webassembly Proposals

Author: Mischka Jacob Richard
Publication venue: UWM Digital Commons
Publication date: 01/08/2020

WebAssembly is a modern low-level programming language designed to provide high performance and security. To enable these goals, the language specifies a relatively small number of low-level types, instructions, and language constructs. The language is proven to be sound with respect to its types and execution, and a separate mechanized formalization of the specification and type soundness proofs confirms this. As an emerging technology, the language is continuously being developed, with modifications proposed and discussed openly and frequently. To ensure that the soundness properties exhibited by the original core language are maintained as WebAssembly evolves, these proposals should also be mechanized and verified to be sound. This work extends the existing Isabelle mechanization to include three such proposals that add features to the language, and shows that the language maintains its soundness properties with their inclusion.

University of Wisconsin-Milwaukee

IST Austria Thesis

Author: Tarrach Thorsten
Publication venue: IST Austria
Publication date: 01/01/2016

In this thesis we present a computer-aided programming approach to concurrency. Our approach helps the programmer by automatically fixing concurrency-related bugs, i.e., bugs that occur when the program is executed using an aggressive preemptive scheduler, but not when using a non-preemptive (cooperative) scheduler. Bugs are program behaviours that are incorrect w.r.t. a specification. We consider both user-provided explicit specifications in the form of assertion statements in the code as well as an implicit specification. The implicit specification is inferred from the non-preemptive behaviour. Let us consider sequences of calls that the program makes to an external interface. The implicit specification requires that any such sequence produced under a preemptive scheduler should be included in the set of sequences produced under a non-preemptive scheduler. We consider several semantics-preserving fixes that go beyond the atomic sections typically explored in the synchronisation synthesis literature. Our synthesis is able to place locks, barriers and wait-signal statements and, last but not least, reorder independent statements. The latter may be useful if a thread is released too early, e.g., before some initialisation is completed. We guarantee that our synthesis does not introduce deadlocks and that the synchronisation inserted is optimal w.r.t. a given objective function. We dub our solution trace-based synchronisation synthesis; it is loosely based on counterexample-guided inductive synthesis (CEGIS). The synthesis works by discovering a trace that is incorrect w.r.t. the specification and identifying ordering constraints crucial to trigger the specification violation. Synchronisation may be placed immediately (greedy approach) or delayed until all incorrect traces are found (non-greedy approach). For the non-greedy approach we construct a set of global constraints over synchronisation placements. Each model of the global constraints set corresponds to a correctness-ensuring synchronisation placement. The placement that is optimal w.r.t. the given objective function is chosen as the synchronisation solution. We evaluate our approach on a number of realistic (albeit simplified) Linux device-driver benchmarks. The benchmarks are versions of the drivers with known concurrency-related bugs. For the experiments with an explicit specification we added assertions that would detect the bugs in the experiments. Device drivers lend themselves to implicit specification, where the device and the operating system are the external interfaces. Our experiments demonstrate that our synthesis method is precise and efficient. We implemented objective functions for coarse-grained and fine-grained locking and observed that different synchronisation placements are produced for our experiments, favouring e.g. a minimal number of synchronisation operations or maximum concurrency.

IST Austria: PubRep (Institute of Science and Technology)

Using system call analysis to stop evasion attacks in anomaly based Intrusion Detection System

Author: Samant Ashish Liladhar
Publication venue: RIT Scholar Works
Publication date: 01/01/2004

Intrusion Detection Systems (IDSs) that operate on the principle of system call monitoring are known to be susceptible to mimicry or evasion attacks. It has been shown that an intelligent adversary armed with comprehensive knowledge of the target system or network can penetrate these targets, hide his presence from the IDS, and continue to carry out damage. IDSs that use system calls to define normal behavior often leave out complementary information about them, and intruders exploit precisely this drawback to deceive the IDS. This thesis investigates the vulnerabilities of a system call based IDS and carries out a theoretical and experimental study of methods for improving IDS performance and reliability. It analyzes the design principles and architecture of anomaly based IDSs and studies the implementation of a typical system call based anomaly IDS. This category of anomaly detection systems is currently attracting considerable attention within the research community, and various prototypes have been developed in recent years. The thesis investigates the hypothesis that by monitoring the number of system calls that fail and return error values on a per process basis, it would be possible to identify abnormally behaving processes. It also suggests that by using only a certain set of critical system calls instead of all the defined calls, it could be possible to detect and stop mimicry attacks. pH IDS is used for the experiments because its source code is freely available. It works as a patch to the Linux kernel and alters the way system calls are handled. The tests were carried out on a stand-alone Linux box running RedHat 9 with kernel version 2.4.20. Local exploits, which were readily available on the Internet, were used in the experiments. Some of the results obtained contradicted our original hypothesis and are indicative of the scope for future work in this area. The tests revealed that it was not possible to simply use system call return values to identify erroneously behaving processes. However, after classifying the system calls into critical and non-critical sets, a form of mimicry attack could be successfully detected. The results confirm the potential of this technique to thwart evasion attacks and point to the direction of possible further work in this area.

RIT Scholar Works