Determining the Limits of Automated Program Recognition
This working paper was submitted as a Ph.D. thesis proposal. Program recognition is a program understanding technique in which stereotypic computational structures are identified in a program. From this identification and the known relationships between the structures, a hierarchical description of the program's design is recovered. The feasibility of this technique for small programs has been shown by several researchers. However, it seems unlikely that the existing program recognition systems will scale up to realistic, full-sized programs without some guidance (e.g., from a person using the recognition system as an assistant). One reason is that there are limits to what can be recovered by a purely code-driven approach. Some of the information about the program that is useful to know for common software engineering tasks, particularly maintenance, is missing from the code. Another reason guidance must be provided is to reduce the cost of recognition. To determine what guidance is appropriate, therefore, we must know what information is recoverable from the code and where the complexity of program recognition lies. I propose to study the limits of program recognition, both empirically and analytically. First, I will build an experimental system that performs recognition on realistic programs on the order of thousands of lines. This will allow me to characterize the information that can be recovered by this code-driven technique. Second, I will formally analyze the complexity of the recognition process. This will help determine how guidance can be applied most profitably to improve the efficiency of program recognition.
MIT Artificial Intelligence Laboratory
Towards the Model-Driven Engineering of Secure yet Safe Embedded Systems
We introduce SysML-Sec, a SysML-based Model-Driven Engineering environment
aimed at fostering the collaboration between system designers and security
experts at all methodological stages of the development of an embedded system.
A central issue in the design of an embedded system is the definition of the
hardware/software partitioning of the architecture of the system, which should
take place as early as possible. SysML-Sec aims to extend the relevance of this
analysis through the integration of security requirements and threats. In
particular, we propose an agile methodology whose aim is to assess early on the
impact of the security requirements and of the security mechanisms designed to
satisfy them over the safety of the system. Security concerns are captured in a
component-centric manner through existing SysML diagrams with only minimal
extensions. After the captured requirements have been refined into security and
cryptographic mechanisms, security properties can be formally verified over
this design. To perform the latter, model transformation techniques are
implemented in the SysML-Sec toolchain in order to derive a ProVerif
specification from the SysML models. An automotive firmware flashing procedure
serves as a guiding example throughout our presentation.
Comment: In Proceedings GraMSec 2014, arXiv:1404.163
Copy mechanism and tailored training for character-based data-to-text generation
In the last few years, many different methods have been focusing on using
deep recurrent neural networks for natural language generation. The most widely
used sequence-to-sequence neural methods are word-based: as such, they need a
pre-processing step called delexicalization (conversely, relexicalization) to
deal with uncommon or unknown words. These forms of processing, however, give
rise to models that depend on the vocabulary used and are not completely
neural.
In this work, we present an end-to-end sequence-to-sequence model with
attention mechanism that reads and generates at the character level, no longer
requiring delexicalization, tokenization, or even lowercasing. Moreover, since
characters constitute the common "building blocks" of every text, it also
allows a more general approach to text generation, enabling the possibility to
exploit transfer learning for training. These skills are obtained thanks to two
major features: (i) the possibility to alternate between the standard
generation mechanism and a copy one, which allows input facts to be copied
directly into the output, and (ii) the use of an original training pipeline
that further improves the quality of the generated texts.
We also introduce a new dataset called E2E+, designed to highlight the
copying capabilities of character-based models, that is a modified version of
the well-known E2E dataset used in the E2E Challenge. We tested our model
according to five broadly accepted metrics (including the widely used BLEU),
showing that it yields competitive performance with respect to both
character-based and word-based approaches.
Comment: ECML-PKDD 2019 (Camera ready version)
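The alternating generate/copy mechanism described in the abstract can be sketched as a pointer-generator-style gate; the function names, the toy vocabulary, and the scalar gate below are our illustrative assumptions, not the paper's exact architecture:

```python
# Minimal sketch of a copy-vs-generate gate for character-level
# data-to-text generation; names and shapes are assumptions for
# illustration, not the authors' implementation.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def output_distribution(gen_logits, copy_scores, input_char_ids, vocab_size, p_gen):
    """Blend a generation distribution over the character vocabulary with a
    copy distribution over the input characters, weighted by the gate p_gen."""
    gen_dist = softmax(gen_logits)        # distribution over the vocabulary
    attn = softmax(copy_scores)           # attention over input positions
    copy_dist = np.zeros(vocab_size)
    for pos, char_id in enumerate(input_char_ids):
        copy_dist[char_id] += attn[pos]   # scatter attention mass onto chars
    return p_gen * gen_dist + (1.0 - p_gen) * copy_dist
```

With a uniform generator and the gate at 0.5, characters present in the input receive extra probability mass, which is what lets rare input facts be reproduced verbatim.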
EdgeCalib: Multi-Frame Weighted Edge Features for Automatic Targetless LiDAR-Camera Calibration
In multimodal perception systems, achieving precise extrinsic calibration
between LiDAR and camera is of critical importance. Previous calibration
methods often required specific targets or manual adjustments, making them both
labor-intensive and costly. Online calibration methods based on features have
been proposed, but these methods encounter challenges such as imprecise feature
extraction, unreliable cross-modality associations, and high scene-specific
requirements. To address this, we introduce an edge-based approach for
automatic online calibration of LiDAR and cameras in real-world scenarios. The
edge features, which are prevalent in various environments, are aligned in both
images and point clouds to determine the extrinsic parameters. Specifically,
stable and robust image edge features are extracted using a SAM-based method
and the edge features extracted from the point cloud are weighted through a
multi-frame weighting strategy for feature filtering. Finally, accurate
extrinsic parameters are optimized based on edge correspondence constraints. We
conducted evaluations on both the KITTI dataset and our dataset. The results
show a state-of-the-art rotation accuracy of 0.086° and a translation
accuracy of 0.977 cm, outperforming existing edge-based calibration methods in
both precision and robustness.
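The edge-correspondence optimization described above can be sketched as a weighted reprojection cost, assuming a pinhole camera model and a distance transform of the image edge mask; the function names and the distance-transform shortcut are our assumptions, not the paper's exact formulation:

```python
# Illustrative sketch of a weighted edge-alignment cost for LiDAR-camera
# extrinsic calibration; a solver would minimize this over (R, t).
import numpy as np

def edge_alignment_cost(points_lidar, weights, R, t, K, edge_dist_map):
    """Sum of weighted distances from projected LiDAR edge points to the
    nearest image edge; edge_dist_map is a distance transform of the
    image edge mask (0 on edge pixels, growing with distance)."""
    cam = (R @ points_lidar.T).T + t            # LiDAR frame -> camera frame
    front = cam[:, 2] > 0                       # keep points in front of camera
    cam, w = cam[front], weights[front]
    uv = (K @ cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]                 # perspective divide -> pixels
    h, wid = edge_dist_map.shape
    u = np.clip(np.round(uv[:, 0]).astype(int), 0, wid - 1)
    v = np.clip(np.round(uv[:, 1]).astype(int), 0, h - 1)
    return float((w * edge_dist_map[v, u]).sum())
```

The per-point weights here correspond to the multi-frame weighting of point-cloud edge features: stable edges contribute more, so noisy detections do not dominate the extrinsic estimate.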