JVM-hosted languages: They talk the talk, but do they walk the walk?
The rapid adoption of non-Java JVM languages is impressive: major international corporations are staking critical parts of their software infrastructure on components built from languages such as
Scala and Clojure. However, with the possible exception of Scala,
there has been little academic consideration and characterization
of these languages to date. In this paper, we examine four non-Java JVM languages and use exploratory data analysis techniques
to investigate differences in their dynamic behavior compared to
Java. We analyse a variety of programs and levels of behavior to
draw distinctions between the different programming languages.
We briefly discuss the implications of our findings for improving
the performance of JIT compilation and garbage collection on the
JVM platform.
Dependability Metrics: Research Workshop Proceedings
Justifying reliance on computer systems is based on some form of evidence about such systems. This in turn implies the existence of scientific techniques to derive such evidence from given systems, or to predict such evidence for systems. In a general sense, these techniques imply a form of measurement. The workshop "Dependability Metrics", which was held on November 10, 2008, at the University of Mannheim, dealt with all aspects of measuring dependability.
Identification of Malicious Android Applications using Kernel Level System Calls
With the advancement of technology, smartphones are gaining popularity by increasing their computational power and incorporating a large variety of new sensors and features that can be utilized by application developers in order to improve the user experience. On the other hand, this widespread use of smartphones and their increased capabilities have also attracted the attention of malware writers, who shifted their focus from the desktop environment and started creating malware applications dedicated to smartphones. With about 1.5 million Android device activations per day and billions of application installations from the official Android market (Google Play), Android is becoming one of the most widely used operating systems for smartphones and tablets. Most of the threats for Android come from applications installed from third-party markets, which lack proper mechanisms to detect malicious applications that can leak users' private information, send SMS messages to premium numbers, or get root access to the system.
In this thesis, our work is divided into two main components. In the first, we provide a framework to perform off-line analysis of Android applications using static and dynamic analysis approaches. In the static analysis phase, we decompile the analyzed application and extract the permissions from its ‘AndroidManifest’ file, whereas in the dynamic analysis phase, we execute the target application on an Android emulator, where the ‘strace’ tool is used to hook the system calls on the ‘zygote’ process and record all the calls invoked by the application. The extracted features from both the static and dynamic analysis modules are then used to classify the tested applications using a variety of classification algorithms.
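The permission-extraction step of the static analysis can be sketched in Python. This is a minimal illustration only: it assumes the manifest has already been decoded from Android's binary AXML form into plain XML, and `extract_permissions` is a hypothetical helper, not part of the thesis prototype.

```python
import xml.etree.ElementTree as ET

# Android manifest attributes live in this XML namespace.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"

def extract_permissions(manifest_xml: str) -> list[str]:
    """Return the permission names declared in a decoded AndroidManifest.xml."""
    root = ET.fromstring(manifest_xml)
    return [
        elem.get(ANDROID_NS + "name")
        for elem in root.iter("uses-permission")
    ]

manifest = """
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
          package="com.example.app">
    <uses-permission android:name="android.permission.INTERNET"/>
    <uses-permission android:name="android.permission.READ_SMS"/>
</manifest>
"""
print(extract_permissions(manifest))
# prints ['android.permission.INTERNET', 'android.permission.READ_SMS']
```

The resulting permission list can then serve directly as a feature vector for the classification algorithms mentioned above.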
In the second part, our aim is to provide real-time monitoring of Android application behavior and alert users to applications that violate a predefined security policy by trying to access private information such as GPS locations and SMS-related information. In order to achieve this, we use a loadable kernel module for tracking the kernel-level system calls.
The effectiveness of the developed prototypes is confirmed by testing them on popular applications collected from F-Droid, and on malware samples obtained from third-party markets and the Android Malware Genome Project dataset.
Improving Reproducibility in Smart Contract Research
The most popular smart contract-based blockchain platform at the moment is Ethereum.
Based on market value, it is the second-largest blockchain platform behind Bitcoin, with a steadily increasing market share. Ethereum smart contracts are used to secure billions of dollars worth of assets.
Because smart contracts cannot be modified after deployment, their source code must be examined for any potential flaws that could result in significant financial losses and damage trust.
A wide range of tools have been developed for this goal, and an extensive literature on vulnerabilities and detection techniques keeps emerging.
The analysis, testing, and debugging of smart contracts through automated processes have also been the subject of extensive research.
Researchers have worked on the development of tools that can automatically detect and fix vulnerabilities in smart contracts, especially tools that rely on less explored methodologies, such as machine learning-based tools.
We provide details on our work on \slithersimil, a statistical addition to a static analyzer, as a data-driven endeavor to complement the existing security analysis methods of smart contracts.
\slithersimil~allows developers and auditors to check the similarity between source code snippets of smart contracts written in Solidity, and to check a contract against a database of known-vulnerable smart contracts through the same similarity-checking mechanism, in order to facilitate the discovery of security vulnerabilities in smart contracts.
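As a deliberately simplified illustration of similarity checking over Solidity snippets, consider a token-level Jaccard measure. This is not \slithersimil's actual algorithm (which operates on the static analyzer's intermediate representation); it only shows the general idea of scoring a candidate snippet against a known-vulnerable one.

```python
import re

def tokenize(source: str) -> set[str]:
    """Crude lexical tokenization of a Solidity snippet."""
    return set(re.findall(r"[A-Za-z_]\w*|[^\s\w]", source))

def similarity(a: str, b: str) -> float:
    """Jaccard similarity over token sets, in [0, 1]."""
    ta, tb = tokenize(a), tokenize(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0

# Hypothetical snippets: a known reentrancy-prone withdraw vs. a candidate.
vulnerable = 'function withdraw() public { msg.sender.call{value: balance}(""); }'
candidate  = 'function withdraw() public { msg.sender.call{value: amount}(""); }'
print(round(similarity(vulnerable, candidate), 2))
# prints 0.88
```

A high score against a vulnerable-contract database would flag the candidate snippet for closer manual review.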
However, such automated analysis tools typically need datasets for their training, testing, and validation phases; collecting such data for smart contracts is time-consuming.
Moreover, it is difficult and time-consuming to replicate the findings of most prior empirical studies, or to contrast one's findings with those of others who have researched these topics.
The datasets that research studies do offer are frequently sparse and come with minimal to no usage guidance.
Due to the fast-paced nature of the Ethereum ecosystem, the datasets available are often quickly outdated.
These are significant barriers to performing verifiable, reproducible research, as it takes a substantial amount of time to accomplish many subtasks such as locating, extracting, cleaning, and categorizing a reasonable amount of high-quality, heterogeneous smart contract data.
To address this issue, we introduce \etherbase, an extensible, queryable, and user-friendly database of smart contracts and their metrics that improves reproducibility and benchmarking in smart contract research.
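The kind of reproducible benchmark selection such a database enables can be sketched with an in-memory SQL example. The table layout, column names, and data here are entirely hypothetical, not \etherbase's actual schema.

```python
import sqlite3

# Hypothetical schema for illustration; the real database layout may differ.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE contracts (
    address TEXT PRIMARY KEY,
    compiler_version TEXT,
    loc INTEGER,          -- lines of Solidity source
    tx_count INTEGER      -- transactions observed on-chain
)""")
con.executemany(
    "INSERT INTO contracts VALUES (?, ?, ?, ?)",
    [("0xaaa...", "0.8.17", 420, 15000),
     ("0xbbb...", "0.4.24", 95, 12)],
)
# Benchmark-style query: recently compiled contracts with real on-chain usage.
rows = con.execute(
    "SELECT address FROM contracts "
    "WHERE compiler_version LIKE '0.8.%' AND tx_count > 1000"
).fetchall()
print(rows)  # [('0xaaa...',)]
```

Expressing dataset selection as a query like this makes the exact benchmark population reproducible by other researchers.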
AXMEDIS 2008
The AXMEDIS International Conference series aims to explore all subjects and topics related to cross-media and digital-media content production, processing, management, standards, representation, sharing, protection, and rights management, and to address the latest developments and future trends of the technologies and their applications, impacts, and exploitation. The AXMEDIS events offer venues for exchanging concepts, requirements, prototypes, research ideas, and findings which could contribute to academic research and also benefit business and industrial communities. In the Internet and digital era, cross-media production and distribution represent key developments and innovations that are fostered by emergent technologies to ensure better value for money while optimising productivity and market coverage.
Deep specification mining
Singapore National Research Foundation
Machine learning for executable code in software testing and verification
Software testing and verification are essential for keeping software systems reliable and safe to use. However, writing and maintaining the code artifacts needed for testing and verification, i.e., tests and proofs, requires significant manual effort. Under pressure to develop software in limited time, developers usually write tests and proofs much later than the code under test/verification, which leaves room for software bugs in unchecked code. Recent advances in machine learning (ML) models, especially large language models (LLMs), can help reduce the manual effort of testing and verification: developers can use ML models’ predictions to write tests and proofs faster. However, existing models understand and generate software code as natural language text, ignoring the unique property of software being executable. Software execution is the process of a computer reading and acting on software code. Our insight is that ML models can greatly benefit from software execution, e.g., by inspecting and simulating the execution process, to generate more accurate predictions. Integrating execution with ML models is important for generating tests and proofs because ML models using only syntax-level information do not perform well on these tasks. This dissertation presents the design and implementation of two execution-guided ML models to improve developers’ productivity in writing testing and verification code: TeCo for test completion and Roosterize for lemma naming. First, this dissertation introduces TeCo to aid developers in completing the next statement when writing tests, a task we formalize as test completion. TeCo exploits code semantics extracted from test execution results (e.g., local variable types) and execution context (e.g., the last called method). TeCo also reranks the ML model’s predictions by executing the predicted statements, to prioritize functionally correct predictions.
Compared to existing code completion models that use only syntax-level information (including LLMs trained on massive code datasets), TeCo improves the accuracy of test completion by 29%.
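The execution-based reranking idea can be sketched as a toy Python stand-in. TeCo itself reranks Java test statements by running them through the project's build and test harness; here we only illustrate the principle of promoting candidates that execute without error.

```python
def rerank_by_execution(candidates: list[str], env: dict) -> list[str]:
    """Stable-sort candidate statements so that ones that execute cleanly come first.

    A toy stand-in for execution-guided reranking; `env` holds the names
    visible at the completion point.
    """
    def executes(stmt: str) -> bool:
        try:
            exec(stmt, dict(env))  # run against an isolated copy of the environment
            return True
        except Exception:
            return False
    return sorted(candidates, key=lambda s: not executes(s))

# Hypothetical model outputs: the first calls a string method that doesn't exist.
candidates = [
    "assert result.uppercase() == 'OK'",  # AttributeError at run time
    "assert result.upper() == 'OK'",
]
print(rerank_by_execution(candidates, {"result": "ok"}))
```

The stable sort preserves the model's original ranking among candidates that behave the same under execution, so execution acts as a filter layered on top of the model's confidence.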
Second, this dissertation introduces Roosterize to suggest lemma names when developers write proofs using proof assistants, such as Coq. Consistent coding conventions become important as verification projects based on proof assistants grow larger, but manually enforcing the conventions can be costly. Existing ML models for method naming, a similar task in other programming languages, extract and summarize information from code tokens, which is not suitable for Coq, where lemma names should exhibit semantic meanings that are not explicit in code tokens. Roosterize leverages the execution representations of the lemma from various phases of the proof assistant’s execution, including syntax trees from the parser and elaborated terms from the kernel. Roosterize improves the accuracy of lemma naming by 39% compared to baselines. The findings in this dissertation show that integration with execution can effectively improve the accuracy of ML models for testing and verification, enabling the development of trustworthy software with high-quality tests and proofs and less manual effort.
Electrical and Computer Engineering