14,563 research outputs found
HyBIS: Windows Guest Protection through Advanced Memory Introspection
Effectively protecting the Windows OS is a challenging task, since most
implementation details are not publicly known. Windows has always been the main
target of malwares that have exploited numerous bugs and vulnerabilities.
Recent trusted boot and additional integrity checks have rendered the Windows
OS less vulnerable to kernel-level rootkits. Nevertheless, guest Windows
Virtual Machines are becoming an increasingly interesting attack target. In
this work we introduce and analyze a novel Hypervisor-Based Introspection
System (HyBIS) we developed for protecting Windows OSes from malware and
rootkits. The HyBIS architecture is motivated and detailed, while targeted
experimental results show its effectiveness. Comparison with related work
highlights main HyBIS advantages such as: effective semantic introspection,
support for 64-bit architectures and for latest Windows (8.x and 10), advanced
malware disabling capabilities. We believe the research effort reported here
will pave the way to further advances in the security of Windows OSes
An Empirical Study on Android for Saving Non-shared Data on Public Storage
With millions of apps that can be downloaded from official or third-party
market, Android has become one of the most popular mobile platforms today.
These apps help people in all kinds of ways and thus have access to lots of
user's data that in general fall into three categories: sensitive data, data to
be shared with other apps, and non-sensitive data not to be shared with others.
For the first and second type of data, Android has provided very good storage
models: an app's private sensitive data are saved to its private folder that
can only be access by the app itself, and the data to be shared are saved to
public storage (either the external SD card or the emulated SD card area on
internal FLASH memory). But for the last type, i.e., an app's non-sensitive and
non-shared data, there is a big problem in Android's current storage model
which essentially encourages an app to save its non-sensitive data to shared
public storage that can be accessed by other apps. At first glance, it seems no
problem to do so, as those data are non-sensitive after all, but it implicitly
assumes that app developers could correctly identify all sensitive data and
prevent all possible information leakage from private-but-non-sensitive data.
In this paper, we will demonstrate that this is an invalid assumption with a
thorough survey on information leaks of those apps that had followed Android's
recommended storage model for non-sensitive data. Our studies showed that
highly sensitive information from billions of users can be easily hacked by
exploiting the mentioned problematic storage model. Although our empirical
studies are based on a limited set of apps, the identified problems are never
isolated or accidental bugs of those apps being investigated. On the contrary,
the problem is rooted from the vulnerable storage model recommended by Android.
To mitigate the threat, we also propose a defense framework
On the Effect of Semantically Enriched Context Models on Software Modularization
Many of the existing approaches for program comprehension rely on the
linguistic information found in source code, such as identifier names and
comments. Semantic clustering is one such technique for modularization of the
system that relies on the informal semantics of the program, encoded in the
vocabulary used in the source code. Treating the source code as a collection of
tokens loses the semantic information embedded within the identifiers. We try
to overcome this problem by introducing context models for source code
identifiers to obtain a semantic kernel, which can be used for both deriving
the topics that run through the system as well as their clustering. In the
first model, we abstract an identifier to its type representation and build on
this notion of context to construct contextual vector representation of the
source code. The second notion of context is defined based on the flow of data
between identifiers to represent a module as a dependency graph where the nodes
correspond to identifiers and the edges represent the data dependencies between
pairs of identifiers. We have applied our approach to 10 medium-sized open
source Java projects, and show that by introducing contexts for identifiers,
the quality of the modularization of the software systems is improved. Both of
the context models give results that are superior to the plain vector
representation of documents. In some cases, the authoritativeness of
decompositions is improved by 67%. Furthermore, a more detailed evaluation of
our approach on JEdit, an open source editor, demonstrates that inferred topics
through performing topic analysis on the contextual representations are more
meaningful compared to the plain representation of the documents. The proposed
approach in introducing a context model for source code identifiers paves the
way for building tools that support developers in program comprehension tasks
such as application and domain concept location, software modularization and
topic analysis
Interaction-aware development environments: recording, mining, and leveraging IDE interactions to analyze and support the development flow
Nowadays, software development is largely carried out using Integrated Development Environments, or IDEs. An IDE is a collection of tools and facilities to support the most diverse software engineering activities, such as writing code, debugging, and program understanding. The fact that they are integrated enables developers to find all the tools needed for the development in the same place. Each activity is composed of many basic events, such as clicking on a menu item in the IDE, opening a new user interface to browse the source code of a method, or adding a new statement in the body of a method. While working, developers generate thousands of these interactions, that we call fine-grained IDE interaction data. We believe this data is a valuable source of information that can be leveraged to enable better analyses and to offer novel support to developers. However, this data is largely neglected by modern IDEs. In this dissertation we propose the concept of "Interaction-Aware Development Environments": IDEs that collect, mine, and leverage the interactions of developers to support and simplify their workflow. We formulate our thesis as follows: Interaction-Aware Development Environments enable novel and in- depth analyses of the behavior of software developers and set the ground to provide developers with effective and actionable support for their activities inside the IDE. For example, by monitoring how developers navigate source code, the IDE could suggest the program entities that are potentially relevant for a particular task. Our research focuses on three main directions: 1. Modeling and Persisting Interaction Data. The first step to make IDEs aware of interaction data is to overcome its ephemeral nature. To do so we have to model this new source of data and to persist it, making it available for further use. 2. Interpreting Interaction Data. One of the biggest challenges of our research is making sense of the millions of interactions generated by developers. We propose several models to interpret this data, for example, by reconstructing high-level development activities from interaction histories or measure the navigation efficiency of developers. 3. Supporting Developers with Interaction Data. Novel IDEs can use the potential of interaction data to support software development. For example, they can identify the UI components that are potentially unnecessary for the future and suggest developers to close them, reducing the visual cluttering of the IDE
An Empirical Investigation for Understanding
While working on modernization of large monolithic application; speed , synchronization and interaction with other components are the major concern for practical implementation of target system; as Service-Oriented Computing extends and covering many sections of monolithic legacy to web oriented development, these aspects becoming a new challenges to existing software engineering practices, the paper presents work which is undertaken for service orientation of monolithic legacy application including initial steps of service understanding, comprehension and extraction so that it can take a part in further migration activities to service oriented architecture platform. The work also shows that how several useful techniques can be applied to accomplish the result
Computational Sociolinguistics: A Survey
Language is a social phenomenon and variation is inherent to its social
nature. Recently, there has been a surge of interest within the computational
linguistics (CL) community in the social dimension of language. In this article
we present a survey of the emerging field of "Computational Sociolinguistics"
that reflects this increased interest. We aim to provide a comprehensive
overview of CL research on sociolinguistic themes, featuring topics such as the
relation between language and social identity, language use in social
interaction and multilingual communication. Moreover, we demonstrate the
potential for synergy between the research communities involved, by showing how
the large-scale data-driven methods that are widely used in CL can complement
existing sociolinguistic studies, and how sociolinguistics can inform and
challenge the methods and assumptions employed in CL studies. We hope to convey
the possible benefits of a closer collaboration between the two communities and
conclude with a discussion of open challenges.Comment: To appear in Computational Linguistics. Accepted for publication:
18th February, 201
Recommended from our members
Combining Static and Dynamic Analysis for Bug Detection and Program Understanding
This work proposes new combinations of static and dynamic analysis for bug detection and program understanding. There are 3 related but largely independent directions: a) In the area of dynamic invariant inference, we improve the consistency of dynamically discovered invariants by taking into account second-order constraints that encode knowledge aboutinvariants; the second-order constraints are either supplied by the programmer or vetted by the programmer (among candidate constraints suggested automatically); b) In the area of testing dataflow (esp. map-reduce) programs, our tool, SEDGE, achieves higher testing coverage by leveraging existinginput data and generalizing them using a symbolic reasoning engine (a powerful SMT solver); c) In the area of bug detection, we identify and present the concept of residual investigation: a dynamic analysis that serves as theruntime agent of a static analysis. Residual investigation identifies with higher certainty whether an error reported by the static analysis is likely true
- …