16,433 research outputs found

    Detection of Early-Stage Enterprise Infection by Mining Large-Scale Log Data

    Get PDF
    Recent years have seen the rise of more sophisticated attacks including advanced persistent threats (APTs) which pose severe risks to organizations and governments by targeting confidential proprietary information. Additionally, new malware strains are appearing at a higher rate than ever before. Since many of these malware are designed to evade existing security products, traditional defenses deployed by most enterprises today, e.g., anti-virus, firewalls, intrusion detection systems, often fail at detecting infections at an early stage. We address the problem of detecting early-stage infection in an enterprise setting by proposing a new framework based on belief propagation inspired from graph theory. Belief propagation can be used either with "seeds" of compromised hosts or malicious domains (provided by the enterprise security operation center -- SOC) or without any seeds. In the latter case we develop a detector of C&C communication particularly tailored to enterprises which can detect a stealthy compromise of only a single host communicating with the C&C server. We demonstrate that our techniques perform well on detecting enterprise infections. We achieve high accuracy with low false detection and false negative rates on two months of anonymized DNS logs released by Los Alamos National Lab (LANL), which include APT infection attacks simulated by LANL domain experts. We also apply our algorithms to 38TB of real-world web proxy logs collected at the border of a large enterprise. Through careful manual investigation in collaboration with the enterprise SOC, we show that our techniques identified hundreds of malicious domains overlooked by state-of-the-art security products

    On the Feasibility of Automated Detection of Allusive Text Reuse

    Full text link
    The detection of allusive text reuse is particularly challenging due to the sparse evidence on which allusive references rely---commonly based on none or very few shared words. Arguably, lexical semantics can be resorted to since uncovering semantic relations between words has the potential to increase the support underlying the allusion and alleviate the lexical sparsity. A further obstacle is the lack of evaluation benchmark corpora, largely due to the highly interpretative character of the annotation process. In the present paper, we aim to elucidate the feasibility of automated allusion detection. We approach the matter from an Information Retrieval perspective in which referencing texts act as queries and referenced texts as relevant documents to be retrieved, and estimate the difficulty of benchmark corpus compilation by a novel inter-annotator agreement study on query segmentation. Furthermore, we investigate to what extent the integration of lexical semantic information derived from distributional models and ontologies can aid retrieving cases of allusive reuse. The results show that (i) despite low agreement scores, using manual queries considerably improves retrieval performance with respect to a windowing approach, and that (ii) retrieval performance can be moderately boosted with distributional semantics

    A Framework for Dynamic Web Services Composition

    Get PDF
    Dynamic composition of web services is a promising approach and at the same time a challenging research area for the dissemination of service-oriented applications. It is widely recognised that service semantics is a key element for the dynamic composition of Web services, since it allows the unambiguous descriptions of a service's capabilities and parameters. This paper introduces a framework for performing dynamic service composition by exploiting the semantic matchmaking between service parameters (i.e., outputs and inputs) to enable their interconnection and interaction. The basic assumption of the framework is that matchmaking enables finding semantic compatibilities among independently defined service descriptions. We also developed a composition algorithm that follows a semantic graph-based approach, in which a graph represents service compositions and the nodes of this graph represent semantic connections between services. Moreover, functional and non-functional properties of services are considered, to enable the computation of relevant and most suitable service compositions for some service request. The suggested end-to-end functional level service composition framework is illustrated with a realistic application scenario from the IST SPICE project

    Ranking relations using analogies in biological and information networks

    Get PDF
    Analogical reasoning depends fundamentally on the ability to learn and generalize about relations between objects. We develop an approach to relational learning which, given a set of pairs of objects S={A(1):B(1),A(2):B(2),ā€¦,A(N):B(N)}\mathbf{S}=\{A^{(1)}:B^{(1)},A^{(2)}:B^{(2)},\ldots,A^{(N)}:B ^{(N)}\}, measures how well other pairs A:B fit in with the set S\mathbf{S}. Our work addresses the following question: is the relation between objects A and B analogous to those relations found in S\mathbf{S}? Such questions are particularly relevant in information retrieval, where an investigator might want to search for analogous pairs of objects that match the query set of interest. There are many ways in which objects can be related, making the task of measuring analogies very challenging. Our approach combines a similarity measure on function spaces with Bayesian analysis to produce a ranking. It requires data containing features of the objects of interest and a link matrix specifying which relationships exist; no further attributes of such relationships are necessary. We illustrate the potential of our method on text analysis and information networks. An application on discovering functional interactions between pairs of proteins is discussed in detail, where we show that our approach can work in practice even if a small set of protein pairs is provided.Comment: Published in at http://dx.doi.org/10.1214/09-AOAS321 the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org
    • ā€¦
    corecore