
    Spam Classification Using Machine Learning Techniques - Sinespam

    Most e-mail readers spend a non-trivial amount of time regularly deleting junk e-mail (spam) messages, even as an expanding volume of such e-mail occupies server storage space and consumes network bandwidth. An ongoing challenge, therefore, lies in the development and refinement of automatic classifiers that can distinguish legitimate e-mail from spam. Some published studies have examined spam detectors using Naïve Bayesian approaches and large feature sets of binary attributes that record the presence of common keywords in spam, and many commercial applications also use Naïve Bayesian techniques. Spammers recognize these attempts to block their messages and have developed tactics to circumvent these filters, but these evasive tactics are themselves patterns that human readers can often identify quickly. This work develops an alternative approach: a neural network (NN) classifier trained on a corpus of e-mail messages from several users. Feature selection is one of the major improvements, because the feature set uses descriptive characteristics of words and messages similar to those a human reader would use to identify spam; the best feature set is chosen by forward feature selection. A further objective was to raise spam-detection accuracy to near 95% using artificial neural networks; to date, reported ANN-based approaches have not exceeded 89% accuracy.
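
    To make the selection procedure concrete, the following is a minimal sketch of greedy forward feature selection wrapped around a small neural-network classifier. The scikit-learn classes, hyperparameters, and the 95% target are illustrative assumptions, not the paper's actual pipeline or corpus.

```python
# Minimal sketch: greedy forward feature selection around an MLP classifier.
# All shapes, hyperparameters, and thresholds are placeholders.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

def forward_select(X, y, target_acc=0.95):
    selected, remaining = [], list(range(X.shape[1]))
    best_acc = 0.0
    while remaining and best_acc < target_acc:
        # Score each candidate feature when added to the current set.
        trials = []
        for f in remaining:
            clf = MLPClassifier(hidden_layer_sizes=(10,), max_iter=500)
            acc = cross_val_score(clf, X[:, selected + [f]], y, cv=5).mean()
            trials.append((acc, f))
        acc, f = max(trials)
        if acc <= best_acc:        # no candidate improves accuracy: stop
            break
        best_acc = acc
        selected.append(f)
        remaining.remove(f)
    return selected, best_acc
```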

    Toward least-privilege isolation for software

    Hackers leverage software vulnerabilities to disclose, tamper with, or destroy sensitive data. To protect sensitive data, programmers can adhere to the principle of least privilege, which entails giving software only the minimal privilege it needs to operate, ensuring that sensitive data is available to software components strictly on a need-to-know basis. Unfortunately, applying this principle in practice is difficult, as current operating systems tend to provide coarse-grained mechanisms for limiting privilege. Thus, most applications today run with greater-than-necessary privileges. We propose sthreads, a set of operating system primitives that allows fine-grained isolation of software to approximate the least-privilege ideal. sthreads enforce a default-deny model, where software components have no privileges by default, so all privileges must be explicitly granted by the programmer. Experience introducing sthreads into previously monolithic applications, thus partitioning them, reveals that enumerating privileges for sthreads is difficult in practice. To ease the introduction of sthreads into existing code, we include Crowbar, a tool that can be used to learn the privileges required by a compartment. We show that only a few changes to existing code are necessary to partition applications with sthreads, and that Crowbar can guide the programmer through these changes. We show that applying sthreads to applications successfully narrows the attack surface by reducing the amount of code that can access sensitive data. Finally, we show that applications using sthreads pay only a small performance overhead. We applied sthreads to a range of applications, most notably an SSL web server, where we show that sthreads are powerful enough to protect sensitive data even against a strong adversary that can act as a man-in-the-middle on the network and also exploit most of the code in the web server; this threat model has not been addressed to date.
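
    To visualize the default-deny model described above, here is a hypothetical sketch; the Compartment class and its grant/require calls are invented for illustration and are not the actual sthreads API.

```python
# Hypothetical sketch of a default-deny compartment: it starts with no
# privileges, and every privilege must be granted explicitly. This is an
# illustrative invention, not the sthreads interface.
class PrivilegeError(Exception):
    pass

class Compartment:
    def __init__(self, name):
        self.name = name
        self.grants = set()            # default-deny: empty privilege set

    def grant(self, privilege):
        self.grants.add(privilege)     # the programmer grants each privilege

    def require(self, privilege):
        if privilege not in self.grants:
            raise PrivilegeError(f"{self.name}: {privilege} not granted")

# e.g., the SSL private key is reachable only from a crypto compartment
crypto = Compartment("crypto")
crypto.grant("read:/etc/ssl/private/server.key")
crypto.require("read:/etc/ssl/private/server.key")    # permitted
worker = Compartment("worker")                        # holds no grants
# worker.require("read:/etc/ssl/private/server.key")  # would raise
```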

    Fast-Flux Bot Detection in Real Time

    The fast-flux service network architecture has been widely adopted by bot herders to increase the productivity and extend the lifespan of botnets' domain names. A fast-flux botnet is unique in that each of its domain names is normally mapped to different sets of IP addresses over time, and legitimate users' requests are handled by machines other than those contacted by users directly. Most existing methods for detecting fast-flux botnets rely on the former property. This approach is effective, but it requires a certain period of time, possibly a few days, before a conclusion can be drawn. In this paper, we propose a novel way to detect whether a web service is hosted by a fast-flux botnet in real time. The scheme is unique because it relies on certain intrinsic and invariant characteristics of fast-flux botnets, namely: 1) the request delegation model; 2) bots are not dedicated to malicious services; and 3) the hardware used by bots is normally inferior to that of dedicated servers. Our empirical evaluation results show that, using a passive measurement approach, the proposed scheme can detect fast-flux bots in a few seconds with more than 96% accuracy, while the false positive and false negative rates are both lower than 5%.
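
    The third characteristic suggests a simple intuition: if server-side handling (including request delegation) takes far longer than a network round trip, the host behaves more like a fast-flux bot than a dedicated server. The sketch below illustrates only that intuition; the active probe, the threshold, and the measurement method are placeholders, not the paper's passive detection model.

```python
# Rough sketch: compare TCP connect time (approximately one network RTT)
# with time to first response byte (RTT plus processing/delegation delay).
# The 5x threshold is an arbitrary placeholder.
import socket, time

def probe(host, port=80, path="/"):
    t0 = time.time()
    s = socket.create_connection((host, port), timeout=5)
    connect_delay = time.time() - t0            # approximates one RTT
    s.sendall(f"HEAD {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode())
    t1 = time.time()
    s.recv(1)                                   # block until the first byte
    response_delay = time.time() - t1
    s.close()
    return connect_delay, response_delay

host = "example.com"
net, resp = probe(host)
if resp > 5 * net:                              # placeholder heuristic
    print(f"{host}: high processing delay, possible request delegation")
```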

    Exploiting Flow Relationships to Improve the Performance of Distributed Applications

    Application performance continues to be an issue even with increased Internet bandwidth. There are many reasons for poor application performance, including unpredictable network conditions, long round-trip times, inadequate transmission mechanisms, or less-than-optimal application designs. In this work, we propose to exploit flow relationships as a general means to improve Internet application performance. We define a relationship to exist between two flows if the flows exhibit temporal proximity within the same scope, where a scope may either be between two hosts or between two clusters of hosts. Temporal proximity can be either parallel or near-term sequential. As part of this work, we first observe that flow relationships are plentiful and can be exploited to improve application performance. Second, we establish a framework of possible techniques for exploiting flow relationships. In this framework, we group the improvements these techniques can bring into several types and use a taxonomy to break Internet applications into categories based on their traffic characteristics and performance concerns. This approach allows us to investigate how a technique helps a group of applications rather than a particular one. Finally, we investigate several specific techniques under the framework and use them to illustrate how flow relationships can be exploited to achieve a variety of improvements. We propose and evaluate a list of techniques including piggybacking related domain names, data piggybacking, enhanced TCP ACKs, packet aggregation, and critical packet piggybacking. We use them as examples to show how particular flow relationships can be used to improve applications in different ways, such as reducing round trips, providing better-quality information, reducing the total number of packets, and avoiding timeouts. Results show that the technique of piggybacking related domain names can significantly reduce local cache misses and eliminate a corresponding number of domain name messages. The data piggybacking technique can provide packet-efficient throughput in the reverse direction of a TCP connection without sacrificing forward throughput. The enhanced ACK approach provides more detailed and complete information about the state of the forward direction, which a TCP implementation could use to obtain better throughput under different network conditions. Results for packet aggregation show only marginal packet savings given current traffic patterns. Finally, results for critical packet piggybacking demonstrate significant potential in using related flows to send duplicate copies that protect performance-critical packets from loss.
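
    As an illustration of the first technique, the sketch below records temporal proximity between lookups in the same scope and derives candidates for piggybacking related domain names; the window size and data structures are assumptions for illustration, not the mechanisms evaluated in this work.

```python
# Minimal sketch: two lookups are related if they occur within a short
# window in the same scope (here, per client host). Related names become
# candidates to piggyback on a response, saving later round trips.
import time
from collections import defaultdict

WINDOW = 2.0                                   # seconds of temporal proximity
last_lookup = {}                               # scope -> (name, timestamp)
related = defaultdict(set)                     # name -> names seen nearby

def record_lookup(scope, name):
    now = time.time()
    prev = last_lookup.get(scope)
    if prev and now - prev[1] <= WINDOW:
        related[prev[0]].add(name)             # candidate for piggybacking
    last_lookup[scope] = (name, now)

def piggyback_candidates(name):
    # Names a resolver might return alongside `name` to avoid extra lookups.
    return sorted(related[name])

record_lookup("client-A", "www.example.com")
record_lookup("client-A", "img.example.com")   # within WINDOW: related
print(piggyback_candidates("www.example.com"))
```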

    Collaborative, Trust-Based Security Mechanisms for a National Utility Intranet

    This thesis investigates security mechanisms for utility control and protection networks using IP-based protocol interaction. It proposes flexible, cost-effective solutions in strategic locations to protect transitioning legacy and full IP-standards architectures. It also demonstrates how operational signatures can be defined to enact organizationally unique standard operating procedures for zero failure in environments with varying levels of uncertainty and trust. The research evaluates layering encryption, authentication, traffic filtering, content checks, and event correlation mechanisms over time-critical primary and backup control/protection signaling to prevent disruption by internal and external malicious activity or errors. Finally, it shows how a regional/national implementation can protect private communities of interest and foster a mix of both centralized and distributed emergency prediction, mitigation, detection, and response, with secure, automatic peer-to-peer notifications that share situational awareness across control, transmission, and reliability boundaries and prevent widespread, catastrophic power outages.
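
    To give a flavor of what layered checks on a control message might look like, the following is a hedged sketch; the shared key, the allowed command set, and the message format are invented for illustration and do not reflect the thesis's actual mechanisms.

```python
# Illustrative sketch: two layered checks on a control message, in the
# spirit of layering authentication and content checks over signaling.
# Key material, commands, and format are placeholders.
import hmac, hashlib

SHARED_KEY = b"per-link-key"                    # placeholder key material
ALLOWED_COMMANDS = {"OPEN_BREAKER", "CLOSE_BREAKER", "READ_STATUS"}

def verify_layers(command: str, tag: bytes) -> bool:
    # Layer 1: authentication -- reject messages without a valid HMAC.
    expected = hmac.new(SHARED_KEY, command.encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return False
    # Layer 2: content check -- enforce the site's operating procedure.
    return command in ALLOWED_COMMANDS

tag = hmac.new(SHARED_KEY, b"OPEN_BREAKER", hashlib.sha256).digest()
print(verify_layers("OPEN_BREAKER", tag))       # True
print(verify_layers("FIRMWARE_WRITE", tag))     # False: fails both layers
```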

    Automated IT Service Fault Diagnosis Based on Event Correlation Techniques

    In recent years, a paradigm shift in the area of IT service management has taken place. IT management no longer deals only with networks, end systems, or applications; it is increasingly concerned with IT services. This is driven by organizations' need to monitor the efficiency of internal IT departments and by the option to subscribe to IT services from external providers. This trend has raised new challenges in the area of IT service management, especially with respect to service level agreements laying down the quality of service to be guaranteed by a service provider. Fault management also faces new challenges related to ensuring compliance with these service level agreements. For example, high utilization of network links in the infrastructure can increase service delivery delays beyond agreed time constraints. Such relationships have to be detected and treated in a service-oriented fault diagnosis, which therefore does not deal with faults in a narrow sense but with service quality degradations. This thesis aims at providing a concept for service fault diagnosis, an important part of IT service fault management. First, the need for further work on this issue is motivated through an analysis of services offered by a large IT service provider. A generalization of the scenario forms the basis for the specification of requirements, which are used for a review of related research work and commercial products. Even though some solutions for particular challenges have already been provided, a general approach to service fault diagnosis is still missing. To address this issue, a framework is presented in the main part of this thesis with an event correlation component at its center. Event correlation techniques that have been successfully applied to fault management in network and systems management are adapted and extended accordingly. Guidelines for applying the framework to a given scenario are then provided. To show their feasibility in a real-world scenario, they are applied to both example services referenced earlier.
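
    A minimal sketch of rule-based event correlation in this spirit appears below; the rules, event names, and time window are illustrative assumptions, not the framework's actual correlation component.

```python
# Illustrative sketch: map a resource event and a service symptom that
# occur within a time window to a diagnosis of a quality degradation.
from datetime import datetime, timedelta

RULES = [
    # (resource event, service symptom) -> diagnosis
    (("link_utilization_high", "service_delay_increase"),
     "delay caused by congested infrastructure link"),
]

def correlate(events, window=timedelta(minutes=5)):
    diagnoses = []
    for (cause, symptom), verdict in RULES:
        cause_times = [t for name, t in events if name == cause]
        symptom_times = [t for name, t in events if name == symptom]
        if any(abs(ts - tc) <= window
               for tc in cause_times for ts in symptom_times):
            diagnoses.append(verdict)
    return diagnoses

now = datetime.now()
events = [("link_utilization_high", now),
          ("service_delay_increase", now + timedelta(minutes=2))]
print(correlate(events))
```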

    A decentralized framework for cross administrative domain data sharing

    Federation of messaging and storage platforms located in remote datacenters is an essential functionality for sharing data among geographically distributed platforms. When systems are administered by the same owner, data replication reduces data access latency by bringing data closer to applications and enables fault tolerance for disaster recovery of an entire location. When storage platforms are administered by different owners, data replication across administrative domains is essential for enterprise application data integration. Contents and services managed by different software platforms need to be integrated to provide richer contents and services. Clients may need to share subsets of data in order to enable collaborative analysis and service integration. Platforms usually include proprietary federation functionalities and specific APIs to let external software and platforms access their internal data. These techniques may not be applicable to all environments and networks due to security and technological restrictions. Moreover, the federation of dispersed nodes under a decentralized administration scheme is still a research issue. This thesis is a contribution along this research direction: it introduces and describes a framework, called “WideGroups”, aimed at the creation and management of an automatic federation and integration of widely dispersed platform nodes. It is based on groups that exchange messages among distributed applications located in different remote datacenters. Groups are created and managed using client-side programmatic configuration without touching servers. WideGroups enables the extension of software platform services to nodes belonging to different administrative domains in a wide area network environment. It lets nodes form ad-hoc overlay networks on the fly depending on message destinations located in distinct administrative domains. It supports multiple dynamic overlay networks based on message groups, dynamic discovery of nodes, and automatic setup of overlay networks among nodes with no server-side configuration. I designed and implemented platform connectors to integrate the framework as the federation module of Message Oriented Middleware and Key Value Store platforms, which are among the most widespread paradigms supporting data sharing in distributed systems.
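
    The group abstraction can be pictured with a toy sketch like the one below: clients join named groups through client-side calls alone, and messages fan out to all members. The classes are invented for illustration and omit networking, overlay setup, and cross-domain concerns entirely.

```python
# Toy sketch of group-based message fan-out with no server-side
# configuration. In WideGroups, publish() would cross administrative
# domains over an ad-hoc overlay; here it is an in-process loop.
from collections import defaultdict

class GroupFabric:
    def __init__(self):
        self.groups = defaultdict(set)         # group name -> member callbacks

    def join(self, group, member, on_message):
        self.groups[group].add((member, on_message))

    def publish(self, group, payload):
        for member, callback in self.groups[group]:
            callback(member, payload)

fabric = GroupFabric()
fabric.join("orders", "datacenter-eu", lambda m, p: print(m, "got", p))
fabric.join("orders", "datacenter-us", lambda m, p: print(m, "got", p))
fabric.publish("orders", {"id": 42})
```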