475 research outputs found
Machine Learning in Wireless Sensor Networks: Algorithms, Strategies, and Applications
Wireless sensor networks monitor dynamic environments that change rapidly
over time. This dynamic behavior is either caused by external factors or
initiated by the system designers themselves. To adapt to such conditions,
sensor networks often adopt machine learning techniques to eliminate the need
for unnecessary redesign. Machine learning also inspires many practical
solutions that maximize resource utilization and prolong the lifespan of the
network. In this paper, we present an extensive literature review over the
period 2002-2013 of machine learning methods that were used to address common
issues in wireless sensor networks (WSNs). The advantages and disadvantages of
each proposed algorithm are evaluated against the corresponding problem. We
also provide a comparative guide to aid WSN designers in developing suitable
machine learning solutions for their specific application challenges.Comment: Accepted for publication in IEEE Communications Surveys and Tutorial
Statistical anomaly denial of service and reconnaissance intrusion detection
This dissertation presents the architecture, methods and results of the Hierarchical Intrusion Detection Engine (HIDE) and the Reconnaissance Intrusion Detection System (RIDS); the former is denial-of-service (DoS) attack detector while the latter is a scan and probe (P&S) reconnaissance detector; both are statistical anomaly systems.
The HIDE is a packet-oriented, observation-window using, hierarchical, multi-tier, anomaly based network intrusion detection system, which monitors several network traffic parameters simultaneously, constructs a 64-bin probability density function (PDF) for each, statistically compares it to a reference PDF of normal behavior using a similarity metric, then combines the results into an anomaly status vector that is classified by a neural network classifier. Three different data sets have been utilized to test the performance of HIDE; they are OPNET simulation data, DARPA\u2798 intrusion detection evaluation data and the CONEX TESTBED attack data. The results showed that HIDE can reliably detect DoS attacks with high accuracy and very low false alarm rates on all data sets. In particular, the investigation using the DARPA\u2798 data set yielded an overall total misclassification rate of 0.13%, false negative rate of 1.42%, and false positive rate of 0.090%; the latter implies a rate of only about 2.6 false alarms per day.
The RIDS is a session oriented, statistical tool, that relies on training to model the parameters of its algorithms, capable of detecting even distributed stealthy reconnaissance attacks. It consists of two main functional modules or stages: the Reconnaissance Activity Profiler (RAP) and the Reconnaissance Alert Correlater (RAC). The RAP is a session-oriented module capable of detecting stealthy scanning and probing attacks, while the RAG is an alert-correlation module that fuses the RAP alerts into attack scenarios and discovers the distributed stealthy attack scenarios. RIDS has been evaluated against two data sets: (a) the DARPA\u2798 data, and (b) 3 weeks of experimental data generated using the CONEX TESTBED network. The RIDS has demonstrably achieved remarkable success; the false positive, false negative and misclassification rates found are low, less than 0.1%, for most reconnaissance attacks; they rise to about 6% for distributed highly stealthy attacks; the latter is a most challenging type of attack, which has been difficult to detect effectively until now
Artificial Intelligence Research Branch future plans
This report contains information on the activities of the Artificial Intelligence Research Branch (FIA) at NASA Ames Research Center (ARC) in 1992, as well as planned work in 1993. These activities span a range from basic scientific research through engineering development to fielded NASA applications, particularly those applications that are enabled by basic research carried out in FIA. Work is conducted in-house and through collaborative partners in academia and industry. All of our work has research themes with a dual commitment to technical excellence and applicability to NASA short, medium, and long-term problems. FIA acts as the Agency's lead organization for research aspects of artificial intelligence, working closely with a second research laboratory at the Jet Propulsion Laboratory (JPL) and AI applications groups throughout all NASA centers. This report is organized along three major research themes: (1) Planning and Scheduling: deciding on a sequence of actions to achieve a set of complex goals and determining when to execute those actions and how to allocate resources to carry them out; (2) Machine Learning: techniques for forming theories about natural and man-made phenomena; and for improving the problem-solving performance of computational systems over time; and (3) Research on the acquisition, representation, and utilization of knowledge in support of diagnosis design of engineered systems and analysis of actual systems
Recommended from our members
MapReduce based RDF assisted distributed SVM for high throughput spam filtering
This thesis was submitted for the degree of Doctor of Philosophy and was awarded by Brunel UniversityElectronic mail has become cast and embedded in our everyday lives. Billions of legitimate emails are sent on a daily basis. The widely established underlying infrastructure, its widespread availability as well as its ease of use have all acted as catalysts to such pervasive proliferation. Unfortunately, the same can be alleged about unsolicited bulk email, or rather spam. Various methods, as well as enabling architectures are available to try to mitigate spam permeation. In this respect, this dissertation compliments existing survey work in this area by contributing an extensive literature review of traditional and emerging spam filtering approaches. Techniques, approaches and architectures employed for spam filtering are appraised, critically assessing respective strengths and weaknesses.
Velocity, volume and variety are key characteristics of the spam challenge. MapReduce (M/R) has become increasingly popular as an Internet scale, data intensive processing platform. In the context of machine learning based spam filter training, support vector machine (SVM) based techniques have been proven effective. SVM training is however a computationally intensive process. In this dissertation, a M/R based distributed SVM algorithm for scalable spam filter training, designated MRSMO, is presented. By distributing and processing subsets of the training data across multiple participating computing nodes, the distributed SVM reduces spam filter training time significantly. To mitigate the accuracy degradation introduced by the adopted approach, a Resource Description Framework (RDF) based feedback loop is evaluated. Experimental results demonstrate that this improves the accuracy levels of the distributed SVM beyond the original sequential counterpart.
Effectively exploiting large scale, ‘Cloud’ based, heterogeneous processing capabilities for M/R in what can be considered a non-deterministic environment requires the consideration of a number of perspectives. In this work, gSched, a Hadoop M/R based, heterogeneous aware task to node matching and allocation scheme is designed. Using MRSMO as a baseline, experimental evaluation indicates that gSched improves on the performance of the out-of-the box Hadoop counterpart in a typical Cloud based infrastructure.
The focal contribution to knowledge is a scalable, heterogeneous infrastructure and machine learning based spam filtering scheme, able to capitalize on collaborative accuracy improvements through RDF based, end user feedback. MapReduce based RDF Assisted Distributed SVM for High Throughput Spam Filterin
Recommended from our members
High performance latent dirichlet allocation for text mining
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.Latent Dirichlet Allocation (LDA), a total probability generative model, is a three-tier Bayesian model. LDA computes the latent topic structure of the data and obtains the significant information of documents. However, traditional LDA has several limitations in practical applications. LDA cannot be directly used in classification because it is a non-supervised learning model. It needs to be embedded into appropriate classification algorithms. LDA is a generative model as it normally generates the latent topics in the categories where the target documents do not belong to, producing the deviation in computation and reducing the classification accuracy. The number of topics in LDA influences the learning process of model parameters greatly. Noise samples in the training data also affect the final text classification result. And, the quality of LDA based classifiers depends on the quality of the training samples to a great extent. Although parallel LDA algorithms are proposed to deal with huge amounts of data, balancing computing loads in a computer cluster poses another challenge. This thesis presents a text classification method which combines the LDA model and Support Vector Machine (SVM) classification algorithm for an improved accuracy in classification when reducing the dimension of datasets. Based on Density-Based Spatial Clustering of Applications with Noise (DBSCAN), the algorithm automatically optimizes the number of topics to be selected which reduces the number of iterations in computation. Furthermore, this thesis presents a noise data reduction scheme to process noise data. When the noise ratio is large in the training data set, the noise reduction scheme can always produce a high level of accuracy in classification. Finally, the thesis parallelizes LDA using the MapReduce model which is the de facto computing standard in supporting data intensive applications. A genetic algorithm based load balancing algorithm is designed to balance the workloads among computers in a heterogeneous MapReduce cluster where the computers have a variety of computing resources in terms of CPU speed, memory space and hard disk space
Multi-signal Anomaly Detection for Real-Time Embedded Systems
This thesis presents MuSADET, an anomaly detection framework targeting timing anomalies found in event traces from real-time embedded systems. The method leverages stationary event generators, signal processing, and distance metrics to classify inter-arrival time sequences as normal/anomalous. Experimental evaluation of traces collected from two real-time embedded systems provides empirical evidence of MuSADET’s anomaly detection performance.
MuSADET is appropriate for embedded systems, where many event generators are intrinsically recurrent and generate stationary sequences of timestamp. To find timinganomalies, MuSADET compares the frequency domain features of an unknown trace to a normal model trained from well-behaved executions of the system. Each signal in the analysis trace receives a normal/anomalous score, which can help engineers isolate the source of the anomaly.
Empirical evidence of anomaly detection performed on traces collected from an industrygrade hexacopter and the Controller Area Network (CAN) bus deployed in a real vehicle demonstrates the feasibility of the proposed method. In all case studies, anomaly detection did not require an anomaly model while achieving high detection rates. For some of the studied scenarios, the true positive detection rate goes above 99 %, with false-positive rates below one %. The visualization of classification scores shows that some timing anomalies can propagate to multiple signals within the system. Comparison to the similar method, Signal Processing for Trace Analysis (SiPTA), indicates that MuSADET is superior in detection performance and provides complementary information that can help link anomalies to the process where they occurred
The 1995 Goddard Conference on Space Applications of Artificial Intelligence and Emerging Information Technologies
This publication comprises the papers presented at the 1995 Goddard Conference on Space Applications of Artificial Intelligence and Emerging Information Technologies held at the NASA/Goddard Space Flight Center, Greenbelt, Maryland, on May 9-11, 1995. The purpose of this annual conference is to provide a forum in which current research and development directed at space applications of artificial intelligence can be presented and discussed
- …