146 research outputs found

    Fast incremental learning of stochastic context-free grammars in radar electronic support

    Get PDF
    Radar Electronic Support (ES) involves the passive search for, interception, location, analysis and identification of radiated electromagnetic energy for military purposes. Although Stochastic Context-Free Grammars (SCFGs) appear promising for recognition of radar emitters, and for estimation of their respective level of threat in radar ES systems, the computations associated with well-known techniques for learning their production rule probabilities are very computationally demanding. The most popular methods for this task are the Inside-Outside (IO) algorithm, which maximizes the likelihood of a data set,and the Viterbi Score (VS) algorithm, which maximizes the likelihood of its best parse trees. For each iteration, their time complexity is cubic with the length of sequences in the training set and with the number of non-terminal symbols in the grammar. Since applications of radar ES require timely protection against threats, fast techniques for learning SCFGs probabilities are needed. Moreover, in radar ES applications, new information from a battlefield or other sources often becomes available at different points in time. In order to rapidly refiect changes in operational environments, fast incremental learning of SCFG probabilities is therefore an undisputed asset. Several techniques have been developed to accelerate the computation of production rules probabilities of SCFGs. In the first part of this thesis, three fast alternatives, called graphical EM (gEM), Tree Scanning (TS) and HOLA, are compared from several perspectives - perplexity, state estimation, ability to detect MFRs, time and memory complexity, and convergence time. Estimation of the average-case and worst-case execution time and storage requirements allows for the assessment of complexity, while computer simulations, performed using radar pulse data, facilitates the assessment of the other performance measures. An experimental protocol has been defined such that the impact on performance of factors like training set size and level of ambiguity of grammars may be observed. In addition, since VS is known to have a lower overall computational cost in practice, VS versions of the original 10-based gEM and TS have also been proposed and compared. Results indicate that both gEM(IO) and TS(IO) provide the same level of accuracy, yet the resource requirements mostly vary as a function of the ambiguity of the grammars. Furthermore, for a similar quality in results, the gEM(VS) and TS(VS) techniques provide significantly lower convergence times and time complexities per iteration in practice than do gEM(IO) and TS(IO). All of these algorithms may provide a greater level of accuracy than HOLA, yet their computational complexity may be orders of magnitude higher. Finally, HOLA is an on-line technique that naturally allows for incremental learning of production rule probabilities. In the second part of this thesis, two new incremental versions of gEM, called Incremental gEM (igEM) and On-Line Incremental gEM (oigEM), are proposed and compared to HOLA. They allow for a SCFG to efficiently learn new training sequences incrementally, without retraining from the start an all training data. An experimental protocol has been defined such that the impact on performance of factors like the size of new data blocks for incremental learning, and the level of ambiguity of MFR grammars, may be observed. Results indicate that, unlike HOLA, incremental learning of training data blocks with igEM and oigEM provides the same level of accuracy as learning from all cumulative data from scratch, even for relatively small data blocks. As expected, incremental leaming significantly reduces the overall time and memory complexities associated with updating SCFG probabilities. Finally, it appears that while the computational complexity and memory requirements of igEM and oigEM may be greater than that of HOLA, they both provide the higher level of accuracy

    Activity Analysis; Finding Explanations for Sets of Events

    Get PDF
    Automatic activity recognition is the computational process of analysing visual input and reasoning about detections to understand the performed events. In all but the simplest scenarios, an activity involves multiple interleaved events, some related and others independent. The activity in a car park or at a playground would typically include many events. This research assumes the possible events and any constraints between the events can be defined for the given scene. Analysing the activity should thus recognise a complete and consistent set of events; this is referred to as a global explanation of the activity. By seeking a global explanation that satisfies the activity’s constraints, infeasible interpretations can be avoided, and ambiguous observations may be resolved. An activity’s events and any natural constraints are defined using a grammar formalism. Attribute Multiset Grammars (AMG) are chosen because they allow defining hierarchies, as well as attribute rules and constraints. When used for recognition, detectors are employed to gather a set of detections. Parsing the set of detections by the AMG provides a global explanation. To find the best parse tree given a set of detections, a Bayesian network models the probability distribution over the space of possible parse trees. Heuristic and exhaustive search techniques are proposed to find the maximum a posteriori global explanation. The framework is tested for two activities: the activity in a bicycle rack, and around a building entrance. The first case study involves people locking bicycles onto a bicycle rack and picking them up later. The best global explanation for all detections gathered during the day resolves local ambiguities from occlusion or clutter. Intensive testing on 5 full days proved global analysis achieves higher recognition rates. The second case study tracks people and any objects they are carrying as they enter and exit a building entrance. A complete sequence of the person entering and exiting multiple times is recovered by the global explanation

    Proceedings of the international conference on cooperative multimodal communication CMC/95, Eindhoven, May 24-26, 1995:proceedings

    Get PDF

    Designing Statistical Language Learners: Experiments on Noun Compounds

    Full text link
    The goal of this thesis is to advance the exploration of the statistical language learning design space. In pursuit of that goal, the thesis makes two main theoretical contributions: (i) it identifies a new class of designs by specifying an architecture for natural language analysis in which probabilities are given to semantic forms rather than to more superficial linguistic elements; and (ii) it explores the development of a mathematical theory to predict the expected accuracy of statistical language learning systems in terms of the volume of data used to train them. The theoretical work is illustrated by applying statistical language learning designs to the analysis of noun compounds. Both syntactic and semantic analysis of noun compounds are attempted using the proposed architecture. Empirical comparisons demonstrate that the proposed syntactic model is significantly better than those previously suggested, approaching the performance of human judges on the same task, and that the proposed semantic model, the first statistical approach to this problem, exhibits significantly better accuracy than the baseline strategy. These results suggest that the new class of designs identified is a promising one. The experiments also serve to highlight the need for a widely applicable theory of data requirements.Comment: PhD thesis (Macquarie University, Sydney; December 1995), LaTeX source, xii+214 page

    Proceedings of the 2004 ONR Decision-Support Workshop Series: Interoperability

    Get PDF
    In August of 1998 the Collaborative Agent Design Research Center (CADRC) of the California Polytechnic State University in San Luis Obispo (Cal Poly), approached Dr. Phillip Abraham of the Office of Naval Research (ONR) with the proposal for an annual workshop focusing on emerging concepts in decision-support systems for military applications. The proposal was considered timely by the ONR Logistics Program Office for at least two reasons. First, rapid advances in information systems technology over the past decade had produced distributed collaborative computer-assistance capabilities with profound potential for providing meaningful support to military decision makers. Indeed, some systems based on these new capabilities such as the Integrated Marine Multi-Agent Command and Control System (IMMACCS) and the Integrated Computerized Deployment System (ICODES) had already reached the field-testing and final product stages, respectively. Second, over the past two decades the US Navy and Marine Corps had been increasingly challenged by missions demanding the rapid deployment of forces into hostile or devastate dterritories with minimum or non-existent indigenous support capabilities. Under these conditions Marine Corps forces had to rely mostly, if not entirely, on sea-based support and sustainment operations. Particularly today, operational strategies such as Operational Maneuver From The Sea (OMFTS) and Sea To Objective Maneuver (STOM) are very much in need of intelligent, near real-time and adaptive decision-support tools to assist military commanders and their staff under conditions of rapid change and overwhelming data loads. In the light of these developments the Logistics Program Office of ONR considered it timely to provide an annual forum for the interchange of ideas, needs and concepts that would address the decision-support requirements and opportunities in combined Navy and Marine Corps sea-based warfare and humanitarian relief operations. The first ONR Workshop was held April 20-22, 1999 at the Embassy Suites Hotel in San Luis Obispo, California. It focused on advances in technology with particular emphasis on an emerging family of powerful computer-based tools, and concluded that the most able members of this family of tools appear to be computer-based agents that are capable of communicating within a virtual environment of the real world. From 2001 onward the venue of the Workshop moved from the West Coast to Washington, and in 2003 the sponsorship was taken over by ONR’s Littoral Combat/Power Projection (FNC) Program Office (Program Manager: Mr. Barry Blumenthal). Themes and keynote speakers of past Workshops have included: 1999: ‘Collaborative Decision Making Tools’ Vadm Jerry Tuttle (USN Ret.); LtGen Paul Van Riper (USMC Ret.);Radm Leland Kollmorgen (USN Ret.); and, Dr. Gary Klein (KleinAssociates) 2000: ‘The Human-Computer Partnership in Decision-Support’ Dr. Ronald DeMarco (Associate Technical Director, ONR); Radm CharlesMunns; Col Robert Schmidle; and, Col Ray Cole (USMC Ret.) 2001: ‘Continuing the Revolution in Military Affairs’ Mr. Andrew Marshall (Director, Office of Net Assessment, OSD); and,Radm Jay M. Cohen (Chief of Naval Research, ONR) 2002: ‘Transformation ... ’ Vadm Jerry Tuttle (USN Ret.); and, Steve Cooper (CIO, Office ofHomeland Security) 2003: ‘Developing the New Infostructure’ Richard P. Lee (Assistant Deputy Under Secretary, OSD); and, MichaelO’Neil (Boeing) 2004: ‘Interoperability’ MajGen Bradley M. Lott (USMC), Deputy Commanding General, Marine Corps Combat Development Command; Donald Diggs, Director, C2 Policy, OASD (NII

    Advances in Computer Science and Engineering

    Get PDF
    The book Advances in Computer Science and Engineering constitutes the revised selection of 23 chapters written by scientists and researchers from all over the world. The chapters cover topics in the scientific fields of Applied Computing Techniques, Innovations in Mechanical Engineering, Electrical Engineering and Applications and Advances in Applied Modeling

    Acta Cybernetica : Volume 21. Number 3.

    Get PDF

    A pattern-driven corpus to predictive analytics in mitigating SQL injection attack

    Get PDF
    The back-end database provides accessible and structured storage for each web application’s big data internet web traffic exchanges stemming from cloud-hosted web applications to the Internet of Things (IoT) smart devices in emerging computing. Structured Query Language Injection Attack (SQLIA) remains an intruder’s exploit of choice to steal confidential information from the database of vulnerable front-end web applications with potentially damaging security ramifications.Existing solutions to SQLIA still follows the on-premise web applications server hosting concept which were primarily developed before the recent challenges of the big data mining and as such lack the functionality and ability to cope with new attack signatures concealed in a large volume of web requests. Also, most organisations’ databases and services infrastructure no longer reside on-premise as internet cloud-hosted applications and services are increasingly used which limit existing Structured Query Language Injection (SQLI) detection and prevention approaches that rely on source code scanning. A bio-inspired approach such as Machine Learning (ML) predictive analytics provides functional and scalable mining for big data in the detection and prevention of SQLI in intercepting large volumes of web requests. Unfortunately, lack of availability of robust ready-made data set with patterns and historical data items to train a classifier are issues well known in SQLIA research applying ML in the field of Artificial Intelligence (AI). The purpose-built competition-driven test case data sets are antiquated and not pattern-driven to train a classifier for real-world application. Also, the web application types are so diverse to have an all-purpose generic data set for ML SQLIA mitigation.This thesis addresses the lack of pattern-driven data set by deriving one to predict SQLIA of any size and proposing a technique to obtain a data set on the fly and break the circle of relying on few outdated competitions-driven data sets which exist are not meant to benchmark real-world SQLIA mitigation. The thesis in its contributions derived pattern-driven data set of related member strings that are used in training a supervised learning model with validation through Receiver Operating Characteristic (ROC) curve and Confusion Matrix (CM) with results of low false positives and negatives. We further the evaluations with cross-validation to have obtained a low variance in accuracy that indicates of a successful trained model using the derived pattern-driven data set capable of generalisation of unknown data in the real-world with reduced biases. Also, we demonstrated a proof of concept with a test application by implementing an ML Predictive Analytics to SQLIA detection and prevention using this pattern-driven data set in a test web application. We observed in the experiments carried out in the course of this thesis, a data set of related member strings can be generated from a web expected input data and SQL tokens, including known SQLI signatures. The data set extraction ontology proposed in this thesis for applied ML in SQLIA mitigation in the context of emerging computing of big data internet, and cloud-hosted services set our proposal apart from existing approaches that were mostly on-premise source code scanning and queries structure comparisons of some sort
    • …
    corecore