
    Graphs in machine learning: an introduction

    Graphs are commonly used to characterise interactions between objects of interest. Because they are based on a straightforward formalism, they are used in many scientific fields, from computer science to the historical sciences. In this paper, we give an introduction to some methods that rely on graphs for learning, covering both unsupervised and supervised methods. Unsupervised learning algorithms usually aim at visualising graphs in latent spaces and/or clustering the nodes; both focus on extracting knowledge from graph topologies. While most existing techniques are only applicable to static graphs, where edges do not evolve through time, recent developments have shown that they can be extended to deal with evolving networks. In a supervised context, one generally aims at inferring labels or numerical values attached to nodes using both the graph and, when available, node characteristics. Balancing the two sources of information can be challenging, especially as they can disagree locally or globally. In both contexts, supervised and unsupervised, data can be relational (augmented with one or several global graphs) as described above, or graph-valued. In the latter case, each object of interest is given as a full graph (possibly completed by other characteristics), and natural tasks include graph clustering (producing clusters of graphs rather than clusters of nodes in a single graph), graph classification, etc.

    1 Real networks

    One of the first practical studies on graphs can be dated back to the original work of Moreno [51] in the 1930s. Since then, there has been growing interest in graph analysis, together with strong developments in the modelling and processing of these data. Graphs are now used in many scientific fields.
In Biology [54, 2, 7], for instance, metabolic networks can describe pathways of biochemical reactions [41], while in the social sciences networks are used to represent relational ties between actors [66, 56, 36, 34]. Other examples include power grids [71] and the web [75]. Recently, networks have also been considered in other areas such as geography [22] and history [59, 39]. In machine learning, networks are seen as powerful tools to model problems in order to extract information from data and for prediction purposes. This is the focus of this paper. For more complete surveys, we refer to [28, 62, 49, 45]. In this section, we introduce notations and highlight properties shared by most real networks. In Section 2, we then consider methods aiming at extracting information from a single network, focusing in particular on clustering methods where the goal is to find clusters of vertices. Finally, in Section 3, techniques that take a series of networks into account, where each network i
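The node-clustering task this survey describes can be sketched with a simple label-propagation pass over an adjacency list. This is an illustrative baseline only, not one of the survey's methods; the graph, iteration order, and tie-break rule below are invented for the example.

```python
# Minimal label propagation on an undirected graph given as an adjacency
# list: each node starts in its own cluster and repeatedly adopts the most
# common label among its neighbours until no label changes.
from collections import Counter

def label_propagation(adj, max_iter=100):
    labels = {v: v for v in adj}
    for _ in range(max_iter):
        changed = False
        for v in sorted(adj):  # fixed order keeps the sketch deterministic
            if not adj[v]:
                continue
            counts = Counter(labels[u] for u in adj[v])
            top = max(counts.values())
            # arbitrary but fixed tie-break: largest label among the winners
            best = max(l for l, c in counts.items() if c == top)
            if labels[v] != best:
                labels[v] = best
                changed = True
        if not changed:
            break
    return labels

# Two triangles joined by a single edge: the pass recovers the two groups.
adj = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
print(label_propagation(adj))
```

Real methods handle ties stochastically and scale to evolving graphs; this sketch only shows the core idea of extracting clusters from topology alone.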

    Applying the Dynamic Region Connection Calculus to Exploit Geographic Knowledge in Maritime Surveillance

    Proceedings of: 15th International Conference on Information Fusion (FUSION 2012), Singapore, 9-12 July 2012. Concerns about the protection of the global transport network have raised the need for new security and surveillance systems. Ontology-based and fusion systems represent an attractive alternative for practical applications focused on fast and accurate responses. This paper presents an architecture based on a geometric model to efficiently predict and calculate the topological relationships between spatial objects. This model aims to reduce the number of calculations by relying on a spatial data structure. The goal is the detection of threatening behaviors near points of interest without a noticeable loss of efficiency. The architecture has been embedded in an ontology-based prototype compliant with the Joint Directors of Laboratories (JDL) model for Information Fusion. The prototype's capabilities are illustrated by applying international protection rules in maritime scenarios. This work was supported in part by Projects CICYT TIN2011-28620-C02-01, CICYT TEC2011-28626-C02-02, CAM CONTEXTS (S2009/TIC-1485) and DPS2008-07029-C02-02.

    Data and knowledge management in field studies: A case for semantic technologies

    Ship design is a knowledge-intensive industry. To design safe ship systems for demanding operations, there is an increasing need for comprehensive knowledge of the operational context. Field studies are an important source of relevant knowledge, but current methods and information systems do not realise their full potential. In this paper, we discuss how field data can be modelled semantically, integrated with relevant domain models, and made more effectively available to the organisation. We propose a data model and a software architecture to facilitate the collaborative data analysis and modelling process favoured by designers.

    Context-based Information Fusion: A survey and discussion

    This survey aims to provide a comprehensive status of recent and current research on context-based Information Fusion (IF) systems, tracing back the roots of the original thinking behind the development of the concept of “context”. It shows how the fortune of this concept in the distributed computing world eventually permeated the world of IF, discussing current strategies and techniques and hinting at possible future trends. IF processes can represent context at different levels (structural and physical constraints of the scenario, a priori known operational rules between entities and environment, dynamic relationships modelled to interpret the system output, etc.). In addition to the survey, several novel context exploitation dynamics and architectural aspects peculiar to the fusion domain are presented and discussed.

    Classification of Marine Vessels in a Littoral Environment Using a Novel Training Database

    Research into object classification has led to the creation of hundreds of databases for use as training sets in object classification algorithms. Datasets made up of thousands of cars, people, boats, faces and everyday objects exist for general classification techniques. However, no commercially available database exists for the detailed classification and categorization of marine vessels commonly found in littoral environments. This research seeks to fill this void and is the combination of a multi-stage research endeavor designed to provide the missing marine vessel ontology. The first of the two stages performed to date introduces a novel training database called the Lister Littoral Database 900 (LLD-900), made up of over 900 high-quality images. These images consist of high-resolution color photos of marine vessels in working, active conditions, taken directly from the field and edited for best possible use. Segmentation masks of each boat have been developed to separate the image into foreground and background sections. Segmentation masks that include boat wakes as part of the foreground section are the final image type included; these support wake affordance detection algorithms, which rely on the small changes found in the wakes made by different moving vessels. Each of these three types of images is split into its respective general classification folder, which consists of a differing number of boat categories depending on the research stage. In the first stage of research, the initial database is tested using a simple, readily available classification algorithm known as the Nearest Neighbor Classifier. The accuracy of the database as a training set is tested and recorded, and potential improvements are documented.
The second stage incorporates these identified improvements and reconfigures the database before retesting the modifications using the same Nearest Neighbor Classifier along with two new methods, the K-Nearest Neighbor Classifier and the Min-Mean Distance Classifier. These additional algorithms are also readily available and offer basic classification testing using different techniques. Improvements in accuracy are calculated and recorded. Finally, further improvements for a possible third iteration are discussed. The goal of this research is to establish the basis for a training database to be used with classification algorithms to increase the security of ports, harbors, shipping channels and bays. The purpose of the database is to train existing and newly created algorithms to properly identify and classify all boats found in littoral areas so that anomalous behavior detection techniques can be applied to determine when a threat is present. This research represents the completion of the initial steps in accomplishing this goal, delivering a novel framework for use with littoral area marine vessel classification. The completed work is divided and presented in two separate papers written specifically for submission to and publication at appropriate conferences. When fully integrated with computer vision techniques, the database methodology and ideas presented in this thesis will help to provide a vital new level of security in littoral areas around the world.
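The nearest-neighbour classifiers named above can be sketched in a few lines once each image has been reduced to a fixed-length feature vector (a preprocessing step elided here). The toy features and class labels below are invented for illustration, not taken from the LLD-900.

```python
# Nearest-neighbour and k-nearest-neighbour classification over feature
# vectors: the query is assigned the label of its closest training sample,
# or the majority label among its k closest samples.
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_neighbor(train, query):
    """train: list of (feature_vector, label); returns label of the closest sample."""
    return min(train, key=lambda s: euclidean(s[0], query))[1]

def k_nearest_neighbors(train, query, k=3):
    """Majority vote among the k closest training samples."""
    nearest = sorted(train, key=lambda s: euclidean(s[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical 2-D features (e.g. hue, aspect ratio) for two vessel classes.
train = [((0.1, 0.9), "sailboat"), ((0.2, 0.8), "sailboat"),
         ((0.9, 0.2), "cargo"), ((0.8, 0.1), "cargo")]
print(nearest_neighbor(train, (0.15, 0.85)))        # closest sample is a sailboat
print(k_nearest_neighbors(train, (0.85, 0.15), k=3))
```

The Min-Mean Distance Classifier mentioned in the abstract works similarly but compares the query against each class's mean distance rather than individual samples.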

    Case Notes


    ENHANCED MULTI-LABEL CLASSIFICATION OF HETEROGENEOUS UNDERWATER SOUNDSCAPES BY CONVOLUTIONAL NEURAL NETWORKS USING BAYESIAN DEEP LEARNING

    The classification of underwater soundscapes is a challenging task for humans as well as machine learning systems. This is largely due to the heterogeneous nature of these soundscapes, especially in coastal zones close to human settlements, where multiple ships and other man-made and natural sound sources are often present simultaneously. This thesis proposes a Bayesian deep learning approach that can accurately classify multiple ships simultaneously present in the vicinity of a sensor (multi-label classification) while also providing an uncertainty measurement for the classification. This is achieved by assuming a Bayesian formulation of standard convolutional neural network architectures to not only assign multiple labels per inference but also to provide per-inference uncertainty. The best-performing Bayesian architecture on the multi-label task achieves a weighted F1 score of 0.84, where each prediction is accompanied by a measurement of uncertainty that is used to further enhance the understanding of model predictions. Ships, submarines, and unmanned underwater vehicles can use this classification system to aid in the identification, tracking, and/or targeting of contacts to help maintain safety of navigation, to aid in the real-time interdiction of illicit activities (such as drug or human smuggling and covert vessel transits), and to provide port security monitoring, while uncertainty filters can help sonar operators prioritize contacts for further analysis. Lieutenant Commander, United States Navy. Approved for public release; distribution is unlimited.
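The per-inference uncertainty described above can be sketched at the aggregation step: a Bayesian (or Monte Carlo dropout) network is sampled several times per input, the per-label mean of the sampled probabilities gives the multi-label decision, and the spread across samples gives the uncertainty. The forward passes, label names, and threshold below are fabricated for illustration and are not the thesis's architecture or data.

```python
# Aggregating T stochastic forward passes into a multi-label prediction
# with a per-label uncertainty estimate.
import statistics

def aggregate_passes(passes, labels, threshold=0.5):
    """passes: T lists of per-label probabilities from stochastic forward passes.
    Returns {label: (present, mean_prob, uncertainty)}."""
    result = {}
    for i, label in enumerate(labels):
        probs = [p[i] for p in passes]
        mean = statistics.mean(probs)
        std = statistics.pstdev(probs)   # spread across passes ~ uncertainty
        result[label] = (mean >= threshold, mean, std)
    return result

labels = ["cargo_ship", "tug", "small_craft"]   # hypothetical label set
passes = [                                       # e.g. T=4 Monte Carlo samples
    [0.92, 0.48, 0.05],
    [0.88, 0.61, 0.07],
    [0.95, 0.39, 0.04],
    [0.90, 0.55, 0.06],
]
for label, (present, mean, std) in aggregate_passes(passes, labels).items():
    print(f"{label}: present={present} p={mean:.2f} ±{std:.2f}")
```

Here the "tug" label sits near the threshold with a large spread, exactly the kind of uncertain prediction an operator-facing filter would flag for further analysis.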

    End-to-end anomaly detection in stream data

    Nowadays, huge volumes of data are generated with increasing velocity through various systems, applications, and activities. This increases the demand for stream and time series analysis to react to changing conditions in real time, for enhanced efficiency and quality of service delivery as well as improved safety and security in private and public sectors. Despite its very rich history, time series anomaly detection is still one of the vital topics in machine learning research and is receiving increasing attention. Identifying hidden patterns and selecting a model that fits the observed data well, and also carries over to unobserved data, is not a trivial task. Due to the increasing diversity of data sources and associated stochastic processes, this pivotal data analysis topic is loaded with challenges such as complex latent patterns, concept drift, and overfitting that may mislead a model and cause a high false alarm rate. Handling these challenges leads advanced anomaly detection methods to develop sophisticated decision logic, which turns them into opaque and inexplicable black boxes. Contrary to this trend, end users expect transparency and verifiability in order to trust a model and the outcomes it produces. Also, pointing users to the most anomalous or malicious regions of a time series and to the causal features could save them time, energy, and money. For these reasons, this thesis addresses the crucial challenges in an end-to-end pipeline of stream-based anomaly detection through three essential phases: behavior prediction, inference, and interpretation. The first step is focused on devising a time series model that achieves high average accuracy as well as small error deviation. On this basis, we propose higher-quality anomaly detection and scoring techniques that use the related contexts to reclassify observations and post-prune unjustified events.
Last but not least, we make the predictive process transparent and verifiable by providing meaningful reasoning behind its generated results, based on concepts understandable to a human. The provided insight can pinpoint the anomalous regions of a time series and explain why the current status of a system has been flagged as anomalous. Stream-based anomaly detection research is a principal area of innovation to support our economy, security, and even the safety and health of societies worldwide. We believe our proposed analysis techniques can contribute to building a situational awareness platform and open new perspectives in a variety of domains such as cybersecurity and health.
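As a minimal sketch of the prediction-and-scoring phase described above, a sliding-window z-score flags points that deviate strongly from recent behaviour. The window size, threshold, and toy signal are illustrative only; the thesis's actual models and context-based rescoring are far richer.

```python
# Stream anomaly scoring: model the "expected" next value as the mean of a
# sliding window, score each point by its deviation in units of the window's
# standard deviation, and flag scores above a threshold.
from collections import deque
import math

def stream_anomalies(stream, window=10, threshold=3.0):
    """Yield (index, value, score) for points deviating from the recent window."""
    recent = deque(maxlen=window)
    for i, x in enumerate(stream):
        if len(recent) == window:
            mean = sum(recent) / window
            var = sum((v - mean) ** 2 for v in recent) / window
            std = math.sqrt(var) or 1e-9   # guard against a perfectly flat window
            score = abs(x - mean) / std
            if score > threshold:
                yield i, x, score
        recent.append(x)

# A steady periodic signal with one injected spike at index 25.
data = [10.0 + 0.1 * (i % 3) for i in range(50)]
data[25] = 25.0
print(list(stream_anomalies(data)))
```

Even this baseline illustrates the false-alarm problem the thesis targets: under concept drift the window statistics lag the signal, which is where context-aware reclassification and post-pruning come in.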