1,055 research outputs found

    The Unbalanced Classification Problem: Detecting Breaches in Security

    This research proposes several methods designed to improve solutions for security classification problems. The security classification problem involves unbalanced, high-dimensional, binary classification problems that are prevalent today. The imbalance within this data involves a significant majority of the negative class and a minority positive class. Any system that needs protection from malicious activity, intruders, theft, or other types of breaches in security must address this problem. These breaches in security are considered instances of the positive class. Given numerical data that represent observations or instances which require classification, state-of-the-art machine learning algorithms can be applied. However, the unbalanced and high-dimensional structure of the data must be considered prior to applying these learning methods. High-dimensional data poses a “curse of dimensionality” which can be overcome through the analysis of subspaces. Exploration of intelligent subspace modeling and the fusion of subspace models is proposed. Detailed analysis of the one-class support vector machine, as well as its weaknesses and proposals to overcome these shortcomings, is included. A fundamental method for evaluation of the binary classification model is the receiver operating characteristic (ROC) curve and the area under the curve (AUC). This work details the underlying statistics involved with ROC curves, contributing a comprehensive review of ROC curve construction and analysis techniques, including a novel graphic for illustrating the connection between ROC curves and classifier decision values. The major innovations of this work include synergistic classifier fusion through the analysis of ROC curves and rankings, insight into the statistical behavior of the Gaussian kernel, and novel methods for applying machine learning techniques to defend against computer intrusion. 
The primary empirical vehicle for this research is computer intrusion detection data, and both host-based intrusion detection systems (HIDS) and network-based intrusion detection systems (NIDS) are addressed. Empirical studies also include military tactical scenarios
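
    The connection between classifier decision values and the ROC curve mentioned in the abstract above can be illustrated with a minimal sketch. It is not taken from the thesis; the function names `roc_curve` and `auc` and the toy data are assumptions made for illustration. Each distinct decision value is treated as a threshold, and the AUC is computed with the trapezoidal rule.

```python
def roc_curve(labels, scores):
    """Return ROC points (FPR, TPR) from binary labels and decision values.

    labels: iterable of 0/1 (1 = positive class, e.g. a security breach).
    scores: classifier decision values; higher means "more positive".
    Names and conventions here are illustrative assumptions.
    """
    labels = list(labels)
    scores = list(scores)
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative instance")

    # Rank instances by decreasing decision value.
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])

    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        # Instances with tied decision values cross the threshold together.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def auc(points):
    """Area under a piecewise-linear ROC curve (trapezoidal rule)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


if __name__ == "__main__":
    # Toy, heavily simplified data: two negatives and two positives.
    pts = roc_curve([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
    print(pts)
    print(auc(pts))  # 0.75
```

    Because the curve depends only on the ranking of the decision values, it is unaffected by class imbalance in a way that raw accuracy is not, which is one reason ROC analysis is commonly used for unbalanced problems.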

    Modeling spatial uncertainties in geospatial data fusion and mining

    Geospatial data analysis relies on Spatial Data Fusion and Mining (SDFM), which depends heavily on the topology and geometry of spatial objects. Capturing and representing geometric characteristics such as orientation, shape, proximity, similarity, and their measurement are of the highest interest in SDFM. Representing the uncertain and dynamically changing topological structure of spatial objects, including social and communication networks, roads, and waterways, under the influence of noise, obstacles, temporary loss of communication, and other factors is another challenge. The spatial distribution of a dynamic network is a complex and dynamic mixture of its topology and geometry. Historically, the separation of topology and geometry in mathematics was motivated by the need to separate the invariant part of the spatial distribution (topology) from the less invariant part (geometry). Geometric characteristics such as orientation, shape, and proximity are not invariant. This separation between geometry and topology was made under the assumption that the topological structure is certain and does not change over time. New challenges in dealing with dynamic and uncertain topological structure require a reexamination of this fundamental assumption. In previous work we proposed a dynamic logic methodology for capturing, representing, and recording uncertain and dynamic topology and geometry jointly for spatial data fusion and mining. This work presents a further elaboration and formalization of this methodology as well as its application to modeling vector-to-vector and raster-to-vector conflation/registration problems and automated feature extraction from imagery

    MapSnap System to Perform Vector-to-Raster Fusion

    As the availability of geospatial data increases, there is a growing need to match these datasets together. However, since these datasets often vary in their origins and spatial accuracy, they frequently do not correspond well to each other, which creates multiple problems. To accurately align vectors with imagery, analysts currently either: 1) manually move the vectors, 2) perform a labor-intensive spatial registration of vectors to imagery, 3) move imagery to vectors, or 4) redigitize the vectors from scratch and transfer the attributes. All of these are time-consuming and labor-intensive operations. Automated matching and fusing of vector datasets has been a subject of research for years, and strides are being made. However, much less has been done with matching or fusing vector and raster data. While there are initial forays into this research area, the approaches are not robust. The objective of this work is to design and build robust software called MapSnap to conflate vector and image data in an automated/semi-automated manner. This paper reports the status of the MapSnap project, which includes: (i) the overall algorithmic approach and system architecture, (ii) a tiling approach to deal with large datasets to tune MapSnap parameters, (iii) a time comparison of MapSnap with redigitizing the vectors from scratch and transferring the attributes, and (iv) an accuracy comparison of MapSnap with manual adjustment of vectors. The paper concludes with a discussion of future work, including addressing the general problem of continuous and rapid updating of vector data, and fusing vector data with other data

    Self-Organizing Information Fusion and Hierarchical Knowledge Discovery: A New Framework Using Artmap Neural Networks

    Classifying novel terrain or objects from sparse, complex data may require the resolution of conflicting information from sensors working at different times, locations, and scales, and from sources with different goals and situations. Information fusion methods can help resolve inconsistencies, as when evidence variously suggests that an object's class is car, truck, or airplane. The methods described here address a complementary problem, supposing that information from sensors and experts is reliable though inconsistent, as when evidence suggests that an object's class is car, vehicle, and man-made. Underlying relationships among classes are assumed to be unknown to the automated system or the human user. The ARTMAP information fusion system uses distributed code representations that exploit the neural network's capacity for one-to-many learning in order to produce self-organizing expert systems that discover hierarchical knowledge structures. The fusion system infers multi-level relationships among groups of output classes, without any supervised labeling of these relationships. The procedure is illustrated with two image examples, but is not limited to the image domain. Air Force Office of Scientific Research (F49620-01-1-0423); National Geospatial-Intelligence Agency (NMA 201-01-1-2016, NMA 501-03-1-2030); National Science Foundation (SBE-0354378, DGE-0221680); Office of Naval Research (N00014-01-1-0624); Department of Homeland Securit

    Automated Vector-to-Raster Image Registration

    The variability of panchromatic and multispectral images, vector data (maps) and DEM models is growing. Accordingly, the requests and challenges to correlate, match, co-register, and fuse them are growing. Data to be integrated may have inaccurate and contradictory geo-references or not have them at all. Alignment of vector (feature) and raster (image) geospatial data is a difficult and time-consuming process when transformational relationships between the two are nonlinear. Robust solutions and commercial software products that address these challenges do not yet exist. In the proposed approach for Vector-to-Raster Registration (VRR) the candidate features are auto-extracted from imagery, vectorized, and compared against the existing vector layer(s) to be registered. Given that available automated feature extraction (AFE) methods quite often produce false features and miss some features, we use additional information to improve AFE. This information is the existing vector data, but the vector data are not perfect either. To deal with this problem the VRR process uses an algebraic structural algorithm (ASA), a similarity transformation of local features algorithm (STLF), and a multi-loop process that repeats the AFE-VRR process several times. The experiments show that the approach was successful in registering road vectors to commercial panchromatic and multi-spectral imagery
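
    To make the notion of a similarity transformation between vector features and image features concrete, here is a minimal sketch. It is a generic least-squares fit, not the ASA or STLF algorithm from the paper, whose details are not given in the abstract. The function names `fit_similarity` and `apply_similarity` and the sample points are assumptions. A 2D point (x, y) is treated as the complex number x + iy, so a similarity transformation (uniform scale, rotation, translation) becomes w = a*z + b.

```python
import cmath


def fit_similarity(src, dst):
    """Least-squares 2D similarity transform mapping src points onto dst.

    src, dst: equal-length lists of (x, y) pairs in matching order,
    e.g. vector-layer junctions and the image features they correspond to.
    Returns complex (a, b) with dst ~ a * src + b, where abs(a) is the
    scale and cmath.phase(a) is the rotation angle in radians.
    """
    if len(src) != len(dst) or len(src) < 2:
        raise ValueError("need at least two matched point pairs")
    z = [complex(x, y) for x, y in src]
    w = [complex(x, y) for x, y in dst]
    n = len(z)
    zc = sum(z) / n  # centroid of source points
    wc = sum(w) / n  # centroid of destination points
    denom = sum(abs(p - zc) ** 2 for p in z)
    if denom == 0:
        raise ValueError("source points must not all coincide")
    # Closed-form solution of the complex linear regression w = a*z + b.
    a = sum((q - wc) * (p - zc).conjugate() for p, q in zip(z, w)) / denom
    b = wc - a * zc
    return a, b


def apply_similarity(a, b, points):
    """Apply w = a*z + b to a list of (x, y) pairs."""
    out = []
    for x, y in points:
        w = a * complex(x, y) + b
        out.append((w.real, w.imag))
    return out


if __name__ == "__main__":
    # Hypothetical matched points: dst is src scaled by 2, rotated by
    # 90 degrees, and shifted by (3, 4).
    src = [(0, 0), (1, 0), (0, 1)]
    dst = [(3, 4), (3, 6), (1, 4)]
    a, b = fit_similarity(src, dst)
    print(abs(a), cmath.phase(a), b)
    print(apply_similarity(a, b, src))
```

    A single global fit like this cannot model the nonlinear relationships the abstract mentions; one common workaround is to fit such transforms locally, on small neighborhoods of matched features.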

    Workshop sensing a changing world : proceedings workshop November 19-21, 2008


    Language-Based Access to Large Sensor Repositories

    Sensor data have broadened their scope recently, ranging now from simple time series measurements to, e.g., time series of hyperspectral satellite image maps. In addition to observed data, simulation data increasingly have to be merged, for example 4-D ocean and atmospheric data. The majority of these data fall into the category of multi-dimensional rasters. However, when it comes to flexible retrieval, including sensor data search, aggregation, analysis, fusion, etc., standard query language support in the past has not kept up with the service level of, e.g., metadata retrieval. To close this gap, the Open Geospatial Consortium (OGC) issued the Web Coverage Processing Service (WCPS) Standard in December 2008. WCPS defines a request language for multi-dimensional raster data, suitable for specifying navigation, download, and analysis of sensor, image, and statistics data. This contribution emphasises sensor data modeling and the perspectives for an integrated, cross-dimensional sensor data retrieval. Further, the WCPS reference implementation is briefly discussed

    AREAL FEATURE MATCHING BASED ON SIMILARITY USING CRITIC METHOD


    Multispectral Image Road Extraction Based Upon Automated Map Conflation

    Road network extraction from remotely sensed imagery enables many important and diverse applications such as vehicle tracking, drone navigation, and intelligent transportation studies. There are, however, a number of challenges to road detection from an image. Road pavement material, width, direction, and topology vary across a scene. Complete or partial occlusions caused by nearby buildings, trees, and the shadows cast by them, make maintaining road connectivity difficult. The problems posed by occlusions are exacerbated with the increasing use of oblique imagery from aerial and satellite platforms. Further, common objects such as rooftops and parking lots are made of materials similar or identical to road pavements. This problem of common materials is a classic case of a single land cover material existing for different land use scenarios. This work addresses these problems in road extraction from geo-referenced imagery by leveraging the OpenStreetMap digital road map to guide image-based road extraction. The crowd-sourced cartography has the advantages of worldwide coverage that is constantly updated. The derived road vectors follow only roads and so can serve to guide image-based road extraction with minimal confusion from occlusions and changes in road material. On the other hand, the vector road map has no information on road widths and misalignments between the vector map and the geo-referenced image are small but nonsystematic. Properly correcting misalignment between two geospatial datasets, also known as map conflation, is an essential step. A generic framework requiring minimal human intervention is described for multispectral image road extraction and automatic road map conflation. The approach relies on the road feature generation of a binary mask and a corresponding curvilinear image. A method for generating the binary road mask from the image by applying a spectral measure is presented. 
The spectral measure, called anisotropy-tunable distance (ATD), differs from conventional measures and is created to account for both changes of spectral direction and spectral magnitude in a unified fashion. The ATD measure is particularly suitable for differentiating urban targets such as roads and building rooftops. The curvilinear image provides estimates of the width and orientation of potential road segments. Road vectors derived from OpenStreetMap are then conflated to image road features by applying junction matching and intermediate point matching, followed by refinement with mean-shift clustering and morphological processing to produce a road mask with piecewise width estimates. The proposed approach is tested on a set of challenging, large, and diverse image data sets and the performance accuracy is assessed. The method is effective for road detection and width estimation of roads, even in challenging scenarios when extensive occlusion occurs

    Semantically-Enabled Sensor Plug & Play for the Sensor Web

    Environmental sensors have continuously improved by becoming smaller, cheaper, and more intelligent over the past years. As a consequence of these technological advancements, sensors are increasingly deployed to monitor our environment. The large variety of available sensor types with often incompatible protocols complicates the integration of sensors into observing systems. The standardized Web service interfaces and data encodings defined within OGC's Sensor Web Enablement (SWE) framework make sensors available over the Web and hide the heterogeneous sensor protocols from applications. So far, the SWE framework does not describe how to integrate sensors on-the-fly with minimal human intervention. The driver software which enables access to sensors has to be implemented and the measured sensor data has to be manually mapped to the SWE models. In this article we introduce a Sensor Plug & Play infrastructure for the Sensor Web by combining (1) semantic matchmaking functionality, (2) a publish/subscribe mechanism underlying the Sensor Web, as well as (3) a model for the declarative description of sensor interfaces which serves as a generic driver mechanism. We implement and evaluate our approach by applying it to an oil spill scenario. The matchmaking is realized using existing ontologies and reasoning engines and provides a strong case for the semantic integration capabilities provided by Semantic Web research