21 research outputs found

    Global Pattern Search at Scale

    In recent years, data collection has far outpaced the tools for data analysis in the area of non-traditional GEOINT analysis. Traditional tools are designed to analyze small-scale numerical data, but there are few good interactive tools for processing large amounts of unstructured data such as raw text. In addition to the complexities of data processing, presenting the data in a way that is meaningful to the end user poses another challenge. In our work, we focused on analyzing a corpus of 35,000 news articles and creating an interactive geovisualization tool to reveal patterns to human analysts. Our comprehensive tool, Global Pattern Search at Scale (GPSS), addresses three major problems in data analysis: free text analysis, high volumes of data, and interactive visualization. GPSS uses an Accumulo database for high-volume data storage, and a matrix of word counts and event detection algorithms to process the free text. For visualization, the tool presents an interactive web application to the user, featuring a map overlaid with document clusters and events, search and filtering options, a timeline, and a word cloud. In addition, the GPSS tool can be easily adapted to process and understand other large free-text datasets.

    Rapid prototyping of radar algorithms [Applications Corner]

    Rapid prototyping of advanced signal processing algorithms is critical to developing new radars. Signal processing engineers usually use high-level languages like MATLAB, IDL, or Python to develop advanced algorithms and to determine the optimal parameters for these algorithms. Many of these algorithms have very long execution times due to computational complexity and/or very large data sets, which hinders an efficient engineering development workflow. That is, signal processing engineers must wait hours, or even days, to get the results of the current algorithm, parameters, and data set before making changes and refinements for the next iteration. In the meantime, the engineer may have thought of several more permutations that he or she wants to test.

    High-productivity software development with pMatlab

    In this paper, we explore the ease of tackling a communication-intensive parallel computing task, namely the 2D fast Fourier transform (FFT). We start with a simple serial Matlab code, explore in detail a 1D parallel FFT, and illustrate how it can be extended to multidimensional FFTs. © 2010 IEEE. United States. Air Force. (Air Force contract FA8721-05-C-0002)

    A Comparison Study of Static Mapping Heuristics for a Class of Meta-tasks on Heterogeneous Computing Systems

    Heterogeneous computing (HC) environments are well suited to meet the computational demands of large, diverse groups of tasks (i.e., a meta-task). The problem of mapping (defined as matching and scheduling) these tasks onto the machines of an HC environment has been shown, in general, to be NP-complete, requiring the development of heuristic techniques. Selecting the best heuristic to use in a given environment, however, remains a difficult problem, because comparisons are often clouded by different underlying assumptions in the original studies of each heuristic. Therefore, a collection of eleven heuristics from the literature has been selected, implemented, and analyzed under one set of common assumptions. The eleven heuristics examined are Opportunistic Load Balancing, User-Directed Assignment, Fast Greedy, Min-min, Max-min, Greedy, Genetic Algorithm, Simulated Annealing, Genetic Simulated Annealing, Tabu, and A*. This study provides one even basis for comparison and insights into circumstances where one technique will outperform another. The evaluation procedure is specified, the heuristics are defined, and then selected results are compared.

    A comparison of eleven static heuristics for mapping a class of independent tasks onto heterogeneous distributed computing systems

    The article of record as published may be located at http://dx.doi.org/10.1006/jpdc.2000.1714. Journal of Parallel and Distributed Computing 61, 810-837 (2001). This research was supported in part by the DARPA ITO Quorum Program project called MSHN (management system for heterogeneous networks). MSHN was a collaborative research effort among the Naval Postgraduate School, NOEMIX, Purdue University, and the University of Southern California. One objective of MSHN was to design and evaluate mapping heuristics for different types of HC environments. This research was supported in part by the DARPA ITO Quorum Program under NPS Subcontracts N62271-98-M-0217 and N62271-98-M-0448, and under the GSA Subcontract GS09K99BH0250. Some of the equipment used was donated by Intel and Microsoft.

    A Taxonomy for Describing Matching and Scheduling Heuristics for Mixed-Machine Heterogeneous Computing Systems

    The problem of mapping (defined as matching and scheduling) tasks and communications onto multiple machines and networks in a heterogeneous computing (HC) environment has been shown to be NP-complete, in general, requiring the development of heuristic techniques. Many different types of mapping heuristics have been developed in recent years. However, selecting the best heuristic to use in any given scenario remains a difficult problem. Factors making this selection difficult are discussed. Motivated by these difficulties, a new taxonomy for classifying mapping heuristics for HC environments is proposed ("the Purdue HC Taxonomy"). The taxonomy is defined in three major parts: (1) the models used for applications and communication requests, (2) the models used for target hardware platforms, and (3) the characteristics of mapping heuristics. Each part of the taxonomy is described, with examples given to help clarify the taxonomy. The benefits and uses of this taxonomy are also discussed.

    Characterizing resource allocation heuristics for heterogeneous computing systems

    Includes bibliographical references (pages 122-128). In many distributed computing environments, collections of applications need to be processed using a set of heterogeneous computing (HC) resources to maximize some performance goal. An important research problem in these environments is how to assign resources to applications (matching) and order the execution of the applications (scheduling) so as to maximize some performance criterion without violating any constraints. This process of matching and scheduling is called mapping. To make meaningful comparisons among mapping heuristics, a system designer needs to understand the assumptions made by the heuristics for (1) the model used for the application and communication tasks, (2) the model used for system platforms, and (3) the attributes of the mapping heuristics. This chapter presents a three-part classification scheme (3PCS) for HC systems. The 3PCS is useful for researchers who want to (a) understand a mapper given in the literature, (b) describe their design of a mapper more thoroughly by using a common standard, and (c) select a mapper to match a given real-world environment.

    TabulaROSA: Tabular Operating System Architecture for Massively Parallel Heterogeneous Compute Engines

    The rise in computing hardware choices is driving a reevaluation of operating systems. The traditional role of an operating system controlling the execution of its own hardware is evolving toward a model whereby the controlling processor is distinct from the compute engines that are performing most of the computations. In this context, an operating system can be viewed as software that brokers and tracks the resources of the compute engines and is akin to a database management system. To explore the idea of using a database in an operating system role, this work defines key operating system functions in terms of rigorous mathematical semantics (associative array algebra) that are directly translatable into database operations. These operations possess a number of mathematical properties that are ideal for parallel operating systems by guaranteeing correctness over a wide range of parallel operations. The resulting operating system equations provide a mathematical specification for a Tabular Operating System Architecture (TabulaROSA) that can be implemented on any platform. Simulations of forking in TabulaROSA are performed using an associative array implementation and compared to Linux on a 32,000+ core supercomputer. Using over 262,000 forkers managing over 68,000,000,000 processes, the simulations show that TabulaROSA has the potential to perform operating system functions on a massively parallel scale. The TabulaROSA simulations show 20x higher performance as compared to Linux while managing 2000x more processes in fully searchable tables. United States. Department of Defense. Assistant Secretary of Defense for Research & Engineering (Air Force Contract No. FA8721-05-C-0002). United States. Department of Defense. Assistant Secretary of Defense for Research & Engineering (Air Force Contract No. FA8702-15-D-0001)