24 research outputs found

    The Connectome Viewer Toolkit: An Open Source Framework to Manage, Analyze, and Visualize Connectomes

    Advanced neuroinformatics tools are required for methods of connectome mapping, analysis, and visualization. The inherent multi-modality of connectome datasets poses new challenges for data organization, integration, and sharing. We have designed and implemented the Connectome Viewer Toolkit – a set of free and extensible open source neuroimaging tools written in Python. The key components of the toolkit are as follows: (1) The Connectome File Format is an XML-based container format to standardize multi-modal data integration and structured metadata annotation. (2) The Connectome File Format Library enables management and sharing of connectome files. (3) The Connectome Viewer is an integrated research and development environment for visualization and analysis of multi-modal connectome data. The Connectome Viewer's plugin architecture supports extensions with network analysis packages and an interactive scripting shell, to enable easy development and community contributions. Integration with tools from the scientific Python community allows the leveraging of numerous existing libraries for powerful connectome data mining, exploration, and comparison. We demonstrate the applicability of the Connectome Viewer Toolkit using Diffusion MRI datasets processed by the Connectome Mapper. The Connectome Viewer Toolkit is available from http://www.cmtk.org
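    The kind of analysis the Connectome Viewer's scripting shell enables can be illustrated with a short, hedged sketch. It is not the Connectome File Format Library's own API, only an example of graph analysis with NetworkX (one of the scientific Python libraries the toolkit integrates with); the region names and edge weights are invented for illustration.

```python
# Hypothetical sketch: analyzing a connectome represented as a weighted
# NetworkX graph, the kind of object available from the Connectome Viewer's
# interactive scripting shell. Region names and weights are made up.
import networkx as nx

# Toy structural network: nodes are brain regions, edge weights stand in
# for fiber counts from diffusion MRI tractography.
G = nx.Graph()
G.add_weighted_edges_from([
    ("precentral_L", "postcentral_L", 120.0),
    ("precentral_L", "superiorfrontal_L", 80.0),
    ("postcentral_L", "superiorparietal_L", 95.0),
    ("superiorfrontal_L", "superiorparietal_L", 30.0),
])

# Simple graph-theoretic measures commonly used in connectome studies.
degree = dict(G.degree(weight="weight"))
betweenness = nx.betweenness_centrality(G, weight="weight")
clustering = nx.clustering(G, weight="weight")

for node in G.nodes:
    print(f"{node:22s} degree={degree[node]:7.1f} "
          f"betweenness={betweenness[node]:.3f} "
          f"clustering={clustering[node]:.3f}")
```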

    A Survey on Compiler Autotuning using Machine Learning

    Since the mid-1990s, researchers have applied machine-learning approaches to a number of different compiler optimization problems. These techniques primarily enhance the quality of the obtained results and, more importantly, make it feasible to tackle two main compiler optimization problems: optimization selection (choosing which optimizations to apply) and phase-ordering (choosing the order in which to apply them). The compiler optimization space continues to grow due to the advancement of applications, the increasing number of compiler optimizations, and new target architectures. Generic optimization passes in compilers cannot fully leverage newly introduced optimizations and therefore cannot keep up with the growing number of options. This survey summarizes and classifies the recent advances in using machine learning for compiler optimization, particularly on the two major problems of (1) selecting the best optimizations and (2) the phase-ordering of optimizations. The survey highlights the approaches taken so far, the results obtained, a fine-grained classification of the different approaches and, finally, the influential papers of the field. Comment: version 5.0 (updated September 2018), preprint of the version accepted at ACM CSUR 2018 (42 pages); the survey is updated quarterly. History: Received November 2016; Revised August 2017; Revised February 2018; Accepted March 2018
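    As a hedged illustration of the optimization-selection problem described above (not a method from the survey itself), the sketch below treats flag selection as supervised classification: program features map to the flag set that performed best in earlier measurements. The feature values, flag sets, and training data are entirely synthetic placeholders.

```python
# Toy illustration of ML-based optimization selection with synthetic data.
# A classifier learns to map program features to the compiler flag set that
# was fastest in prior measurements; nothing here reflects a real compiler
# or the survey's own experiments.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical program features: [loop_count, avg_loop_depth, mem_ops_ratio]
X_train = np.array([
    [ 2, 1, 0.10],
    [40, 3, 0.55],
    [12, 2, 0.30],
    [55, 4, 0.70],
    [ 5, 1, 0.20],
])
# Label = index of the flag set that gave the best runtime for that program.
flag_sets = ["-O2", "-O3 -funroll-loops", "-O3 -fvectorize"]
y_train = np.array([0, 2, 1, 2, 0])

model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Suggest a flag set for an unseen program's features.
new_program = np.array([[30, 3, 0.45]])
print("suggested flags:", flag_sets[model.predict(new_program)[0]])
```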

    The Analysis of Open Source Software and Data for Establishment of GIS Services Throughout the Network in a Mapping Organization at National or International Level

    Federal agencies and their partners collect and manage large amounts of geospatial data, but those data are often not easily found when needed, and sometimes data are collected or purchased multiple times. In short, the best government data are not always organized and managed efficiently enough to support decision making in a timely and cost-effective manner. National mapping agencies, and the various departments and authorities responsible for collecting different types of geospatial data, cannot for much longer continue to operate as they did a few years ago, like people living on an island. Leaders need to look at what is now possible that was not possible before, considering capabilities such as cloud computing, crowd-sourced data collection, openly available remotely sensed data, multi-source information vital for decision making, and new Web-accessible services that provide, sometimes at no cost, capabilities that previously could be obtained only from local GIS experts. These authorities need to consider the available solutions, gather information about new capabilities, reconsider agency missions and goals, review and revise policies, make budget and human resource decisions, and evaluate new products, cloud services, and cloud service providers. To do so, we need to choose the right tools to reach the above-mentioned goals.

    As we know, data collection is the most cost-intensive part of mapping and of establishing a geographic information system: not only because of the direct cost of the data collection task itself, but also because of the damage caused by delay, that is, the time it takes to bring the information needed for decision making from the field into the user's hands. In fact, the time a project spends on collecting, processing, and presenting geospatial information has an even greater effect on the cost of the larger projects it serves, such as disaster management, construction, city planning, or environmental monitoring, assuming that all the necessary information from existing sources is delivered directly to the user's computer. A good GIS project optimization or improvement is therefore best described as a methodology that reduces time and cost while increasing data and service quality, meaning accuracy, up-to-dateness, completeness, consistency, suitability, information content, integrity, integration capability, and fitness for use, as well as the user's specific needs and conditions, which must be addressed with special attention. Each of these issues must be addressed individually, and at the same time the overall solution must be provided in a global manner that considers all the criteria.

    In this thesis, we first discuss the problem we are facing and what is needed for the establishment of a National Spatial Data Infrastructure (NSDI), its definition and related components. We then look for available Open Source Software solutions to cover the whole process: data collection, database management, data processing, and finally data services and presentation. The first distinction among software packages is whether they are open source and free, or commercial and proprietary. To make this distinction it is necessary to define a clear specification for the categorization, because from a legal point of view it is sometimes difficult to tell which class a piece of software belongs to, and the various terms therefore need to be clarified. Within each of these two global groups we distinguish a further classification according to the functionalities and applications the software is made for in GIScience.

    Building on the outcome of the second chapter, which defines the technical process for selecting suitable and reliable software according to the characteristics of the users' needs and the required components, Chapter 3 elaborates on the details of the GeoNode software as our best candidate tool to take on the responsibilities stated above. Chapter 4 discusses the globally available Open Source Data against predefined data quality criteria (such as theme, data content, scale, licensing, and coverage) according to the metadata statements inside the datasets, by means of bibliographic review, technical documentation, and web search engines. Chapter 5 discusses further data quality concepts and defines a set of protocols for evaluating all the datasets according to the tasks for which a mapping organization is, in general, responsible towards its potential users in different disciplines such as reconnaissance, city planning, topographic mapping, transportation, environmental control, and disaster management. In Chapter 6, all the data quality assessments and protocols are applied to the pre-filtered, proposed datasets; in the resulting scores and ranking, each dataset receives a value corresponding to its quality according to the rules defined in the previous chapter. In the last step, a weight vector derived from questions answered by the user with reference to the project at hand is used to finalize the selection of the most appropriate Free and Open Source Data: the data quality preference is defined by identifying a weight vector, which is then applied to the quality matrix to obtain the final quality scores and ranking. At the end of that chapter, a section presents the utilization of the datasets in projects such as "Early Impact Analysis" and the "Extreme Rainfall Detection System (ERDS), version 2" carried out by ITHACA. Finally, the conclusion discusses the important criteria as well as future trends in GIS software, and recommendations are presented
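    The dataset-ranking step described above (a weight vector applied to a quality matrix to obtain final scores) can be illustrated with a minimal sketch; the criteria, datasets, scores, and weights below are hypothetical and are not taken from the thesis.

```python
# Minimal sketch of weighted data-quality scoring: a user-defined weight
# vector is applied to a dataset-by-criterion quality matrix, and the
# datasets are ranked by their weighted scores. All values are invented.
import numpy as np

criteria = ["accuracy", "up-to-dateness", "completeness", "licensing", "coverage"]
datasets = ["Dataset A", "Dataset B", "Dataset C"]

# Quality matrix: one row per dataset, one column per criterion (0..1 scale).
quality = np.array([
    [0.9, 0.6, 0.8, 1.0, 0.7],
    [0.7, 0.9, 0.6, 0.8, 0.9],
    [0.8, 0.5, 0.9, 0.6, 0.8],
])

# Weight vector derived from the user's answers about the project at hand;
# weights are normalized so the final score stays on the 0..1 scale.
weights = np.array([0.3, 0.25, 0.2, 0.1, 0.15])
weights = weights / weights.sum()

scores = quality @ weights
for name, score in sorted(zip(datasets, scores), key=lambda p: -p[1]):
    print(f"{name}: {score:.3f}")
```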

    Problems of Human Resources Information Management and the Specifics of Their Solution

    The article explores one of the traditional functional areas of enterprise management, human resources management, and the components of its multi-component information environment. Traditional, usually manufacturing-oriented, enterprises are controlled according to the functions of the activity: many operating divisions specialize in carrying out certain tasks and functions, i.e. every department or unit focuses on specific information technology applications that are not integrated. Rapid changes in the modern activity environment, however, push enterprises to move from classical functional management approaches (ineffective databases that are of marginal use and duplicative of one another, and operational systems that cannot adequately provide the information needed for enterprise control) towards more adaptive, contemporary information processing models, knowledge-based enterprises and process management (computer-aided knowledge bases, automatic information exchange, and a structured, metadata-oriented way of working). Are databases, as suggested above, really becoming increasingly unmanageable and ineffective? Slow information processing not only costs money, but also endangers competitiveness and makes users unhappy. It should be noted, however, that every functional area and group of users in the enterprise has its own purpose, subjects and management structure, and therefore different information needs and requirements. Organizational information systems therefore need to be constantly maintained and adapted to their surroundings.

    The article presents and critically analyzes theoretical and practical aspects of human resources (employee) and information management. It first introduces 1) the major problems of information management (e.g., data integration and interoperability of systems, and why business users often lack direct access to important business data); 2) the formation and generation of business processes, business information flows and the information structure (information system) and its development; and finally examines 3) possible changes in the information infrastructure of the human resource development sector, presenting a general framework for an enterprise's human resource information system based on a metadata management model and the usage associated with it (e.g., discovery, extraction, acquisition, distribution). Nowadays, human resources management is being renewed in enterprises and is becoming one of the fundamental functions of activity management. Unfortunately, most business and industrial enterprises in the country often lack the capacity to effectively manage (identify, collect, store and maintain) their real information resources, and lack the ability to perform systems analysis, modelling, and re-building or re-engineering of legacy applications and activity processes. The article presents several relatively simple, practical, but effective techniques (specific adaptations of technologies) that allow an increase in the effectiveness of information systems by continually improving, reviewing and controlling the existing data in the databases [...]

    Purpose – to examine, theoretically and practically, and to evaluate in the context of information technologies the management of, and changes in, human resources activity processes and information. To discuss solutions and aspects of the management and organization of information activity and the possibilities of their effective application with the aim of improving the information infrastructure in the area of executing and managing human resources activity processes. To substantiate the usefulness and applicability of an activity management information system extended with metadata management.

    Methodology – 1) analysis of scientific literature sources, to discuss and evaluate the problems and topical issues of human resources and information management and progressive changes in the management of information activity; 2) an empirical study of a manufacturing enterprise's personnel management information system, documents and information sources, together with an analysis of data flows, in order to investigate and reveal the existing interrelations and composition of the information units of this functional area and the problems of data storage and management, and to substantiate actions for improving the quality of information activity; 3) graphical representation and modelling of activity processes and data, to illustrate real situations in the management of human resources activity processes, information interactions and system states, to reveal solutions, approaches and operating principles of centralized and flexible information management and improvement, and to identify possibilities for applying metadata to improve the functionality of the information system (IS).

    Results – the study seeks to reveal why the information systems supporting human resources (HR) management and data processing are used and maintained insufficiently rationally, and what new information requirements are raised by today's activity environment and its rapid change, which encourages the search for new, effective forms of management and new possibilities of access to data and information objects in the area of HR management. The work provides a detailed analysis of the problems in the area of personnel information resource management; solutions and models for improving the management of HR information activity are presented, defined and illustrated with figures, their purpose is argued and their content is revealed. Finally, a metadata management scheme intended to improve IS functionality is presented.

    Research limitations – only a small part of the problems of information support for human resources management (a single activity subsystem) is presented, and only certain measures and approaches for ensuring effectiveness, aimed at improving the quality of the information system's operation and the continuity of the activity, are provided.

    Practical significance – the theoretical and empirical research carried out in the work contributes to a stronger understanding of the information support of HR management. The study showed that for a long time this functional area was assigned only a supporting role in enterprises, which is why many problems have accumulated. Changes of practical relevance taking place in the management of HR information and related processes are presented, and, in pursuit of a coherent enterprise information environment and better management of information content, various solution methods are employed and the implementation issues valuable to practitioners are revealed.

    Originality/value – the smooth and successful practical implementation, development or renewal of an information technology (IT) infrastructure requires not only thorough knowledge of IT products, but also the ability to evaluate, understand and formalize (model, algorithmize) the constantly changing digital environment of information and knowledge, to identify activity problems, and to plan change in the information subsystem in order to meet the distinctive needs of system users, that is, to find effective ways [...]

    Protecting Systems From Exploits Using Language-Theoretic Security

    Any computer program processing input from the user or network must validate that input. Input-handling vulnerabilities occur in programs when the software component responsible for filtering malicious input, the parser, does not perform validation adequately. Consequently, parsers are among the most targeted components, since they defend the rest of the program from malicious input. This thesis adopts the Language-Theoretic Security (LangSec) principle to understand what tools and research are needed to prevent exploits that target parsers. LangSec proposes specifying the syntactic structure of the input format as a formal grammar. We then build a recognizer for this formal grammar to validate any input before the rest of the program acts on it. To ensure that these recognizers faithfully represent the data format, programmers often rely on parser generator or parser combinator tools to build the parsers. This thesis advances several sub-fields of LangSec by proposing new techniques to find bugs in implementations, novel categorizations of vulnerabilities, and new parsing algorithms and tools to handle practical data formats. To this end, the thesis comprises five parts that tackle various tenets of LangSec. First, I categorize input-handling vulnerabilities and exploits using two frameworks. The first, the mismorphisms framework, helps us reason about the root causes leading to various vulnerabilities. The second is a categorization framework built from LangSec anti-patterns, such as parser differentials and insufficient input validation. We then built a catalog of more than 30 popular vulnerabilities to demonstrate both categorization frameworks. Second, I built parsers for various Internet of Things and power grid network protocols and for the iccMAX file format using parser combinator libraries. The parsers I built for power grid protocols were deployed and tested on power grid substation networks as an intrusion detection tool. The parser I built for the iccMAX file format led to several corrections and modifications to the iccMAX specifications and reference implementations. Third, I present SPARTA, a novel tool I built that generates Rust code to type check Portable Document Format (PDF) files. The type checker I helped build strictly enforces the constraints in the PDF specification to find deviations. Our checker has contributed to at least four significant clarifications and corrections to the PDF 2.0 specification and to various open-source PDF tools. In addition to the checker, we also built a practical tool, PDFFixer, to dynamically patch type errors in PDF files. Fourth, I present ParseSmith, a tool for building verified parsers for real-world data formats. Most parsing tools available for data formats are insufficient for practical formats or have not been verified for correctness. I built a verified parsing tool in Dafny that builds on ideas from attribute grammars, data-dependent grammars, and parsing expression grammars to tackle constructs commonly seen in network formats. I prove that our parsers run in linear time and always terminate for well-formed grammars. Finally, I provide the earliest systematic comparison of various data description languages (DDLs) and their parser generation tools. DDLs are used to describe and parse commonly used data formats, such as image formats. I conducted a qualitative expert-elicitation study to derive the metrics used to compare the DDLs. I also systematically compare these DDLs based on the sample data descriptions shipped with them, checking for correctness and resilience
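    As a hedged illustration of the recognize-before-process principle described above (not code from the thesis), the sketch below hand-writes a recognizer for a toy length-prefixed record format and rejects any input that does not match the grammar before the rest of the program touches it. The format itself is invented for the example.

```python
# Toy illustration of the LangSec principle: recognize the input against a
# grammar before acting on it. The grammar here is an invented
# length-prefixed record format:
#   message := record* ; record := length-byte payload(length bytes)
class ParseError(Exception):
    pass

def recognize_records(data: bytes) -> list[bytes]:
    """Return the payloads iff `data` matches the grammar; otherwise raise."""
    records, i = [], 0
    while i < len(data):
        length = data[i]                      # 1-byte length prefix
        start, end = i + 1, i + 1 + length
        if end > len(data):                   # truncated record: reject, don't guess
            raise ParseError(f"record at offset {i} overruns the input")
        records.append(data[start:end])
        i = end
    return records

# Only after recognition succeeds does application logic use the data.
valid = bytes([3]) + b"abc" + bytes([2]) + b"xy"
print(recognize_records(valid))               # [b'abc', b'xy']

try:
    recognize_records(bytes([5]) + b"ab")     # malformed: claims 5 bytes, has 2
except ParseError as e:
    print("rejected:", e)
```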

    Data State of Play - Compliance Testing and Interoperability Checking

    The document provides an inventory of existing solutions for compliance testing and interoperability checking of data, taking into account the draft INSPIRE data specifications conceptual model (D2.5), the first draft of the INSPIRE Methodology for the development of data specifications (D2.6), and the first draft of the data Specifications Guidelines for the encoding of spatial data (D2.7). Although the emphasis is on spatial and geographical data, the document also investigates applicable solutions outside the Geographic Information System domain, with particular attention paid to checking compliance with "application schemas" as defined in the previously mentioned documents. JRC.H.6 - Spatial data infrastructure
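    Compliance checking against an application schema is, at its core, schema validation of encoded spatial data. The hedged sketch below shows one common approach, validating a GML document against an XSD with the lxml library; the file names are hypothetical placeholders, and this is an illustration rather than one of the tools inventoried in the document.

```python
# Illustrative compliance check: validate a GML-encoded dataset against an
# application schema (XSD) using lxml. File names are hypothetical.
from lxml import etree

schema_doc = etree.parse("application_schema.xsd")   # the application schema
schema = etree.XMLSchema(schema_doc)

data_doc = etree.parse("dataset.gml")                # the encoded spatial data

if schema.validate(data_doc):
    print("dataset is schema-compliant")
else:
    # Report every violation so the data provider can fix the encoding.
    for error in schema.error_log:
        print(f"line {error.line}: {error.message}")
```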

    New Computational Methods for Automated Large-Scale Archaeological Site Detection

    This doctoral thesis presents a series of innovative approaches, workflows and models in the field of computational archaeology for the automated large-scale detection of archaeological sites. New concepts, approaches and strategies are introduced, such as multitemporal lidar, hybrid machine learning, refinement, curriculum learning and blob analysis, as well as different data augmentation methods applied for the first time in the field of archaeology. Multiple sources are used, such as lidar, multispectral satellite imagery, RGB photographs from UAV platforms, historical maps, and several combinations of sensors, data and sources. The methods created during this PhD have been evaluated in ongoing projects: Urbanization in Iberia and Mediterranean Gaul in the First Millennium BC; detection of burial mounds using machine learning algorithms in the Northwest of the Iberian Peninsula; Drone-based Intelligent Archaeological Survey (DIASur); and Mapping Archaeological Heritage in South Asia (MAHSA), for which workflows adapted to each project's specific challenges have been designed. These new methods provide solutions to common archaeological survey problems reported in similar large-scale site detection studies, such as low detection precision and scarce training data. The validated approaches for site detection presented as part of the PhD have been published as open-access papers with freely available code, so they can be implemented in other archaeological studies
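    One of the post-processing steps named above, blob analysis of a model's prediction map, can be illustrated with a hedged sketch (not the thesis's own code): connected components of a binary detection mask are extracted and filtered by area so that tiny, implausible detections are discarded. The mask and the area threshold are invented for the example.

```python
# Illustrative blob analysis on a binary site-detection mask: label connected
# components and keep only blobs above a minimum area. Values are synthetic.
import numpy as np
from scipy import ndimage

# Synthetic 8x8 prediction mask (1 = pixel classified as "site").
mask = np.array([
    [0, 1, 1, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
], dtype=np.uint8)

labels, n_blobs = ndimage.label(mask)                 # connected components
areas = ndimage.sum(mask, labels, index=range(1, n_blobs + 1))

MIN_AREA = 3                                          # hypothetical threshold (pixels)
kept = [i + 1 for i, area in enumerate(areas) if area >= MIN_AREA]
filtered = np.isin(labels, kept).astype(np.uint8)

print(f"{n_blobs} blobs found, {len(kept)} kept after area filtering")
print(filtered)
```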

    Research and Technology Report. Goddard Space Flight Center

    This issue of Goddard Space Flight Center's annual report highlights the importance of mission operations and data systems, covering mission planning and operations; TDRSS, positioning systems, and orbit determination; ground systems and networks, hardware and software; data processing and analysis; and World Wide Web use. The report also includes flight projects, space sciences, Earth system science, and engineering and materials