2,200 research outputs found

    Lightweight Multilingual Software Analysis

    Developer preferences, language capabilities, and the persistence of older languages contribute to the trend that large software codebases are often multilingual, that is, written in more than one computer language. While developers can leverage monolingual software development tools to build software components, companies are faced with the problem of managing the resultant large, multilingual codebases to address issues with security, efficiency, and quality metrics. The key challenge is the opaque nature of the language interoperability interface: one language calling procedures in a second (which may call a third, or even back to the first), resulting in a potentially tangled, inefficient, and insecure codebase. An architecture is proposed for lightweight static analysis of large multilingual codebases: the MLSA architecture. Its modular and table-oriented structure addresses the open-ended nature of multiple languages and language interoperability APIs. As an application, we focus here on the construction of call graphs that capture both inter-language and intra-language calls. The algorithms for extracting multilingual call graphs from codebases are presented, and several examples of multilingual software engineering analysis are discussed. The state of the implementation and testing of MLSA is presented, and the implications for future work are discussed.
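    The abstract does not include MLSA's source, but its table-oriented idea can be sketched: each per-language front end emits call facts as plain rows, and a language-agnostic merge step builds one graph and flags edges that cross the interoperability interface. The row schema, names, and sample calls below are illustrative assumptions, not the MLSA implementation.

```python
from collections import defaultdict

# Illustrative row format (an assumption about the table-oriented design):
# each language front end emits (caller_lang, caller, callee_lang, callee).
call_table = [
    ("python", "app.main", "python", "app.load_config"),
    ("python", "app.main", "c", "libfast.compress"),       # call out via FFI
    ("c", "libfast.compress", "c", "libfast.crc32"),
    ("c", "libfast.log_error", "python", "app.on_error"),  # callback into Python
]

def build_call_graph(rows):
    """Merge per-language call tables into one multilingual call graph."""
    graph = defaultdict(set)
    for caller_lang, caller, callee_lang, callee in rows:
        graph[(caller_lang, caller)].add((callee_lang, callee))
    return graph

def inter_language_edges(graph):
    """List edges that cross a language boundary (the opaque interface)."""
    return [(src, dst)
            for src, targets in graph.items()
            for dst in targets
            if src[0] != dst[0]]

graph = build_call_graph(call_table)
for src, dst in inter_language_edges(graph):
    print(f"{src[0]}:{src[1]} -> {dst[0]}:{dst[1]}")
```

    Because the merge step only sees rows, adding support for a new language or interoperability API means adding a front end that emits the same row shape, which is the open-endedness the abstract claims.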

    Enhanced biomedical data extraction from scientific publications

    The field of scientific research is constantly expanding, with thousands of new articles being published every day. As online databases grow, so does the need for technologies capable of navigating and extracting key information from the stored publications. In the biomedical field, these articles lay the foundation for advancing our understanding of human health and improving medical practices. With such a vast amount of data available, it can be difficult for researchers to quickly and efficiently extract the information they need. The challenge is compounded by the fact that many existing tools are expensive, hard to learn, and not compatible with all article types. To address this, a prototype was developed that leverages the PubMed API to give researchers access to the information in numerous open-access articles. Its features include tracking keywords and highly frequent words, as well as extracting table content. The prototype is designed to streamline the process of extracting data from research articles, allowing researchers to more efficiently analyze and synthesize information from multiple sources.
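    The prototype itself is not excerpted, but the PubMed API it builds on is the public NCBI E-utilities; a minimal sketch of the keyword-and-frequency idea might look like the following, where the query term and the word-frequency heuristic are illustrative assumptions rather than the thesis' actual logic.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter
from urllib.parse import urlencode
from urllib.request import urlopen

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def search_pmids(term, retmax=5):
    """Find PubMed IDs matching a query via the ESearch endpoint."""
    params = urlencode({"db": "pubmed", "term": term, "retmax": retmax})
    with urlopen(f"{EUTILS}/esearch.fcgi?{params}") as resp:
        tree = ET.parse(resp)
    return [e.text for e in tree.findall(".//Id")]

def fetch_abstracts(pmids):
    """Fetch abstract text for the given PMIDs via the EFetch endpoint."""
    params = urlencode({"db": "pubmed", "id": ",".join(pmids),
                        "rettype": "abstract", "retmode": "xml"})
    with urlopen(f"{EUTILS}/efetch.fcgi?{params}") as resp:
        tree = ET.parse(resp)
    return [" ".join(a.itertext()) for a in tree.findall(".//AbstractText")]

# Track the most frequent words across matching abstracts (toy query).
pmids = search_pmids("crispr gene editing")
words = re.findall(r"[a-z]{4,}", " ".join(fetch_abstracts(pmids)).lower())
print(Counter(words).most_common(10))
```

    Table extraction is harder than this sketch suggests, since tables usually live in the full-text PDF or XML rather than in the abstract record, which is presumably why the prototype restricts itself to open-access articles.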

    Finding Missing American Airmen: Using GIS for Mapping World War II Aircraft Crash Locations in Papua and New Guinea

    Searching for historical locations usually requires finding related information scattered across many documents. The Defense POW/MIA Accounting Agency (DPAA) works to locate World War II (WWII) aircraft wreckage in order to find and bring home the remains of United States soldiers missing in action (MIA). The DPAA holds historical documents, including tables and maps, that provide information about WWII aircraft crash locations in Papua and New Guinea. Use of these documents is limited because they are stored as portable document format (PDF) files, which do not allow users to interactively compare the data with other related datasets; working with such documents also requires more resources and time. Implementing a geographic information system (GIS) brought the data into a reliable format that solved these problems. A geodatabase was built within ArcGIS Desktop that allows users to query the data and view them in an interactive map. The new system allows the DPAA to add different datasets and compare them all together in one map. This project also provides analytical maps that may help the DPAA plan search and recovery missions, as well as map series to be used in the field.
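    The project's geoprocessing scripts are not reproduced here, but the core step, loading crash-site rows extracted from the PDF tables into a file geodatabase point feature class, can be sketched with arcpy. The paths, field names, and sample coordinates below are assumptions for illustration, not the DPAA data.

```python
import arcpy

# Assumed workspace; real records would carry many more attributes
# (aircraft type, crash date, crew manifest, source document).
gdb = arcpy.management.CreateFileGDB(r"C:\data", "dpaa_crashes").getOutput(0)
fc = arcpy.management.CreateFeatureclass(
    gdb, "wwii_crash_sites", "POINT",
    spatial_reference=arcpy.SpatialReference(4326),  # WGS 1984
).getOutput(0)
arcpy.management.AddField(fc, "AIRCRAFT", "TEXT", field_length=50)

# (longitude, latitude, aircraft) -- illustrative values only.
rows = [
    (147.18, -9.44, "B-24 Liberator"),
    (145.79, -5.22, "P-38 Lightning"),
]
with arcpy.da.InsertCursor(fc, ["SHAPE@XY", "AIRCRAFT"]) as cursor:
    for lon, lat, aircraft in rows:
        cursor.insertRow(((lon, lat), aircraft))
```

    Once the sites exist as a feature class rather than PDF rows, the interactive comparison the thesis describes, overlaying other datasets and querying by attribute, comes for free from the GIS.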

    BioBridge: Bringing Data Exploration to Biologists

    Since the completion of the Human Genome Project in 2003, biologists have become exceptionally good at producing data. Indeed, biological data has experienced sustained exponential growth, putting effective and thorough analysis beyond the reach of many biologists. This thesis presents BioBridge, an interactive visualization tool developed to bring intuitive data exploration to biologists. BioBridge is designed to work on omics-style tabular data in general and thus has broad applicability. This work describes the design and evaluation of BioBridge's Entity View primary visualization as well as the accompanying user interface. The Entity View visualization arranges glyphs representing biological entities (e.g. genes, proteins, metabolites) along with related text mining results to provide biological context. Throughout development, the goal has been to maximize accessibility and usability for biologists who are not computationally inclined. Evaluations were done through three informal case studies: one of a metabolome dataset and two of microarray datasets. BioBridge is a proof of concept that there is an underexploited niche in the data analysis ecosystem for tools that prioritize accessibility and usability. The case studies, while anecdotal, are very encouraging; they indicate that BioBridge is well suited to the task of data exploration. With further development, BioBridge could become more flexible and usable as additional datasets are explored and more feedback is gathered.
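    BioBridge itself is not excerpted; to make the Entity View idea concrete, a toy matplotlib sketch could arrange one glyph per biological entity, with color encoding entity type and glyph size encoding text-mining hit counts. The entity names and counts below are invented for illustration.

```python
import matplotlib.pyplot as plt

# Toy entities: (name, type, text-mining hit count) -- invented values.
entities = [
    ("TP53", "gene", 120), ("INS", "gene", 45),
    ("insulin", "protein", 80), ("p53", "protein", 95),
    ("glucose", "metabolite", 60),
]
colors = {"gene": "tab:blue", "protein": "tab:green", "metabolite": "tab:orange"}

fig, ax = plt.subplots()
for i, (name, kind, hits) in enumerate(entities):
    # One glyph per entity; size reflects how often text mining found it.
    ax.scatter(i, 0, s=hits * 10, c=colors[kind], alpha=0.6, label=kind)
    ax.annotate(name, (i, 0), ha="center", va="center")
ax.set_yticks([])
ax.set_title("Entity View sketch: glyph size = text-mining hits")
# Deduplicate legend entries produced by the per-entity scatter calls.
handles, labels = ax.get_legend_handles_labels()
by_label = dict(zip(labels, handles))
ax.legend(by_label.values(), by_label.keys())
plt.show()
```

    The point of such a layout, as the thesis argues, is that a biologist can read biological context directly off the glyphs without writing any analysis code.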

    Automated Development of Semantic Data Models Using Scientific Publications

    The traditional methods for analyzing information in digital documents have evolved with the ever-increasing volume of data. Challenges in analyzing scientific publications include the lack of a unified vocabulary and a defined context, differing standards and formats for presenting information, varied types of data, and diverse areas of knowledge. These challenges hinder detecting, understanding, comparing, sharing, and querying information rapidly. I design a dynamic conceptual data model built from elements common to publications in any domain, such as context, metadata, and tables. To enrich the models, I use related definitions contained in ontologies and on the Internet. This dissertation therefore generates semantically enriched data models from digital publications based on the Semantic Web principles, which allow people and computers to work cooperatively. Finally, this work uses a vocabulary and ontologies to generate a structured characterization and to organize the data models. This organization enables integrating, sharing, managing, and comparing and contrasting information from publications.
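    The dissertation's pipeline is not reproduced here; following the Semantic Web principles it cites, a minimal rdflib sketch can show how a publication's metadata and a table cell become machine-queryable triples. The namespace and property names are illustrative choices, not the dissertation's actual vocabulary.

```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DC, RDF

# Illustrative namespace for the generated model; a real pipeline would
# reuse domain ontologies discovered for the publication's field.
EX = Namespace("http://example.org/pubmodel/")

g = Graph()
g.bind("dc", DC)
g.bind("ex", EX)

paper = URIRef(EX["paper/123"])
g.add((paper, RDF.type, EX.Publication))
g.add((paper, DC.title, Literal("An example article")))
g.add((paper, DC.subject, Literal("semantic data models")))

# A table cell lifted into the model, keeping its context queryable.
cell = URIRef(EX["paper/123/table1/r2c3"])
g.add((cell, RDF.type, EX.TableCell))
g.add((cell, EX.partOf, paper))
g.add((cell, EX.value, Literal(42.0)))

print(g.serialize(format="turtle"))
```

    Once publications are expressed as triples like these, the integration, sharing, and comparison the abstract describes reduce to standard graph queries over a common vocabulary.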