5,072 research outputs found

    Identifying Web Tables - Supporting a Neglected Type of Content on the Web

    The abundance of data on the Internet facilitates the improvement of extraction and processing tools. The trend in open data publishing encourages the adoption of structured formats like CSV and RDF. However, there is still a plethora of unstructured data on the Web which we assume contains semantics. For this reason, we propose an approach to derive semantics from web tables, which are still the most popular publishing tool on the Web. The paper also discusses methods and services for unstructured data extraction and processing, as well as machine learning techniques to enhance such a workflow. The eventual result is a framework to process, publish and visualize linked open data. The software enables table extraction from various open data sources in HTML format and automatic export to RDF, making the data linked. The paper also evaluates machine learning techniques in conjunction with string similarity functions applied to the table recognition task. Comment: 9 pages, 4 figures
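
    As a purely illustrative sketch of the table-to-RDF step described above (not the authors' implementation), the following Python fragment extracts the first HTML table with BeautifulSoup and emits one RDF subject per row with rdflib; the example.org vocabulary and row URIs are hypothetical, and the paper's pipeline additionally applies machine learning and string similarity functions to decide which tables to extract in the first place.

        # Hedged sketch: convert an HTML table to RDF triples (hypothetical vocabulary).
        from bs4 import BeautifulSoup
        from rdflib import Graph, Literal, Namespace, URIRef

        EX = Namespace("http://example.org/vocab/")    # hypothetical vocabulary

        def table_to_rdf(html):
            """Turn the first HTML table into RDF, one subject per data row."""
            table = BeautifulSoup(html, "html.parser").find("table")
            headers = [th.get_text(strip=True) for th in table.find_all("th")]
            g = Graph()
            for i, row in enumerate(table.find_all("tr")):
                cells = [td.get_text(strip=True) for td in row.find_all("td")]
                if not cells:
                    continue                           # header row has no <td> cells
                subject = URIRef("http://example.org/row/%d" % i)
                for header, value in zip(headers, cells):
                    g.add((subject, EX[header.replace(" ", "_")], Literal(value)))
            return g

        # print(table_to_rdf(open("page.html").read()).serialize(format="turtle"))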

    From SpaceStat to CyberGIS: Twenty Years of Spatial Data Analysis Software

    This essay assesses the evolution of the way in which spatial data analytical methods have been incorporated into software tools over the past two decades. It is part retrospective and part prospective, going beyond a historical review to outline some ideas about important factors that drove the software development, such as methodological advances, the open source movement, and the advent of the Internet and cyberinfrastructure. The review highlights activities carried out by the author and his collaborators and uses SpaceStat, GeoDa, PySAL and recent spatial analytical web services developed at the ASU GeoDa Center as illustrative examples. It outlines a vision for a spatial econometrics workbench as an example of the incorporation of spatial analytical functionality in a cyberGIS.

    A Model-Driven Engineering Approach for ROS using Ontological Semantics

    This paper presents a novel ontology-driven software engineering approach for the development of industrial robotics control software. It introduces the ReApp architecture that synthesizes model-driven engineering with semantic technologies to facilitate the development and reuse of ROS-based components and applications. In ReApp, we show how different ontological classification systems for hardware, software, and capabilities help developers in discovering suitable software components for their tasks and in applying them correctly. The proposed model-driven tooling enables developers to work at higher abstraction levels and fosters automatic code generation. It is underpinned by ontologies to minimize discontinuities in the development workflow, with an integrated development environment presenting a seamless interface to the user. First results show the viability and synergy of the selected approach when searching for or developing software with reuse in mind. Comment: Presented at DSLRob 2015 (arXiv:1601.00877), Stefan Zander, Georg Heppner, Georg Neugschwandtner, Ramez Awad, Marc Essinger and Nadia Ahmed: A Model-Driven Engineering Approach for ROS using Ontological Semantics
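
    For illustration only, the snippet below shows the kind of hand-written rospy boilerplate that model-driven tooling such as ReApp aims to generate automatically from ontology-backed component models; the node and topic names are hypothetical and not taken from the paper.

        # Minimal ROS publisher node in Python (rospy); names are hypothetical.
        import rospy
        from std_msgs.msg import String

        def main():
            rospy.init_node("status_publisher")               # hypothetical node name
            pub = rospy.Publisher("robot_status", String, queue_size=10)
            rate = rospy.Rate(1)                              # publish at 1 Hz
            while not rospy.is_shutdown():
                pub.publish(String(data="ok"))
                rate.sleep()

        if __name__ == "__main__":
            main()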

    Genomics Virtual Laboratory: a practical bioinformatics workbench for the cloud

    Analyzing high-throughput genomics data is a complex and compute-intensive task, generally requiring numerous software tools and large reference data sets, tied together in successive stages of data transformation and visualisation. A computational platform enabling best-practice genomics analysis ideally meets a number of requirements, including: a wide range of analysis and visualisation tools, closely linked to large user and reference data sets; workflow platform(s) enabling accessible, reproducible, portable analyses through a flexible set of interfaces; highly available, scalable computational resources; and flexibility and versatility in the use of these resources to meet the demands and expertise of a variety of users. Access to an appropriate computational platform can be a significant barrier to researchers, as establishing such a platform requires a large upfront investment in hardware, experience, and expertise.

    An Autonomic Cross-Platform Operating Environment for On-Demand Internet Computing

    The Internet has evolved into a global and ubiquitous communication medium interconnecting powerful application servers, diverse desktop computers and mobile notebooks. Along with recent developments in computer technology, such as the convergence of computing and communication devices, the way people use computers and the Internet has changed people's working habits and has led to new application scenarios. On the one hand, pervasive computing, ubiquitous computing and nomadic computing become more and more important, since different computing devices like PDAs and notebooks may be used concurrently and alternately, e.g. while the user is on the move. On the other hand, the ubiquitous availability and pervasive interconnection of computing systems have fostered various trends towards the dynamic utilization and spontaneous collaboration of available remote computing resources, which are addressed by approaches like utility computing, grid computing, cloud computing and public computing. From a general point of view, the common objective of this development is the use of Internet applications on demand, i.e. applications that are not installed in advance by a platform administrator but are dynamically deployed and run as they are requested by the application user. The heterogeneous and unmanaged nature of the Internet represents a major challenge for the on-demand use of custom Internet applications across heterogeneous hardware platforms, operating systems and network environments. Promising remedies are autonomic computing systems that are supposed to maintain themselves without particular user or application intervention. In this thesis, an Autonomic Cross-Platform Operating Environment (ACOE) is presented that supports On Demand Internet Computing (ODIC), such as dynamic application composition and ad hoc execution migration. The approach is based on an integration middleware called crossware that does not replace existing middleware but operates as a self-managing mediator between diverse application requirements and heterogeneous platform configurations. A Java implementation of the Crossware Development Kit (XDK) is presented, followed by the description of the On Demand Internet Computing System (ODIX). The feasibility of the approach is shown by the implementation of an Internet Application Workbench, an Internet Application Factory and an Internet Peer Federation. They illustrate the use of ODIX to support local, remote and distributed ODIC, respectively. Finally, the suitability of the approach is discussed with respect to the support of ODIC.
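
    As a loose illustration of the mediator role that crossware plays between application requirements and heterogeneous platform configurations, here is a hypothetical Python sketch (the thesis itself implements this in Java as the XDK, and all names and capability labels below are invented).

        # Hypothetical sketch: pick a platform whose capabilities cover an
        # application's requirements, i.e. the mediation crossware performs.
        from dataclasses import dataclass, field
        from typing import List, Optional, Set

        @dataclass
        class Platform:
            name: str
            capabilities: Set[str] = field(default_factory=set)

        def select_platform(requirements: Set[str],
                            platforms: List[Platform]) -> Optional[Platform]:
            """Return the first platform that satisfies every requirement."""
            for platform in platforms:
                if requirements <= platform.capabilities:
                    return platform
            return None

        platforms = [Platform("desktop", {"jvm", "gui"}), Platform("pda", {"jvm"})]
        target = select_platform({"jvm", "gui"}, platforms)
        print(target.name if target else "no suitable platform")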

    Collaborative Development and Evaluation of Text-processing Workflows in a UIMA-supported Web-based Workbench

    Challenges in creating comprehensive text-processing workflows include a lack of interoperability between individual components coming from different providers and/or a requirement imposed on end users to know programming techniques to compose such workflows. In this paper we demonstrate Argo, a web-based system that addresses these issues in several ways. It supports the widely adopted Unstructured Information Management Architecture (UIMA), which handles the problem of interoperability; it provides a web browser-based interface for developing workflows by drawing diagrams composed of a selection of available processing components; and it provides novel user-interactive analytics such as the annotation editor, which constitutes a bridge between automatic processing and manual correction. These features extend the target audience of Argo to users with limited or no technical background. Here, we focus specifically on the construction of advanced workflows, involving multiple branching and merging points, to facilitate various comparative evaluations. Together with the user-collaboration capabilities supported in Argo, we demonstrate several use cases including visual inspections, comparisons of multiple processing segments or complete solutions against a reference standard, inter-annotator agreement, and shared-task mass evaluations. Ultimately, Argo emerges as a one-stop workbench for defining, processing, editing and evaluating text-processing tasks.
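
    As a small, self-contained illustration of one evaluation mentioned above, the sketch below computes inter-annotator agreement (Cohen's kappa) between two annotators' labels with scikit-learn; the label values are invented, and Argo performs such comparisons within its workflow engine rather than through a standalone script.

        # Inter-annotator agreement on hypothetical example labels.
        from sklearn.metrics import cohen_kappa_score

        annotator_a = ["Gene", "Gene", "Protein", "O", "Protein"]
        annotator_b = ["Gene", "Protein", "Protein", "O", "Protein"]

        kappa = cohen_kappa_score(annotator_a, annotator_b)
        print("Cohen's kappa: %.2f" % kappa)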

    A Web-Based Distributed Virtual Educational Laboratory

    Evolution and cost of measurement equipment, continuous training, and distance learning make it difficult to provide a complete set of updated workbenches to every student. For a preliminary familiarization and experimentation with instrumentation and measurement procedures, the use of virtual equipment is often considered more than sufficient from the didactic point of view, while the hands-on approach with real instrumentation and measurement systems still remains necessary to complete and refine the student's practical expertise. Creation and distribution of workbenches in networked computer laboratories therefore become attractive and convenient. This paper describes the specification and design of a geographically distributed system based on commercial standard components.

    A workbench to support development and maintenance of the World-Wide Web

    The World-Wide Web is one of the most dominant features of the Internet. In its short life it has become an important part of information technology, having a role to play in all sectors. Unfortunately, it has many problems too. Due to its fast evolution, World-Wide Web document development is undisciplined and has resulted in the appearance of much poor-quality work. This is also largely due to the inexperience of authors and the lack of conventions, standards, guidelines and useful tools for the development and maintenance of Web documents. One solution to the major problems of poor-quality World-Wide Web documents is the improved maintenance of such documents. Maintenance is an important area that, as in software engineering, receives little attention compared with development. In order to address the problems of World-Wide Web document maintenance, research into the area was carried out through a literature survey and case studies of organisations that manage World-Wide Web sites. The results of this research led to the production of a workbench that supports both developers and maintainers of Web documents. This workbench consists of methods, guidelines and tools for World-Wide Web development and maintenance.

    Local Type Checking for Linked Data Consumers

    The Web of Linked Data is the culmination of over a decade of work by the Web standards community in their effort to make data more Web-like. We provide an introduction to the Web of Linked Data from the perspective of a Web developer who would like to build an application using Linked Data. We identify a weakness in the development stack as being a lack of domain-specific scripting languages for designing background processes that consume Linked Data. To address this weakness, we design a scripting language with a simple but appropriate type system. In our proposed architecture, some data is consumed from sources outside the control of the system and some data is held locally. Stronger type assumptions can be made about the local data than about external data, hence our type system mixes static and dynamic typing. Throughout, we relate our work to the W3C recommendations that drive Linked Data, so our syntax is accessible to Web developers. Comment: In Proceedings WWV 2013, arXiv:1308.026
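
    The static/dynamic split described above can be illustrated with a loose Python analogy (this is not the paper's scripting language): locally held records are trusted to match their declared type, while data fetched from an external Linked Data source is checked at run time before use; the endpoint URL and field names are hypothetical.

        # Illustration: trust local data statically, check external data dynamically.
        import json
        from dataclasses import dataclass
        from urllib.request import urlopen

        @dataclass
        class Person:                 # the type we statically assume for local records
            name: str
            homepage: str

        LOCAL_PEOPLE = [Person("Alice", "http://example.org/alice")]   # trusted local data

        def fetch_external_person(url):
            """Fetch a JSON record from an external source and check it at run time."""
            record = json.load(urlopen(url))          # hypothetical external endpoint
            if not (isinstance(record.get("name"), str)
                    and isinstance(record.get("homepage"), str)):
                raise TypeError("external record does not match the Person type")
            return Person(record["name"], record["homepage"])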
    • …