Search CORE

28,282 research outputs found

Identifying Web Tables - Supporting a Neglected Type of Content on the Web

Author: A Silva
D Embley
J Hu
M Kolchin
MA Babyak
Y Tijerino
Publication venue
Publication date: 23/03/2015
Field of study

The abundance of the data in the Internet facilitates the improvement of extraction and processing tools. The trend in the open data publishing encourages the adoption of structured formats like CSV and RDF. However, there is still a plethora of unstructured data on the Web which we assume contain semantics. For this reason, we propose an approach to derive semantics from web tables which are still the most popular publishing tool on the Web. The paper also discusses methods and services of unstructured data extraction and processing as well as machine learning techniques to enhance such a workflow. The eventual result is a framework to process, publish and visualize linked open data. The software enables tables extraction from various open data sources in the HTML format and an automatic export to the RDF format making the data linked. The paper also gives the evaluation of machine learning techniques in conjunction with string similarity functions to be applied in a tables recognition task.Comment: 9 pages, 4 figure

arXiv.org e-Print Archive

Crossref

DRS: Dynamic Resource Scheduling for Real-Time Analytics over Fast Streams

Author: Ding Jianbing
Fu Tom Z. J.
Ma Richard T. B.
Winslett Marianne
Yang Yin
Zhang Zhenjie
Publication venue
Publication date: 23/04/2015
Field of study

In a data stream management system (DSMS), users register continuous queries, and receive result updates as data arrive and expire. We focus on applications with real-time constraints, in which the user must receive each result update within a given period after the update occurs. To handle fast data, the DSMS is commonly placed on top of a cloud infrastructure. Because stream properties such as arrival rates can fluctuate unpredictably, cloud resources must be dynamically provisioned and scheduled accordingly to ensure real-time response. It is quite essential, for the existing systems or future developments, to possess the ability of scheduling resources dynamically according to the current workload, in order to avoid wasting resources, or failing in delivering correct results on time. Motivated by this, we propose DRS, a novel dynamic resource scheduler for cloud-based DSMSs. DRS overcomes three fundamental challenges: (a) how to model the relationship between the provisioned resources and query response time (b) where to best place resources; and (c) how to measure system load with minimal overhead. In particular, DRS includes an accurate performance model based on the theory of \emph{Jackson open queueing networks} and is capable of handling \emph{arbitrary} operator topologies, possibly with loops, splits and joins. Extensive experiments with real data confirm that DRS achieves real-time response with close to optimal resource consumption.Comment: This is the our latest version with certain modificatio

arXiv.org e-Print Archive

Crossref

Reducing the Barrier to Entry of Complex Robotic Software: a MoveIt! Case Study

Author: Chitta Sachin
Coleman David
Correll Nikolaus
Sucan Ioan
Publication venue
Publication date: 14/04/2014
Field of study

Developing robot agnostic software frameworks involves synthesizing the disparate fields of robotic theory and software engineering while simultaneously accounting for a large variability in hardware designs and control paradigms. As the capabilities of robotic software frameworks increase, the setup difficulty and learning curve for new users also increase. If the entry barriers for configuring and using the software on robots is too high, even the most powerful of frameworks are useless. A growing need exists in robotic software engineering to aid users in getting started with, and customizing, the software framework as necessary for particular robotic applications. In this paper a case study is presented for the best practices found for lowering the barrier of entry in the MoveIt! framework, an open-source tool for mobile manipulation in ROS, that allows users to 1) quickly get basic motion planning functionality with minimal initial setup, 2) automate its configuration and optimization, and 3) easily customize its components. A graphical interface that assists the user in configuring MoveIt! is the cornerstone of our approach, coupled with the use of an existing standardized robot model for input, automatically generated robot-specific configuration files, and a plugin-based architecture for extensibility. These best practices are summarized into a set of barrier to entry design principles applicable to other robotic software. The approaches for lowering the entry barrier are evaluated by usage statistics, a user survey, and compared against our design objectives for their effectiveness to users

arXiv.org e-Print Archive

Going Deeper with Semantics: Video Activity Interpretation using Semantic Contextualization

Author: Aakur Sathyanarayanan N.
de Souza Fillipe DM
Sarkar Sudeep
Publication venue
Publication date: 15/11/2018
Field of study

A deeper understanding of video activities extends beyond recognition of underlying concepts such as actions and objects: constructing deep semantic representations requires reasoning about the semantic relationships among these concepts, often beyond what is directly observed in the data. To this end, we propose an energy minimization framework that leverages large-scale commonsense knowledge bases, such as ConceptNet, to provide contextual cues to establish semantic relationships among entities directly hypothesized from video signal. We mathematically express this using the language of Grenander's canonical pattern generator theory. We show that the use of prior encoded commonsense knowledge alleviate the need for large annotated training datasets and help tackle imbalance in training through prior knowledge. Using three different publicly available datasets - Charades, Microsoft Visual Description Corpus and Breakfast Actions datasets, we show that the proposed model can generate video interpretations whose quality is better than those reported by state-of-the-art approaches, which have substantial training needs. Through extensive experiments, we show that the use of commonsense knowledge from ConceptNet allows the proposed approach to handle various challenges such as training data imbalance, weak features, and complex semantic relationships and visual scenes.Comment: Accepted to WACV 201

arXiv.org e-Print Archive

Crossref

Annotated bibliography of Software Engineering Laboratory literature

Author: Morusiewicz Linda
Valett Jon D.
Publication venue
Publication date
Field of study

An annotated bibliography of technical papers, documents, and memorandums produced by or related to the Software Engineering Laboratory is given. More than 100 publications are summarized. These publications cover many areas of software engineering and range from research reports to software documentation. All materials have been grouped into eight general subject areas for easy reference: The Software Engineering Laboratory; The Software Engineering Laboratory: Software Development Documents; Software Tools; Software Models; Software Measurement; Technology Evaluations; Ada Technology; and Data Collection. Subject and author indexes further classify these documents by specific topic and individual author

NASA Technical Reports Server

An End-to-End Trainable Neural Network Model with Belief Tracking for Task-Oriented Dialog

Author: Lane Ian
Liu Bing
Publication venue: 'International Speech Communication Association'
Publication date: 20/08/2017
Field of study

We present a novel end-to-end trainable neural network model for task-oriented dialog systems. The model is able to track dialog state, issue API calls to knowledge base (KB), and incorporate structured KB query results into system responses to successfully complete task-oriented dialogs. The proposed model produces well-structured system responses by jointly learning belief tracking and KB result processing conditioning on the dialog history. We evaluate the model in a restaurant search domain using a dataset that is converted from the second Dialog State Tracking Challenge (DSTC2) corpus. Experiment results show that the proposed model can robustly track dialog state given the dialog history. Moreover, our model demonstrates promising results in producing appropriate system responses, outperforming prior end-to-end trainable neural network models using per-response accuracy evaluation metrics.Comment: Published at Interspeech 201

arXiv.org e-Print Archive

Crossref