Dynamic Discovery of Type Classes and Relations in Semantic Web Data
The continuing development of Semantic Web technologies and increasing user
adoption in recent years have accelerated progress in incorporating explicit
semantics into data on the Web. With the rapid growth of RDF (Resource
Description Framework) data on the Semantic Web, processing large semantic
graph data has become more challenging. Constructing a summary graph structure
from the raw RDF can help obtain semantic type relations and reduce the
computational complexity for graph processing purposes. In this paper, we
addressed the problem of graph summarization in RDF graphs, and we proposed an
approach for building summary graph structures automatically from RDF graph
data. Moreover, we introduced a measure to help discover optimum class
dissimilarity thresholds and an effective method to discover the type classes
automatically. In future work, we plan to investigate further improvements to
the scalability of the proposed method.
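
As a rough illustration of what a summary graph over RDF data can look like, the sketch below collapses instance-level triples into type-level edges. It is a simplified stand-in, not the method of the paper above: it assumes every resource already carries an explicit `rdf:type` statement, whereas the abstract describes discovering type classes automatically. The names `summarize`, `RDF_TYPE`, and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)
RDF_TYPE = "rdf:type"          # assumed shorthand for the rdf:type predicate


def summarize(triples: Iterable[Triple]) -> Set[Triple]:
    """Collapse instance-level triples into (type, predicate, type) edges."""
    triples = list(triples)

    # First pass: record the declared types of every resource.
    types: Dict[str, Set[str]] = defaultdict(set)
    for s, p, o in triples:
        if p == RDF_TYPE:
            types[s].add(o)

    # Second pass: lift each non-type triple to the type level.
    summary: Set[Triple] = set()
    for s, p, o in triples:
        if p == RDF_TYPE:
            continue
        for subject_type in types.get(s, ()):
            for object_type in types.get(o, ()):
                summary.add((subject_type, p, object_type))
    return summary


data = [
    ("alice", RDF_TYPE, "Person"),
    ("bob", RDF_TYPE, "Person"),
    ("acme", RDF_TYPE, "Company"),
    ("alice", "worksFor", "acme"),
    ("bob", "worksFor", "acme"),
    ("alice", "knows", "bob"),
]
summary_graph = summarize(data)
```

Here three instance-level edges reduce to two type-level edges, which is the kind of size reduction a summary graph is meant to provide.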
Deep Interactive Region Segmentation and Captioning
With recent innovations in dense image captioning, it is now possible to
describe every object of the scene with a caption while objects are determined
by bounding boxes. However, interpretation of such an output is not trivial due
to the existence of many overlapping bounding boxes. Furthermore, in current
captioning frameworks, the user cannot express personal preferences to exclude
areas that are not of interest. In this paper, we propose a novel hybrid deep
learning architecture for interactive region segmentation and captioning where
the user is able to specify an arbitrary region of the image that should be
processed. To this end, a dedicated Fully Convolutional Network (FCN) named
Lyncean FCN (LFCN) is trained using our special training data to isolate the
User Intention Region (UIR) as the output of an efficient segmentation. In
parallel, a dense image captioning model is utilized to provide a wide variety
of captions for that region. Then, the UIR will be explained with the caption
of the best match bounding box. To the best of our knowledge, this is the first
work that provides such a comprehensive output. Our experiments show the
superiority of the proposed approach over state-of-the-art interactive
segmentation methods on several well-known datasets. In addition, replacement
of the bounding boxes with the result of the interactive segmentation leads to
a better understanding of the dense image captioning output as well as accuracy
enhancement for the object detection in terms of Intersection over Union (IoU).
Comment: 17 pages, 9 figures
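
For readers unfamiliar with the metric, the sketch below computes Intersection over Union for two axis-aligned boxes and picks the captioned box that best overlaps a region of interest. It is only an illustration under stated assumptions: the region is approximated by its bounding box, boxes are given as `(x_min, y_min, x_max, y_max)`, and the names `iou` and `best_match` are hypothetical rather than taken from the paper above.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def best_match(region: Box, candidates: Sequence[Tuple[Box, str]]) -> Tuple[Box, str]:
    """Return the (box, caption) pair whose box has the highest IoU with region."""
    return max(candidates, key=lambda pair: iou(region, pair[0]))


region = (0.0, 0.0, 2.0, 2.0)
captions = [
    ((1.0, 1.0, 3.0, 3.0), "a dog on the grass"),
    ((0.0, 0.0, 2.0, 1.0), "a red ball"),
    ((5.0, 5.0, 6.0, 6.0), "a tree"),
]
chosen_box, chosen_caption = best_match(region, captions)
```

In this toy input the second box covers half of the region and nothing else, so it scores highest and its caption is selected.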
BlogForever D2.6: Data Extraction Methodology
This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments for implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report proceeds to propose a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform.
Automatic synthesis of analog layout : a survey
A review of recent research in the automatic synthesis of physical geometry for analog integrated circuits is presented. The introduction explains the difficulties involved in analog layout as opposed to digital layout. A review of the literature then follows. Emphasis is placed on the exposition of general methods for addressing problems specific to analog layout, with the details of specific systems given only when they serve to illustrate these methods well. The conclusion discusses the remaining problems and offers a prediction as to how technology will evolve to solve them. It is argued that although progress has been and will continue to be made in the automation of analog IC layout, due to fundamental differences in the nature of analog IC design as opposed to digital design, it should not be expected that the level of automation of the former will reach that of the latter any time soon.
Paraphrase Generation with Deep Reinforcement Learning
Automatic generation of paraphrases from a given sentence is an important yet
challenging task in natural language processing (NLP), and plays a key role in
a number of applications such as question answering, search, and dialogue. In
this paper, we present a deep reinforcement learning approach to paraphrase
generation. Specifically, we propose a new framework for the task, which
consists of a generator and an evaluator, both of which are
learned from data. The generator, built as a sequence-to-sequence learning
model, can produce paraphrases given a sentence. The evaluator, constructed as
a deep matching model, can judge whether two sentences are paraphrases of each
other. The generator is first trained by deep learning and then further
fine-tuned by reinforcement learning in which the reward is given by the
evaluator. For the learning of the evaluator, we propose two methods based on
supervised learning and inverse reinforcement learning respectively, depending
on the type of available training data. Empirical study shows that the learned
evaluator can guide the generator to produce more accurate paraphrases.
Experimental results demonstrate the proposed models (the generators)
outperform the state-of-the-art methods in paraphrase generation in both
automatic evaluation and human evaluation.
Comment: EMNLP 201